In the triqler input file, the proteins of a peptide are listed as tab-separated columns at the end of each row. Empty trailing columns are therefore parsed as extra proteins, so the peptide is considered shared and is discarded. With no proteotypic peptides left, hyperparameter fitting fails with the following error:
Parsing triqler input file
Calculating identification PEPs
featureClusterIdx: 0
featureClusterIdx: 10000
Dividing intensities by 100000 for increased readability
Surviving spectrumIdxs: 12452
Converting to peptide quant rows
Calculating peptide-level identification PEPs
Writing peptide quant rows to file
Fitting hyperparameters
Traceback (most recent call last):
File "/anaconda2/lib/python2.7/runpy.py", line 174, in _run_module_as_main
"__main__", fname, loader, pkg_name)
File "/anaconda2/lib/python2.7/runpy.py", line 72, in _run_code
exec code in run_globals
File "/anaconda2/lib/python2.7/site-packages/triqler/__main__.py", line 8, in <module>
main()
File "/anaconda2/lib/python2.7/site-packages/triqler/triqler.py", line 36, in main
runTriqler(params, args.in_file, args.out_file)
File "/anaconda2/lib/python2.7/site-packages/triqler/triqler.py", line 104, in runTriqler
diff_exp.doDiffExp(params, peptQuantRows, triqlerOutputFile, getPickedProteinCalibration, selectComparisonBayesTmp, qvalMethod = qvalMethod)
File "/anaconda2/lib/python2.7/site-packages/triqler/diff_exp.py", line 17, in doDiffExp
proteinOutputRows = proteinQuantificationMethod(peptQuantRows, params, proteinModifier, getEvalFeatures)
File "/anaconda2/lib/python2.7/site-packages/triqler/triqler.py", line 339, in getPickedProteinCalibration
hyperparameters.fitPriors(peptQuantRows, params) # updates priors
File "/anaconda2/lib/python2.7/site-packages/triqler/hyperparameters.py", line 57, in fitPriors
fitLogitNormal(observedXICValues, params, plot)
File "/anaconda2/lib/python2.7/site-packages/triqler/hyperparameters.py", line 84, in fitLogitNormal
vals, bins = np.histogram(observedValues, bins = np.arange(minBin, maxBin, 0.1), normed = True)
ValueError: arange: cannot compute length
This is a problem when the file is saved as .tsv by, for example, Excel, which pads each row with empty columns to match the longest row.
To solve this, we should add only non-empty proteins to the peptide during parsing. We should also display a clearer error message when no proteotypic peptides are found, together with a hint that this could be due to shared peptides.
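A minimal sketch of the proposed fix, not the actual triqler parser: the function names, the exception class, and the `first_protein_col` parameter are hypothetical and only illustrate the idea. It assumes each row is a tab-separated line whose trailing columns hold the protein names.

```python
class NoProteotypicPeptidesError(ValueError):
    """Hypothetical error raised when every peptide was discarded as shared."""


def parse_proteins(fields, first_protein_col):
    # Keep only non-empty protein columns, so that padding added by
    # spreadsheet software is not counted as extra proteins.
    return [p.strip() for p in fields[first_protein_col:] if p.strip()]


def keep_proteotypic(lines, first_protein_col):
    # first_protein_col is an assumption: the index of the first protein column.
    kept = []
    for line in lines:
        fields = line.rstrip("\r\n").split("\t")
        proteins = parse_proteins(fields, first_protein_col)
        if len(proteins) == 1:  # exactly one protein: the peptide is proteotypic
            kept.append((fields[:first_protein_col], proteins[0]))
    if not kept:
        raise NoProteotypicPeptidesError(
            "No proteotypic peptides found. This could be because all "
            "peptides are shared between proteins; check the protein "
            "columns of the input file."
        )
    return kept
```

With this filter, a row ending in `protA` followed by two empty padded columns is still assigned to `protA` alone, while a row listing `protA` and `protB` is still discarded as shared.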