Closed ofleitas closed 2 months ago
Hi @ofleitas,
Thanks for reaching out!
It seems like there is something in your mzIdentML file that cannot be read. It might be corrupt in some way. The program crashes while reading line 35 of the file. To investigate, you can open the mzIdentML file in any text editor (Notepad, Notepad++, VS Code), as long as the file is not too large.
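If the file is too large for an editor, a short script can inspect the line instead. The following is a minimal sketch, not part of ms2rescore: the function name `find_suspect_characters` is made up for illustration, and it assumes the file is meant to be UTF-8. It reports non-ASCII and control characters on a single line, which are the usual suspects when a parser fails at a specific position.

```python
def find_suspect_characters(path, line_number, encoding="utf-8"):
    """Return (column, character) pairs for suspect characters on one line.

    Hypothetical helper for illustration only. Bytes that are invalid in the
    given encoding are decoded to U+FFFD, so they show up in the result too.
    """
    with open(path, encoding=encoding, errors="replace") as handle:
        for current, line in enumerate(handle, start=1):
            if current == line_number:
                return [
                    (column, char)
                    for column, char in enumerate(line.rstrip("\n"), start=1)
                    # Flag anything outside plain ASCII, plus control characters
                    # other than tab.
                    if ord(char) > 127 or (ord(char) < 32 and char != "\t")
                ]
    raise ValueError(f"{path} has fewer than {line_number} lines")
```

Calling `find_suspect_characters("your_file.mzid", 35)` (the file name is a placeholder) returns an empty list if the line contains only plain ASCII text.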
If you are able and willing, you can also send us the file. I'd be happy to take a look.
Best, Ralf
Hello RalfG
I solved the problem associated with line 35; it seems it was caused by a special character. But now I am getting this error:
Adding DeepLC-derived features to PSMs.
Running DeepLC for PSMs from run (1/1): 20220322_ID_6552
...
Multiple modifications per site not supported in Peptide Record format.
Traceback (most recent call last):
File "ms2rescore\gui\function2ctk.py", line 301, in run
self.fn(*self.fn_args, **self.fn_kwargs)
File "ms2rescore\gui\app.py", line 637, in function
rescore(configuration=config)
File "ms2rescore\core.py", line 76, in rescore
fgen.add_features(psm_list)
File "ms2rescore\feature_generators\deeplc.py", line 163, in add_features
seq_df=self._psm_list_to_deeplc_peprec(psm_list_calibration)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "ms2rescore\feature_generators\deeplc.py", line 210, in _psm_list_to_deeplc_peprec
peprec = peptide_record.to_dataframe(psm_list)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "psm_utils\io\peptide_record.py", line 505, in to_dataframe
return pd.DataFrame([PeptideRecordWriter._psm_to_entry(psm) for psm in psm_list])
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "psm_utils\io\peptide_record.py", line 505, in
Hi @ofleitas,
Regarding the first issue: this was most likely an encoding problem. For future reference, a possible fix is to open the mzIdentML file in an editor such as Windows Notepad and save it again with "UTF-8" encoding specified.
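The same re-save can be scripted. This is a minimal sketch under stated assumptions: `resave_as_utf8` is a hypothetical helper, not an ms2rescore function, and the `cp1252` fallback is only a guess at the legacy Windows code page the file may have been written in. If the XML declaration at the top of the file names a different encoding, that declaration would also need to be updated by hand.

```python
def resave_as_utf8(source, target, fallback_encoding="cp1252"):
    """Write a UTF-8 copy of source and return the encoding used to read it.

    Hypothetical helper for illustration only.
    """
    with open(source, "rb") as handle:
        raw = handle.read()
    try:
        text = raw.decode("utf-8")
        used = "utf-8"
    except UnicodeDecodeError:
        # Assumption: the file was saved in a legacy Windows code page.
        text = raw.decode(fallback_encoding, errors="replace")
        used = fallback_encoding
    # newline="" keeps the original line endings untouched.
    with open(target, "w", encoding="utf-8", newline="") as handle:
        handle.write(text)
    return used
```

Writing to a separate target path keeps the original file intact in case the fallback encoding turns out to be the wrong guess.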
For the "Multiple modifications per site" error: I believe this issue was fixed in one of the latest releases. Could you check if the problem persists with the latest release?
Best, Ralf
I installed the latest release, and it solved the multiple modifications per site error. However, now I am getting this error:
Error occurred: index -3 is out of bounds for axis 0 with size 1
Glad the second issue was also solved by updating.
Can you provide some more information on the error? Could you paste the full log? Thanks!
Hello Ralf
Here is the full log:
Reading PSMs from PSM file (1/1):
'C:/Users/ofm83/OneDrive/Documents/Sericopelma_sp/PeptideShaker/Sericopelma_sp.mzid'...
Removed 0 PSMs with rank >= 10.
Found 11699 PSMs, of which 15.46% are decoys.
Non-mapped modifications found: {'Carbamidomethyl', 'Deamidated',
'Oxidation'}
This can be ignored if they are Unimod modification labels.
Found 8383 identified PSMs with rank <= 1 at 0.01 FDR before rescoring.
Adding basic features to PSMs.
Adding MS²PIP-derived features to PSMs.
Running MS²PIP for PSMs from run (1/1) `20220322_ID_6556`...
Processing spectra and peptides...
Adding DeepLC-derived features to PSMs.
Running DeepLC for PSMs from run (1/1): `20220322_ID_6556`...
Percolator output:
Percolator version 3.07.1, Build Date Jun 20 2024 13:21:08
Copyright (c) 2006-9 University of Washington. All rights reserved.
Written by Lukas Käll ***@***.***) in the
Department of Genome Sciences at the University of Washington.
Issued command:
percolator --results-psms
C:/Users/ofm83/OneDrive/Documents/Sericopelma_sp/ms2rescore/ms2rescore.percolator.psms.pout
--decoy-results-psms
C:/Users/ofm83/OneDrive/Documents/Sericopelma_sp/ms2rescore/ms2rescore.percolator.decoy.psms.pout
--results-peptides
C:/Users/ofm83/OneDrive/Documents/Sericopelma_sp/ms2rescore/ms2rescore.percolator.peptides.pout
--decoy-results-peptides
C:/Users/ofm83/OneDrive/Documents/Sericopelma_sp/ms2rescore/ms2rescore.percolator.decoy.peptides.pout
--results-proteins
C:/Users/ofm83/OneDrive/Documents/Sericopelma_sp/ms2rescore/ms2rescore.percolator.proteins.pout
--decoy-results-proteins
C:/Users/ofm83/OneDrive/Documents/Sericopelma_sp/ms2rescore/ms2rescore.percolator.decoy.proteins.pout
--weights
C:/Users/ofm83/OneDrive/Documents/Sericopelma_sp/ms2rescore/ms2rescore.percolator.weights.tsv
--verbose 1 --num-threads 16 --post-processing-tdc
C:/Users/ofm83/OneDrive/Documents/Sericopelma_sp/ms2rescore/ms2rescore.pin
Started Wed Aug 7 17:36:10 2024
Hyperparameters: selectionFdr=0.01, Cpos=0, Cneg=0, maxNiter=10
Finding protein decoy prefix for
C:/Users/ofm83/OneDrive/Documents/Sericopelma_sp/ms2rescore/ms2rescore.pin
Using protein decoy prefix ""
Concatenated search input detected and --post-processing-tdc flag set.
Applying target-decoy competition on Percolator scores.
Selecting Cpos by cross-validation.
Selecting Cneg by cross-validation.
Found 8383 test set positives with q<0.01 in initial direction
---Training with Cpos selected by cross validation, Cneg selected by cross
validation, initial_fdr=0.01, fdr=0.01
Found 8970 test set PSMs with q<0.01.
Selected best-scoring PSM per file+scan+expMass (target-decoy competition):
9890 target PSMs and 1809 decoy PSMs.
Tossing out "redundant" PSMs keeping only the best scoring PSM for each
unique peptide.
Calculating q values.
Final list yields 1845 target peptides with q<0.01.
Calculating posterior error probabilities (PEPs).
Removed 0 PSMs with rank >= 1.
Using 0 features:
Found 11699 PSMs.
- 9890 target PSMs and 1809 decoy PSMs detected.
Assigning confidence...
Performing target-decoy competition...
Keeping the best match per index columns...
- Found 11699 PSMs from unique spectra.
- Found 2421 unique peptides.
Assiging q-values to PSMs...
- Found 8970 PSMs with q<=0.01
Assiging PEPs to PSMs...
Assiging q-values to peptides...
- Found 1846 peptides with q<=0.01
Assiging PEPs to peptides...
Identified 587 (7.00%) more PSMs with rank <= 1 at 0.01 FDR after rescoring.
Writing output to
C:/Users/ofm83/OneDrive/Documents/Sericopelma_sp/ms2rescore/ms2rescore.psms.tsv...
❌ feature weights:
'C:/Users/ofm83/OneDrive/Documents/Sericopelma_sp/ms2rescore/ms2rescore.mokapot.weights.tsv'
❌ log:
'C:/Users/ofm83/OneDrive/Documents/Sericopelma_sp/ms2rescore/ms2rescore.log.txt'
Using 0 features:
Found 11699 PSMs.
- 9890 target PSMs and 1809 decoy PSMs detected.
Parsing FASTA files and digesting proteins...
- Parsed and digested 18880 proteins.
- 15 had no peptides.
- Retained 18865 proteins.
Matching target to decoy proteins...
Building protein groups...
- Aggregated 18865 proteins into 7621 protein groups.
No decoy sequences were found in the FASTA file.
- Creating decoy protein groups that mirror the target proteins.
Discarding shared peptides...
- Discarded 67490 peptides and 58 proteins groups.
- Retained 363859 peptides from 7563 protein groups.
Assigning confidence...
Performing target-decoy competition...
Keeping the best match per index columns...
- Found 11699 PSMs from unique spectra.
- Found 2421 unique peptides.
Mapping decoy peptides to protein groups...
92 out of 2421 peptides could not be mapped. Please check your digest
settings.
- Found 1016 unique protein groups.
Assiging q-values to PSMs...
- Found 8383 PSMs with q<=0.01
Assiging PEPs to PSMs...
index -3 is out of bounds for axis 0 with size 1
Traceback (most recent call last):
File "ms2rescore\gui\function2ctk.py", line 301, in run
self.fn(*self.fn_args, **self.fn_kwargs)
File "ms2rescore\gui\app.py", line 664, in function
rescore(configuration=config)
File "ms2rescore\core.py", line 169, in rescore
generate.generate_report(
File "ms2rescore\report\generate.py", line 87, in generate_report
confidence_before, confidence_after =
get_confidence_estimates(psm_list, fasta_file)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "ms2rescore\report\utils.py", line 72, in get_confidence_estimates
confidence[when] = lin_psm_dataset.assign_confidence(scores=scores)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "mokapot\dataset.py", line 607, in assign_confidence
return LinearConfidence(
^^^^^^^^^^^^^^^^^
File "mokapot\confidence.py", line 375, in __init__
self._assign_confidence(desc=desc)
File "mokapot\confidence.py", line 476, in _assign_confidence
_, pep = qvality.getQvaluesFromScores(
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "triqler\qvality.py", line 80, in getQvaluesFromScores
File "triqler\qvality.py", line 334, in splineEval
IndexError: index -3 is out of bounds for axis 0 with size 1
Hi @ofleitas,
Thanks for sharing the log. It seems that the issue occurs when calculating PEP values with qvality (called through Triqler, which in turn is called through Mokapot). However, I have not seen this problem before. If you are at liberty to share the input files that lead to this error, that would be very helpful. If I'm not mistaken, a *ms2rescore.psms.tsv file was already written before the error occurred? This file should suffice to help me understand the problem.
Thanks!
The IndexError occurred because the input scores (before rescoring) were all either 0 or 100 (PeptideShaker scores on this specific sample), from which PEPs cannot be calculated. The issue is addressed in #182 by catching the error and logging a descriptive warning. This fix will be part of the v3.1.2 release.
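The general pattern of catching the error and logging a warning can be sketched as follows. This is not the code from #182: `safe_peps` and the `estimate_peps` argument are hypothetical names, with `estimate_peps` standing in for whatever routine computes the PEPs.

```python
import logging

logger = logging.getLogger(__name__)


def safe_peps(scores, estimate_peps):
    """Return estimate_peps(scores), or None if PEPs cannot be estimated.

    Hypothetical wrapper for illustration only; estimate_peps is a stand-in
    callable, not an ms2rescore, mokapot, or triqler function.
    """
    try:
        return estimate_peps(scores)
    except IndexError as error:
        # Report how many distinct score values there were, since a score
        # column with only two values (e.g. 0 and 100) triggers this case.
        logger.warning(
            "Could not calculate PEPs (%s). The input contains only %d "
            "distinct score value(s).",
            error,
            len(set(scores)),
        )
        return None
```

Returning `None` lets the caller skip the PEP-dependent parts of the report rather than abort the whole run, at the cost of having to handle the missing values downstream.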
Hello
I am trying to run ms2rescore but get the following error:
What can I do?