compomics / ms2rescore

Modular and user-friendly platform for AI-assisted rescoring of peptide identifications
https://ms2rescore.readthedocs.io
Apache License 2.0

MaxQuant can have long-format modification notations; does not work in current implementation #41

Closed NoeGuilloy closed 2 years ago

NoeGuilloy commented 3 years ago

Hi,

I am running ms2pip through ms2rescore, and I get the following IndexError.

Merging results
multiprocessing.pool.RemoteTraceback:
"""
Traceback (most recent call last):
  File "/home/noeguill/anaconda3/lib/python3.7/multiprocessing/pool.py", line 121, in worker
    result = (True, func(*args, **kwds))
  File "/home/noeguill/anaconda3/lib/python3.7/site-packages/ms2pip/ms2pipC.py", line 316, in process_spectra
    modpeptide = apply_mods(peptide, mods, PTMmap)
  File "/home/noeguill/anaconda3/lib/python3.7/site-packages/ms2pip/ms2pipC.py", line 631, in apply_mods
    modpeptide[pos] = mod
IndexError: index 23 is out of bounds for axis 0 with size 19
"""

Any ideas?

Thanks a lot, Noé

NoeGuilloy commented 3 years ago

Hi, it looks like the error is coming from ms2rescore. I am using the ms2rescore MaxQuant pipeline.

I ran it again with a different sample and got the same error, this time with index 43 out of bounds.

Running grep '43|' on the msms.peprec, I found this strange line: the C is not at position 43, and the sequence has only 21 amino acids.

AML_P07_G1_F1.15742.15742 SMMQDREDQSILCTGESGAGK 43|Carbamidomethyl 3 ENSP00000216181.10 66.874 43.867 1

I looked up the same sequence in the msms.txt file:

SM(Oxidation (M))M(Oxidation (M))QDREDQSILCTGESGAGK

Here, if you count the characters of the variable modification labels as well, the C ends up at position 43, as in the peprec file. It seems that if a fixed modification comes after a variable modification, the fixed one is not placed at the right position in the peprec.
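The offset described above can be reproduced with a small sketch. This is not the parser used by ms2rescore; `parse_modified_sequence` is a hypothetical helper that only illustrates the difference between counting characters in the annotated string and counting residues once the (possibly nested) parenthesized labels are stripped.

```python
def parse_modified_sequence(modseq):
    """Split a MaxQuant-style modified sequence into (sequence, mods).

    Hypothetical helper for illustration only. `mods` is a list of
    (position, label) tuples, where position is the 1-based index of the
    residue that the label follows (0 for a label before the first residue).
    """
    residues = []
    mods = []
    label = []  # characters of the label currently being read
    depth = 0   # parenthesis nesting depth; 0 means outside any label
    for char in modseq.strip("_"):
        if char == "(":
            if depth > 0:
                label.append(char)  # a nested "(" is part of the label text
            depth += 1
        elif char == ")":
            depth -= 1
            if depth > 0:
                label.append(char)  # a nested ")" is part of the label text
            else:
                # Outermost ")" closes the label for the last residue seen
                mods.append((len(residues), "".join(label)))
                label = []
        elif depth > 0:
            label.append(char)
        else:
            residues.append(char)
    return "".join(residues), mods


raw = "SM(Oxidation (M))M(Oxidation (M))QDREDQSILCTGESGAGK"
sequence, mods = parse_modified_sequence(raw)

print(raw.index("C") + 1)       # 43: character offset in the annotated string
print(sequence.index("C") + 1)  # 13: residue position in the bare sequence
print(mods)                     # [(2, 'Oxidation (M)'), (3, 'Oxidation (M)')]
```

The 43 printed first matches the position written to the peprec file, while the cysteine is actually residue 13 of the 21-residue peptide.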

Am I doing something wrong, or do you know of a fix?

Thanks, Noé

RalfG commented 3 years ago

Hi Noé,

Thanks for using MS²PIP and MS²ReScore! This error could be the result of a bug that will be fixed in an upcoming release. The bug was introduced in v3.6.3, so could you try v3.6.2 instead and check whether that fixes the issue?

Thanks! Ralf

NoeGuilloy commented 3 years ago

Hi,

I couldn't use v3.6.2, as it wasn't working with ms2rescore. However, it seems that the version of MaxQuant I use (1.6.14) doesn't use the same two-letter codes for PTMs. I tried to change this in the ms2rescore config file, but that didn't work. Replacing the new labels with the two-letter codes in the msms.txt did work.
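A minimal sketch of that workaround, for illustration: it rewrites long-format labels in a modified-sequence string to the older two-letter codes. The function `shorten_labels` and the `LONG_TO_SHORT` mapping are hypothetical and not part of ms2rescore; the mapping is an assumption and would need to be extended to match the modifications in a given search.

```python
# Assumed mapping from long-format MaxQuant labels to two-letter codes.
# Extend it to cover the modifications configured in your own search.
LONG_TO_SHORT = {
    "Oxidation (M)": "ox",
    "Acetyl (Protein N-term)": "ac",
}


def shorten_labels(modseq, mapping=LONG_TO_SHORT):
    """Replace each "(long label)" in modseq with its "(short code)"."""
    for long_label, short_code in mapping.items():
        modseq = modseq.replace(f"({long_label})", f"({short_code})")
    return modseq


print(shorten_labels("SM(Oxidation (M))M(Oxidation (M))QDREDQSILCTGESGAGK"))
# SM(ox)M(ox)QDREDQSILCTGESGAGK
```

Applying this to the modified-sequence column of msms.txt before rescoring mirrors the manual edit described above; labels missing from the mapping are left unchanged.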

Also, when I wanted to use only the ms2pip pipeline + Percolator, I was forced to add the searchengine pipeline to the config file too. I don't know if that's intended.

Besides that, it works very well and gave me a lot of identifications while using a large database.

Thanks! Noé

RalfG commented 2 years ago

Hi Noé,

We have just released MS²PIP v3.8.0 and MS²Rescore v2.0.0. Both include numerous fixes.

"Also if I wanted to use only the Ms2pip pipeline + percolator I was forced to add the searchengine pipeline on the config file too, don't know if it's intended."

In MS²Rescore v2.0.0 you can fully customize which feature sets to use. I highly recommend using all available feature sets (["searchengine", "ms2pip", "rt"]), as the sets are complementary and each can give its own boost to Percolator's sensitivity.

Regarding the modification label issue: are the long-format labels (e.g. Oxidation (M)) a custom setting in MaxQuant, or are they the default in a newer version? If the latter, which version of MaxQuant did you use?

Best, Ralf

RalfG commented 2 years ago

Hi @NoeGuilloy,

As of v2.1.1, we support the long-format modification labels in MaxQuant msms.txt files. Let us know if you experience any other issues.

Best, Ralf