kusterlab / prosit

Prosit offers high-quality predicted MS2 spectra for any organism and protease, as well as iRT prediction. If Prosit is helpful for your research, please cite "Gessulat, Schmidt et al. 2019", DOI 10.1038/s41592-019-0426-7
https://www.proteomicsdb.org/prosit/
Apache License 2.0

Why do some entries have a parenthesis after the fragment charge state? #39

Closed gsaxena888 closed 4 years ago

gsaxena888 commented 4 years ago

In some entries, I see a closing parenthesis after what I believe is the fragment charge number, e.g. y2^2), whereas in other entries there is none, e.g. b10^2/. Could you clarify what the parenthesis means and why it appears in some entries but not others? The full predicted spectrum from which these two examples were taken is shown below:

Name: AAEGADTTGATPK/3
MW: 397.19469168128995
Comment: Parent=397.19469168128995 Collision_energy=35.0 Mods=0 ModString=AAEGADTTGATPK///3 iRT=-19.559999465942383
Num peaks: 26
147.11281       0.32989058      "y1/0.0ppm"
72.04439        0.047566075     "b1/0.0ppm"
244.16557       1.0     "y2/0.0ppm"
122.586426      0.0008872726    "y2^2)/0.0ppm"
143.0815        0.6619359       "b2/0.0ppm"
345.21326       0.24960361      "y3/0.0ppm"
272.12408       0.062890545     "b3/0.0ppm"
416.25037       0.048209973     "y4/0.0ppm"
329.14557       0.08895395      "b4/0.0ppm"
165.07642       0.0023177553    "b4^2)/0.0ppm"
473.27182       0.16780965      "y5/0.0ppm"
237.13956       0.0058048028    "y5^2)/0.0ppm"
400.18268       0.02851274      "b5/0.0ppm"
574.3195        0.093369454     "y6/0.0ppm"
515.2096        0.035596747     "b6/0.0ppm"
258.10846       0.002202321     "b6^2)/0.0ppm"
675.3672        0.03287529      "y7/0.0ppm"
616.2573        0.0065132547    "b7/0.0ppm"
308.6323        0.003955793     "b7^2)/0.0ppm"
790.3941        0.007055764     "y8/0.0ppm"
359.15613       0.004619954     "b8^2)/0.0ppm"
387.66687       0.0025268008    "b9^2)/0.0ppm"
918.4527        0.003008747     "y10/0.0ppm"
423.18542       0.005045848     "b10^2/0.0ppm"
473.70926       0.0041979477    "b11^2/0.0ppm"
522.23566       0.002945542     "b12^2/0.0ppm"
tkschmidt commented 4 years ago

Probably my fault. The parser that generates the comments uses too few characters to store the field. It will hopefully be fixed in the next release (coming soon™). Does it break any of your pipelines?
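
For pipelines that do trip over the stray character, a tolerant parser is one possible workaround until a fix lands. This is a minimal sketch, assuming the annotation layout seen in the spectrum above (ion type, ordinal, optional `^charge`, an optional stray `)`, then `/<error>ppm`). The name `parse_annotation` and the regular expression are illustrative and not part of Prosit.

```python
import re

# Assumed layout, inferred from the spectrum in this thread:
#   <b|y><ordinal>[^<charge>][stray ")"]/<mass error>ppm
# The surrounding double quotes are optional so raw MSP fields can be passed in.
ANNOTATION = re.compile(
    r'^"?([by])(\d+)(?:\^(\d+))?\)?/(-?\d+(?:\.\d+)?)ppm"?$'
)


def parse_annotation(text):
    """Return (ion_type, ordinal, charge, ppm_error) for one peak annotation."""
    match = ANNOTATION.match(text.strip())
    if match is None:
        raise ValueError(f"unrecognised annotation: {text!r}")
    ion_type, ordinal, charge, ppm = match.groups()
    # A missing ^charge suffix is treated as charge 1.
    return ion_type, int(ordinal), int(charge) if charge else 1, float(ppm)


if __name__ == "__main__":
    for field in ('"y2^2)/0.0ppm"', '"b10^2/0.0ppm"', '"y1/0.0ppm"'):
        print(field, "->", parse_annotation(field))
```

With this pattern, entries with and without the extra parenthesis parse to the same fields, so downstream code never sees the difference.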

gsaxena888 commented 4 years ago

I just ignored the "extra" parentheses, so it didn't break the pipeline (assuming ignoring them is OK). Would this "next release" also include 1) the latest Prosit 2020 model, or 2) the logic to support neutral-loss fragments and modifications other than oxidation of methionine? (FYI: we're currently testing the code on a Google Cloud virtual machine with a single Nvidia GPU.)
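
One way to sanity-check that ignoring the parenthesis is safe: the entries that carry it should still sit at the m/z expected for a doubly charged version of the corresponding singly charged fragment. A rough sketch, using peak values copied from the spectrum above; the helper `mz_at_charge` and the 1e-3 tolerance are my own assumptions, not anything from Prosit.

```python
PROTON_MASS = 1.007276  # Da, mass of a proton


def mz_at_charge(singly_charged_mz, charge):
    """Convert the m/z of a 1+ fragment to the m/z of the same fragment at `charge`."""
    return (singly_charged_mz + (charge - 1) * PROTON_MASS) / charge


# (label, m/z of the 1+ peak, m/z of the peak annotated with "^2)")
pairs = [
    ("y2", 244.16557, 122.586426),
    ("b4", 329.14557, 165.07642),
    ("y5", 473.27182, 237.13956),
    ("b6", 515.2096, 258.10846),
    ("b7", 616.2573, 308.6323),
]

for label, mz1, mz2_listed in pairs:
    expected = mz_at_charge(mz1, 2)
    status = "ok" if abs(expected - mz2_listed) < 1e-3 else "MISMATCH"
    print(f"{label}: expected {expected:.5f}, listed {mz2_listed:.5f} -> {status}")
```

All five pairs agree within the tolerance, which is consistent with the parenthesis being a formatting artifact rather than a marker with its own meaning.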