kusterlab / prosit

Prosit offers high-quality predicted MS2 spectra for any organism and protease, as well as iRT prediction. If Prosit is helpful for your research, please cite "Gessulat, Schmidt et al. 2019", DOI 10.1038/s41592-019-0426-7
https://www.proteomicsdb.org/prosit/
Apache License 2.0

iRT definition #49

Closed hcji closed 3 years ago

hcji commented 3 years ago

Hi,

Thanks for your great work. I have a question: is the iRT predicted by Prosit defined in the same way as in the paper "Using iRT, a normalized retention time for more targeted measurement of peptides" (doi: 10.1002/pmic.201100463)?

I predicted the iRT of the iRT peptides with Prosit and the pre-trained model, and got:

LGGNEQVTR | -9.96795
GAGSSEPVTGLDAK | 24.7797
VEATFGVDESNAK | 37.5635
YILAGVENSK | 48.0256
TPVISGGPYEYR | 53.7346
TPVITGAPYEYR | 59.9253
DGLDAASYYAPVR | 72.0672
ADVTPADFSEWSK | 82.2792
GTFIIDPGGVIR | 108.194
GTFIIDPAAVIR | 126.755
LFLQFGAQGSPFLK | 136.645

But the reference gives:

LGGNEQVTR | -24.92
GAGSSEPVTGLDAK | 0
VEATFGVDESNAK | 12.39
YILAGVENSK | 19.79
TPVISGGPYEYR | 28.71
TPVITGAPYEYR | 33.38
DGLDAASYYAPVR | 42.26
ADVTPADFSEWSK | 54.62
GTFIIDPGGVIR | 70.52
GTFIIDPAAVIR | 87.23
LFLQFGAQGSPFLK | 100

There is a significant difference.

tkschmidt commented 3 years ago

Hey hcji, iRT is a concept for referencing retention time against a given set of peptides. In our case, we took two peptides within the ProteomeTools project and defined them as 0 (peptide A) and 100 (peptide B) on our iRT scale. The rest of the data was then linearly scaled to those two reference points in all measurements and afterwards used for training and prediction. Escher did something similar with their peptide set.
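The two-point scaling described above can be sketched as follows. This is only an illustration of the idea, not the ProteomeTools code: the function name `to_irt` and the retention times are hypothetical, chosen just to show how two anchor peptides pin the scale at 0 and 100.

```python
def to_irt(rt, rt_zero, rt_hundred):
    """Linearly rescale a retention time so that rt_zero -> 0 and rt_hundred -> 100."""
    if rt_hundred == rt_zero:
        raise ValueError("the two anchor peptides need different retention times")
    return 100.0 * (rt - rt_zero) / (rt_hundred - rt_zero)


# Hypothetical retention times (minutes) of the two anchor peptides in one run.
rt_peptide_a = 12.0  # defined as iRT 0
rt_peptide_b = 52.0  # defined as iRT 100

# Any other peptide in the same run is placed on the scale relative to the anchors.
for rt in (12.0, 32.0, 52.0, 60.0):
    print(rt, "->", to_irt(rt, rt_peptide_a, rt_peptide_b))
```

Because the result depends on which two peptides serve as anchors, two iRT scales built on different anchor peptides will generally differ by a linear transformation.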

Now, if you want to apply it to your own measured iRT or RT, you have to transfer the predicted iRT into your iRT/RT space. To do so, you can use simple linear regression to create a fitting function between iRT_Prosit and your own iRT.

So by using 0.8485 * predicted_prosit_irt - 18.537 you can reproduce the Escher iRT values reasonably well.
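As a rough check, the linear fit can be reproduced with ordinary least squares on the eleven value pairs quoted in the question. The sketch below uses plain Python; the helper names `fit_line` and `prosit_to_reference` are illustrative and not part of Prosit.

```python
# Prosit predictions and reference (Escher et al.) iRT values from the question,
# in the same peptide order (LGGNEQVTR ... LFLQFGAQGSPFLK).
prosit_irt = [-9.96795, 24.7797, 37.5635, 48.0256, 53.7346, 59.9253,
              72.0672, 82.2792, 108.194, 126.755, 136.645]
reference_irt = [-24.92, 0.0, 12.39, 19.79, 28.71, 33.38,
                 42.26, 54.62, 70.52, 87.23, 100.0]


def fit_line(x, y):
    """Ordinary least squares fit of y = slope * x + intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


slope, intercept = fit_line(prosit_irt, reference_irt)


def prosit_to_reference(predicted_prosit_irt):
    """Map a Prosit iRT onto the reference iRT scale using the fitted line."""
    return slope * predicted_prosit_irt + intercept


print(f"slope={slope:.4f}, intercept={intercept:.3f}")
```

For your own data, replace `reference_irt` with the measured iRT or RT values of peptides that you have also predicted with Prosit, then apply the fitted line to all remaining predictions.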

Most commonly, slightly more advanced methods such as LOESS are used to create this fitting function.
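To give an idea of what a LOESS-style fit does, here is a minimal pure-Python sketch of local linear regression with tricube weights, applied to the value pairs from the question. It is a simplified illustration under assumed settings (the function name `loess_predict` and the neighbourhood fraction `frac=0.6` are arbitrary choices), not the procedure used for Prosit; in practice an established implementation from a statistics library would be preferable.

```python
prosit_irt = [-9.96795, 24.7797, 37.5635, 48.0256, 53.7346, 59.9253,
              72.0672, 82.2792, 108.194, 126.755, 136.645]
reference_irt = [-24.92, 0.0, 12.39, 19.79, 28.71, 33.38,
                 42.26, 54.62, 70.52, 87.23, 100.0]


def loess_predict(x0, xs, ys, frac=0.6):
    """Predict y at x0 from a weighted linear fit to the nearest points."""
    n = len(xs)
    k = min(n, max(3, int(round(frac * n))))  # size of the local neighbourhood
    dists = sorted(abs(x - x0) for x in xs)
    # Bandwidth = distance to the k-th nearest point, slightly inflated so that
    # the nearest points always receive a strictly positive weight.
    bandwidth = dists[k - 1] * 1.001 or 1.0
    weights = []
    for x in xs:
        u = abs(x - x0) / bandwidth
        weights.append((1.0 - u ** 3) ** 3 if u < 1.0 else 0.0)  # tricube kernel
    sw = sum(weights)
    mean_x = sum(w * x for w, x in zip(weights, xs)) / sw
    mean_y = sum(w * y for w, y in zip(weights, ys)) / sw
    sxx = sum(w * (x - mean_x) ** 2 for w, x in zip(weights, xs))
    if sxx == 0.0:
        return mean_y  # all weighted points share one x value
    sxy = sum(w * (x - mean_x) * (y - mean_y)
              for w, x, y in zip(weights, xs, ys))
    return mean_y + (sxy / sxx) * (x0 - mean_x)


for x in (0.0, 50.0, 100.0):
    print(x, "->", round(loess_predict(x, prosit_irt, reference_irt), 2))
```

Unlike the single straight line, the local fit can follow a mild curvature in the relationship between the two scales, at the cost of needing enough calibration peptides spread across the gradient.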

I hope this answer helps. You can find more information about the fitting of the iRT model in our methods section, but it is not meant as a fixed