kusterlab / prosit

Prosit offers high quality MS2 predicted spectra for any organism and protease as well as iRT prediction. When using Prosit is helpful for your research, please cite "Gessulat, Schmidt et al. 2019" DOI 10.1038/s41592-019-0426-7
https://www.proteomicsdb.org/prosit/
Apache License 2.0

Collision Energy Optimization #20

Closed jonasfoe closed 5 years ago

jonasfoe commented 5 years ago

Dear Prosit team,

I have been trying to replicate a workflow for optimizing collision energies as suggested in this presentation: https://skyline.ms/_webdav/home/software/Skyline/events/2019%20User%20Group%20Meeting%20at%20ASMS/%40files/Presentations/02-Schmidt.pdf

The main figure for this is: [image: prosit_figure]. The image suggests picking a collision energy of 28.

When I try to replicate this for an example peptide of mine, I get e.g. [image: prosit_figure_ALNEKLVNL]. This might tempt me to pick an energy of 28.

When I measure empirically, though, and adjust the visualization a bit, the situation looks different: [image: ALNEKLVNL_nce_test]. While the strongest fragments per spectrum seem similar, the measurement reveals that the absolute fragment intensities also depend strongly on the NCE.

The fact that Prosit only returns relative intensities makes it hard to pick the best collision energy for my work. Do you think it would be possible for Prosit to return intensities on an arbitrary scale that is somewhat comparable between spectra?

gessulat commented 5 years ago

Dear @jonasfoe,

thank you for digging so deep into Prosit's behavior, that graph looks interesting! :)

Mid-term, we do not think that we could build a model that can predict absolute fragment intensities, for several reasons:

  1. To get the best performance from the model we are using, numerical values have to be normalized between 0 and 1. We would need a re-transformation in any case to get to absolute intensities.
  2. Absolute intensities depend on various parameters, many of which we have incomplete control over.
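To illustrate point 1, here is a minimal sketch of why a spectrum normalized to the 0 to 1 range no longer carries absolute scale. It assumes base-peak normalization (dividing by the most intense fragment), which is an assumption for illustration and not confirmed in this thread as Prosit's exact scheme; the function name and the intensity values are hypothetical.

```python
def base_peak_normalize(intensities):
    """Scale a spectrum so that its most intense fragment equals 1.0."""
    peak = max(intensities)
    if peak <= 0:
        # An empty or all-zero spectrum has no meaningful relative scale.
        return [0.0 for _ in intensities]
    return [value / peak for value in intensities]


# Hypothetical absolute fragment intensities of one peptide at two NCEs.
# The second spectrum has the same pattern but is ten times weaker.
spectrum_a = [2.0e6, 1.0e6, 5.0e5]
spectrum_b = [2.0e5, 1.0e5, 5.0e4]

relative_a = base_peak_normalize(spectrum_a)
relative_b = base_peak_normalize(spectrum_b)

# Both spectra look identical after normalization, so the tenfold
# difference in absolute signal cannot be recovered from them.
print(relative_a)
print(relative_b)
```

Any re-transformation back to absolute intensities would therefore need extra information from outside the normalized spectrum.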

For example

What you could consider is learning a transformation function that is specific to your data and applying it to the relative intensities predicted by Prosit. Unfortunately, I am unsure whether this would help in your application.
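One very simple form such a data-specific transformation could take is a single scale factor per NCE, fitted by least squares between predicted relative intensities and measured absolute intensities of matched fragments. The sketch below is only an illustration under that assumption: the function name, the calibration values, and the choice of a scalar model are all hypothetical, nothing here is part of Prosit, and any fitted factor would only hold for the instrument and sample it was fitted on.

```python
def fit_scale(predicted_rel, measured_abs):
    """Least-squares scale s that minimises sum((s * r - a) ** 2)."""
    numerator = sum(r * a for r, a in zip(predicted_rel, measured_abs))
    denominator = sum(r * r for r in predicted_rel)
    if denominator == 0:
        raise ValueError("predicted intensities are all zero")
    return numerator / denominator


# Hypothetical calibration data for one peptide:
# NCE -> (predicted relative intensities, measured absolute intensities),
# with fragments listed in the same order in both lists.
calibration = {
    20: ([1.0, 0.5, 0.2], [4.0e6, 2.0e6, 8.0e5]),
    28: ([1.0, 0.8, 0.4], [2.0e6, 1.6e6, 8.0e5]),
    36: ([0.6, 1.0, 0.9], [3.0e5, 5.0e5, 4.5e5]),
}

# One fitted scale factor per NCE.
scales = {nce: fit_scale(rel, meas) for nce, (rel, meas) in calibration.items()}

# Rescale the predictions so that spectra become comparable across NCEs.
rescaled = {
    nce: [scales[nce] * r for r in rel] for nce, (rel, _) in calibration.items()
}

# Pick the NCE that maximises the rescaled intensity of one target fragment.
target_fragment = 1
best_nce = max(rescaled, key=lambda nce: rescaled[nce][target_fragment])
print(scales, best_nce)
```

In practice the scale would have to generalise from calibration peptides to peptides that were never measured, for example by modelling it as a function of NCE, and whether that works is exactly the open question here.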