vdemichev / DiaNN

DIA-NN - a universal automated software suite for DIA proteomics data analysis.

Library free search unknown modifications #998

Open RobbinBouwmeester opened 2 months ago

RobbinBouwmeester commented 2 months ago

Dear @vdemichev,

I was checking the predicted library prior to a search and, as expected, some of the (unknown) modifications do not have predictions that account for the modification (e.g., --var-mod UniMod:1363,68.026215,K). As a result, the modification is ignored. I believe this behavior is as expected.

After searching a file, the output library does contain a value for the predicted retention time. If I am right, this value is obtained from the searched data to create a new spectral library. However, this predicted retention time can be quite different from the apex that DIA-NN reports as observed (in the single file I searched).

So I wonder whether the predicted value is actually predicted with a model or whether an observation from the raw file is used. If the latter is the case, how is the apex determined? And why is there a discrepancy with the reported apex?

Thank you!

Kind regards,

Robbin

vdemichev commented 2 months ago

Hi Robbin,

Yes, please use --strip-unknown-mods to make DIA-NN ignore the modification and predict as if the peptide were unmodified. This is of course suboptimal, but we have also made this work better in beta 39, available here: https://osf.io/q8kfc/?view_only=5e77d3c62563468280fd09265583dbbd.
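
To illustrate what "predict as if the peptide was unmodified" means at the sequence level, here is a minimal Python sketch. It is not DIA-NN code: the function name, the regular expression and the `KNOWN_MODS` set are all assumptions made for illustration, and the set of modifications the real predictor supports may differ.

```python
import re

# Hypothetical set of modification IDs that a predictor supports.
# This is an assumption for illustration, not DIA-NN's actual list.
KNOWN_MODS = frozenset({"UniMod:4", "UniMod:35"})

# Matches modification tags written as "(UniMod:<number>)".
MOD_PATTERN = re.compile(r"\((UniMod:\d+)\)")


def strip_unknown_mods(peptide, known=KNOWN_MODS):
    """Remove modification tags that are not in `known`; keep everything else."""
    def keep_or_drop(match):
        # Keep the whole tag if its ID is known, otherwise replace it with nothing.
        return match.group(0) if match.group(1) in known else ""
    return MOD_PATTERN.sub(keep_or_drop, peptide)


# The unknown UniMod:1363 tag is dropped; the known UniMod:4 tag stays.
print(strip_unknown_mods("AC(UniMod:4)DEK(UniMod:1363)R"))  # AC(UniMod:4)DEKR
```

The stripped sequence is what a predictor would see, so any retention time shift caused by the unknown modification itself is not reflected in the prediction.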

Predicted.RT is the aligned reference RT from the library. That is, if you are looking at the output of searching with a predicted lib without MBR, then the reference RT in this library is purely an in silico prediction. DIA-based empirical libs generated by DIA-NN, however, contain experimental RTs (for each peptide, the RT from the run in which it was identified most confidently), aligned to the reference RT scale of the library that was used for searching. So if you generate an empirical lib by searching data with an in silico lib, the RTs in that empirical lib will be the empirical RTs aligned to the in silico scale, which in DIA-NN roughly corresponds to the iRT scale. Hope this helps.
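
As a rough illustration of why a library RT on the reference scale need not match the apex reported in run minutes, the sketch below maps an observed RT onto a reference scale by piecewise-linear interpolation between anchor peptides. This is only a stand-in technique under stated assumptions: DIA-NN's actual alignment method is not described in this thread, and the function name and anchor values are made up.

```python
from bisect import bisect_right


def align_to_reference(rt, anchors):
    """Map an observed RT onto the reference scale (piecewise-linear).

    anchors: (observed_rt, reference_rt) pairs with strictly increasing
    observed_rt. Values outside the anchor range are clamped.
    """
    xs = [a[0] for a in anchors]
    ys = [a[1] for a in anchors]
    if rt <= xs[0]:
        return ys[0]
    if rt >= xs[-1]:
        return ys[-1]
    # Index of the first anchor whose observed RT is greater than rt.
    i = bisect_right(xs, rt)
    x0, x1 = xs[i - 1], xs[i]
    y0, y1 = ys[i - 1], ys[i]
    # Linear interpolation between the two neighbouring anchors.
    return y0 + (y1 - y0) * (rt - x0) / (x1 - x0)


# Made-up anchors: run time in minutes -> reference (iRT-like) units.
anchors = [(10.0, 0.0), (20.0, 40.0), (40.0, 100.0)]
print(align_to_reference(15.0, anchors))  # 20.0
```

In this toy example an apex at 15.0 minutes is stored as 20.0 on the reference scale, so the two numbers differ even though they describe the same elution.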

Best, Vadim