vdemichev / DiaNN

DIA-NN - a universal automated software suite for DIA proteomics data analysis.

Cannot perform mass calibration, too few confidently identified precursors #1026

Closed Vectolyser closed 2 months ago

Vectolyser commented 4 months ago

Hi, because a DDA library limits how many proteins can be identified, I tried using my own in-silico trypsin digestion tool to randomly create some peptides, generate a library from them, and use it to analyse DIA data in diaPASEF format. However, no matter how large the library is (up to 200,000 precursors at a time), not a single precursor is identified. DIA-NN reports “Cannot perform mass calibration, too few confidently identified precursors” and “Cannot perform MS1 mass calibration, too few confidently identified precursors”.
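For readers unfamiliar with the in-silico digestion step, here is a minimal sketch of the usual trypsin rule (cleave after K or R unless the next residue is P). It is not the tool used in this issue; the function names, the length limits and the example sequence are all assumptions chosen for illustration.

```python
def cleavage_pieces(protein):
    """Split a protein after every K or R that is not followed by P."""
    pieces, start = [], 0
    for i, aa in enumerate(protein):
        nxt = protein[i + 1] if i + 1 < len(protein) else ""
        if aa in "KR" and nxt != "P":
            pieces.append(protein[start:i + 1])
            start = i + 1
    if start < len(protein):
        # Keep the C-terminal remainder when the protein does not end in K/R.
        pieces.append(protein[start:])
    return pieces


def tryptic_digest(protein, missed_cleavages=0, min_len=7, max_len=30):
    """Return tryptic peptides, allowing up to `missed_cleavages` skipped sites.

    The length limits are illustrative defaults, not DIA-NN settings.
    """
    pieces = cleavage_pieces(protein)
    peptides = []
    for i in range(len(pieces)):
        last = min(i + missed_cleavages, len(pieces) - 1)
        for j in range(i, last + 1):
            pep = "".join(pieces[i:j + 1])
            if min_len <= len(pep) <= max_len:
                peptides.append(pep)
    return peptides


# Hypothetical example sequence, not taken from the attached library.
example = "MKWVTFISLLRPAAKGGGR"
peptides = tryptic_digest(example, missed_cleavages=0, min_len=1)
```

Peptides produced this way are only expected to be found if the source proteins are actually present in the sample.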

The LibraryIntensity and NormalizedRetentionTime values of the precursors are generated with a deep learning model. The model itself seems fine: when I use it to predict these values for the precursors present in the DDA library that matches this DIA data (rather than using the DDA library itself), peptides are identified normally. The PrecursorMz and ProductMz values are also correct. However, I did not add the AverageExperimentalRetentionTime, GeneName, PrecursorIonMobility, or FragmentLossType columns, as they do not seem to be required.
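One way to double-check PrecursorMz and ProductMz values is to recompute them from the peptide sequence. The sketch below uses standard monoisotopic residue masses and assumes unmodified peptides (a fixed cysteine carbamidomethylation, for instance, would add about 57.02146 Da per C). The function names are hypothetical and this is not how DIA-NN computes masses internally.

```python
# Monoisotopic residue masses in Da (unmodified amino acids).
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.010565
PROTON = 1.007276


def precursor_mz(sequence, charge):
    """m/z of the intact, unmodified peptide at the given charge."""
    mass = sum(RESIDUE_MASS[aa] for aa in sequence) + WATER
    return (mass + charge * PROTON) / charge


def y_ion_mz(sequence, n, charge=1):
    """m/z of the y-ion made of the last n residues (no neutral loss)."""
    mass = sum(RESIDUE_MASS[aa] for aa in sequence[-n:]) + WATER
    return (mass + charge * PROTON) / charge


def ppm_error(observed, expected):
    """Signed deviation of a library value from the recomputed one, in ppm."""
    return (observed - expected) / expected * 1e6


# Compare a (hypothetical) library value against the recomputed m/z.
library_value = 400.6873
error = ppm_error(library_value, precursor_mz("PEPTIDE", 2))
```

A systematic error of many ppm across the whole library would point to a mass-calculation problem, whereas small errors would suggest the issue lies elsewhere, for example in the retention times.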

I also think the depth of the DIA data is sufficient: when I merged its DDA library with other libraries (even from different species), many more peptides were identified.

So what is going wrong? I would be extremely grateful for any help. Attached are the library I created and the run log. 10.log.txt

silico_hybrid_library_10k_otherpro.zip

vdemichev commented 4 months ago

Hi,

You can narrow down the source of the problem. If you have a suitable public library, try using it to analyse the raw file in question, or run a library-free search. If that works, there is likely a mistake in how your library was generated.

You can also have DIA-NN convert the library you generated into its own .tsv format (just specify your library as input and click 'Generate spectral library', without specifying any raw files). You can then look at that .tsv to check whether DIA-NN read your library correctly.
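A quick way to inspect a tab-separated library, whether the generated one or the converted .tsv, is to list its header and count its rows. The sketch below is only an illustration: the expected column names are the ones mentioned in this thread, the names DIA-NN writes in its own .tsv may differ, and `summarize_library` is a hypothetical helper.

```python
import csv
import io

# Column names mentioned in this thread; treat the list as an assumption.
EXPECTED_COLUMNS = [
    "PrecursorMz", "ProductMz", "LibraryIntensity", "NormalizedRetentionTime",
]


def summarize_library(handle, expected=EXPECTED_COLUMNS):
    """Report header columns, missing expected columns, and row count."""
    reader = csv.DictReader(handle, delimiter="\t")
    present = list(reader.fieldnames or [])
    missing = [name for name in expected if name not in present]
    rows = sum(1 for _ in reader)
    return {"present": present, "missing": missing, "rows": rows}


# Tiny in-memory example; for a real file use open(path, newline="").
demo = io.StringIO(
    "PrecursorMz\tProductMz\tLibraryIntensity\n"
    "400.6873\t148.0604\t1.0\n"
    "400.6873\t263.0874\t0.5\n"
)
summary = summarize_library(demo)
```

If the row count or the columns differ from what was written to the library, the file was probably not parsed as intended.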

“to randomly create some peptides”

Why would random peptides be found in the data?

Best, Vadim

Vectolyser commented 2 months ago

Thank you for your response. Because the DDA library limits the depth of the DIA analysis, I tried to expand it by adding new in-silico peptides, whose properties were predicted with a deep learning model. After I corrected the training data for the RT model, the newly generated, larger extended library did identify more peptides than before, confirming my hypothesis.