vdemichev / DiaNN

DIA-NN - a universal automated software suite for DIA proteomics data analysis.

long library processing for analysis #748

Open giovanni-cro opened 1 year ago

giovanni-cro commented 1 year ago

Dear developers,

I am new to DIA-NN and was wondering whether I am doing something wrong with the following library generation:

1) I build a spectral library from a UniProt FASTA file, selecting FASTA digest.
2) I use this library (format .predicted.speclib) to match the spectra in the .RAW file. Step 1) took about 90 minutes for the library to be digested, and step 2) took another 90 minutes (as in step 1) plus 10-20 minutes to process the actual samples.
3) I then tried the library generated as output after processing the samples (extension .tsv), and this time the library initialization took 30 seconds. The proteins identified with the library from 3) were almost as many as those identified with the library from 2).

Apart from the big advantage in time, is there a reason to choose one library over the other?

Thank you in advance, Giovanni

vdemichev commented 1 year ago

Hi Giovanni,

Please see the docs on library-free search and MBR. I think step 2 likely includes unnecessary spectra prediction (the deep learning option should be unchecked for this). Step 3 will give the optimal results, but you can combine steps 2 and 3 automatically with MBR.

Best, Vadim
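
For anyone running this from the command line rather than the GUI, here is a rough sketch of the two runs discussed above, written as a small Python helper that only builds the argument lists. It assumes the flags as I understand them from the DIA-NN README (`--fasta-search`, `--predictor`, `--gen-spec-lib`, `--out-lib`, `--lib`, `--f`, `--out`, and `--reanalyse` for MBR); please check them against the documentation for your installed version. The executable name, the function names and all file names are placeholders, not anything from this thread.

```python
import shlex
from typing import List, Sequence

# Assumption: name or path of the DIA-NN executable on your system.
DIANN_EXE = "diann.exe"


def predict_library_cmd(fasta: str, out_lib: str) -> List[str]:
    """Step 1: in silico FASTA digest plus deep-learning prediction, no raw files."""
    return [
        DIANN_EXE,
        "--fasta", fasta,
        "--fasta-search",   # digest the FASTA in silico
        "--predictor",      # deep-learning prediction (the slow part)
        "--gen-spec-lib",   # write a spectral library
        "--out-lib", out_lib,
    ]


def search_cmd(raw_files: Sequence[str], lib: str, report: str,
               mbr: bool = True) -> List[str]:
    """Step 2: search raw files against an existing library.

    --fasta-search and --predictor are deliberately left out, so the
    digest and prediction from step 1 are not repeated.
    """
    cmd = [DIANN_EXE]
    for raw in raw_files:
        cmd += ["--f", raw]
    cmd += ["--lib", lib, "--out", report]
    if mbr:
        # MBR: build a library from the first pass and reanalyse with it,
        # which combines steps 2 and 3 in a single run.
        cmd.append("--reanalyse")
    return cmd


if __name__ == "__main__":
    # Hypothetical file names, for illustration only.
    step1 = predict_library_cmd("uniprot.fasta", "library.tsv")
    step2 = search_cmd(["sample1.raw", "sample2.raw"],
                       "library.predicted.speclib", "report.tsv")
    print(shlex.join(step1))
    print(shlex.join(step2))
```

The printed command lines can be pasted into a terminal or passed to `subprocess.run`. The point of splitting it this way is that the expensive prediction happens once, and every later search only loads the saved library.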