vdemichev / DiaNN

DIA-NN - a universal automated software suite for DIA proteomics data analysis.

Unexpected results when comparing ddaPASEF to predicted library #474

Open JoergDoellinger opened 2 years ago

JoergDoellinger commented 2 years ago

Dear Vadim,

thanks for the amazing software!!!

I just reanalyzed diaPASEF data from the Mann Lab (quadruplicate HeLa data acquired at 30 SPD, from https://www.mcponline.org/article/S1535-9476(22)00087-1/fulltext). The original paper used a ddaPASEF library, while I used an in silico library predicted by DIA-NN from a UniProt proteome FASTA. When comparing the stats.tsv files, the ddaPASEF library gives more protein IDs (~7700 vs ~7000 proteins for the predicted library). However, when comparing the pg_matrix.tsv files the situation is reversed and the predicted library performs better (8504 proteins) than the ddaPASEF library (8055 proteins). I understand that the protein IDs in stats.tsv and pg_matrix.tsv differ because different q-value filters are applied, but why does the ranking of the two libraries change between stats.tsv and pg_matrix.tsv?

Best regards

Joerg

vdemichev commented 2 years ago

Hi Joerg,

Yes, a library can perform better than library-free analysis when analysing individual runs; this is fully expected. What does the log look like? Was DIA-NN 1.8.1 used with default settings, or DIA-NN 1.8 with --relaxed-prot-inf? If not, this would explain the discrepancy. And yes, the filtering is naturally different; also, in one case the count is of unique proteins and in the other of protein groups.

Best, Vadim
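
For anyone who wants to check where such count differences come from, here is a minimal sketch that counts protein groups in a pg_matrix-style table and in a long-format report under different q-value filters. The column names (`Run`, `Protein.Group`, `PG.Q.Value`, `Global.PG.Q.Value`, and the annotation columns) are assumptions about the output layout and should be checked against your own file headers. The thread does not state which q-value filter underlies which file, and the inline data are toy values, not results from the dataset above.

```python
import csv
import io

# Annotation columns assumed to precede the per-run quantity columns in a
# pg_matrix-style table (an assumption; check your own file's header).
ANNOTATION_COLUMNS = {
    "Protein.Group", "Protein.Ids", "Protein.Names",
    "Genes", "First.Protein.Description",
}

# Toy data for illustration only.
TOY_MATRIX = (
    "Protein.Group\tGenes\trun1\trun2\n"
    "P1\tG1\t10.0\t12.0\n"
    "P2;P3\tG2\t\t5.0\n"
    "P4\tG4\t7.0\t\n"
)
TOY_REPORT = (
    "Run\tProtein.Group\tPG.Q.Value\tGlobal.PG.Q.Value\n"
    "run1\tP1\t0.001\t0.001\n"
    "run1\tP2;P3\t0.04\t0.005\n"
    "run1\tP4\t0.03\t0.005\n"
    "run2\tP1\t0.002\t0.001\n"
    "run2\tP2;P3\t0.008\t0.005\n"
)


def count_matrix_groups(tsv_text):
    """Return (total rows, {run: rows with a non-empty quantity})."""
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    runs = [c for c in reader.fieldnames if c not in ANNOTATION_COLUMNS]
    per_run = {run: 0 for run in runs}
    total = 0
    for row in reader:
        total += 1
        for run in runs:
            # Treat empty cells and "NA" as "not quantified in this run".
            if row[run] not in ("", "NA"):
                per_run[run] += 1
    return total, per_run


def count_report_groups(tsv_text, q_column, threshold=0.01):
    """Count distinct protein groups per run with q_column <= threshold."""
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    groups = {}
    for row in reader:
        if float(row[q_column]) <= threshold:
            groups.setdefault(row["Run"], set()).add(row["Protein.Group"])
    return {run: len(found) for run, found in groups.items()}


if __name__ == "__main__":
    print(count_matrix_groups(TOY_MATRIX))
    print(count_report_groups(TOY_REPORT, "PG.Q.Value"))
    print(count_report_groups(TOY_REPORT, "Global.PG.Q.Value"))
```

In the toy report, filtering on the run-specific column keeps fewer groups in run1 than filtering on the global column, which is the kind of effect that can make per-run counts and matrix counts rank two libraries differently. To use the functions on real files, read the file contents into a string first and pass that in.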