vdemichev / DiaNN

DIA-NN - a universal automated software suite for DIA proteomics data analysis.

Proteotypic peptides #1187

Open SiProt opened 1 month ago

SiProt commented 1 month ago

Dear Vadim, I analysed Thermo raw files with DIA-NN 1.8.1 and 1.9, then matched the generated pg_matrix against the pr_matrix to count the proteotypic peptides for each protein reported in the pg_matrix. In both versions, the pg_matrix contains protein groups identified without any proteotypic peptide, reported as e.g. ProtA;ProtB with a normalised intensity. I may be missing something, but in the report.log txt file I read, e.g., "Number of genes identified at 1% FDR: 5080 (precursor-level), 4313 (protein-level) (inference performed using proteotypic peptides only)", so I expected every protein group to be identified with at least one proteotypic peptide. Is this normal? Should I simply remove proteins without proteotypic peptides before further analysis? Thanks!

vdemichev commented 1 month ago

Hi,

You can use the unique_genes_matrix to obtain the quantities of genes quantified using only proteotypic peptides, or (recommended) just work from the main report in .parquet format.

Best, Vadim
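
For readers who want to try the main-report route suggested above, here is a minimal Python sketch of counting proteotypic precursors per protein group. It is not DIA-NN code. It assumes the main report has columns named Protein.Group, Precursor.Id, Proteotypic (1 for proteotypic, 0 otherwise) and Q.Value, so check the header of your own report first. The function names, the sample rows, and the 1% cutoff default are illustrative assumptions.

```python
from collections import defaultdict


def count_proteotypic_precursors(rows, q_cutoff=0.01):
    """Count distinct proteotypic precursors per protein group.

    `rows` is any iterable of dict-like records with the (assumed) columns
    Protein.Group, Precursor.Id, Proteotypic and Q.Value. With pandas, such
    records could be obtained via
    pandas.read_parquet("report.parquet").to_dict("records"),
    where the file name is a placeholder.
    """
    proteotypic = defaultdict(set)
    groups = set()
    for row in rows:
        # Skip precursors that fail the precursor-level q-value cutoff.
        if row["Q.Value"] > q_cutoff:
            continue
        group = row["Protein.Group"]
        groups.add(group)
        if int(row["Proteotypic"]) == 1:
            # A set removes duplicates of the same precursor across runs.
            proteotypic[group].add(row["Precursor.Id"])
    # Groups with only shared peptides are kept, with a count of 0.
    return {g: len(proteotypic.get(g, ())) for g in groups}


def groups_with_proteotypic_support(rows, min_precursors=1, q_cutoff=0.01):
    """Return protein groups backed by at least `min_precursors` proteotypic precursors."""
    counts = count_proteotypic_precursors(rows, q_cutoff)
    return sorted(g for g, n in counts.items() if n >= min_precursors)


# Made-up rows standing in for a real report.
example_rows = [
    {"Protein.Group": "ProtA", "Precursor.Id": "PEPTIDEK2", "Proteotypic": 1, "Q.Value": 0.001},
    {"Protein.Group": "ProtA", "Precursor.Id": "PEPTIDEK3", "Proteotypic": 1, "Q.Value": 0.002},
    {"Protein.Group": "ProtA;ProtB", "Precursor.Id": "SHAREDR2", "Proteotypic": 0, "Q.Value": 0.001},
    {"Protein.Group": "ProtC", "Precursor.Id": "LOWCONFK2", "Proteotypic": 1, "Q.Value": 0.2},
]

if __name__ == "__main__":
    print(count_proteotypic_precursors(example_rows))
    print(groups_with_proteotypic_support(example_rows))
```

In this toy input, ProtA;ProtB survives the q-value filter but has no proteotypic precursor, which mirrors the situation described in the question. Whether to drop such groups depends on the downstream analysis.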

SiProt commented 1 month ago

Many thanks!