vdemichev / DiaNN

DIA-NN - a universal automated software suite for DIA proteomics data analysis.

how to remove proteins identified by less than 2 peptides ? #319

Closed ihorrible closed 2 years ago

ihorrible commented 2 years ago

Hi everyone!

Is it somehow possible to remove the protein groups identified by fewer than 2 peptides? Or what is the algorithm in DIA-NN for this? I don't see any columns in "report.pg_matrix" or any parameters in the interface that could help to do this. The manual is also silent on this matter. For example, in MSFragger it looks like this (see attached file):

[Attached image: peptides identification]

vdemichev commented 2 years ago

After loading the pr_matrix in R:

pr <- diann_load('report.pr_matrix.tsv')

you can get the number of peptides per protein group with just two commands:

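# keep one row per (protein group, peptide) pair so charge states are not counted twice
df <- unique(pr[,c('Protein.Group','Stripped.Sequence')])
# count the distinct peptides per protein group (tabulate df, not pr)
t <- table(df$Protein.Group)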
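
The counts can then be used to drop protein groups with fewer than 2 peptides from the protein group matrix. Below is a minimal sketch in base R, not an official DIA-NN procedure. It uses small toy data frames in place of the real files so that it runs on its own, and it assumes that the pg_matrix has a Protein.Group column that matches the one in the pr_matrix. The names min_peptides, keep, pg and pg_filtered are made up for illustration.

```r
# Toy stand-in for the precursor matrix; with real data this would be
# the table loaded from report.pr_matrix.tsv.
pr <- data.frame(
  Protein.Group     = c("P1", "P1", "P1", "P2", "P3", "P3"),
  Stripped.Sequence = c("AAAK", "AAAK", "CCCR", "DDDK", "EEEK", "FFFR"),
  stringsAsFactors  = FALSE
)

# One row per (protein group, peptide) pair, so that several precursors
# of the same peptide are counted once.
df <- unique(pr[, c("Protein.Group", "Stripped.Sequence")])

# Number of distinct stripped sequences per protein group.
counts <- table(df$Protein.Group)

# Protein groups supported by at least min_peptides peptides
# (min_peptides is a hypothetical name for the threshold).
min_peptides <- 2
keep <- names(counts)[counts >= min_peptides]

# Toy stand-in for report.pg_matrix.tsv (assumed to have a Protein.Group column).
pg <- data.frame(
  Protein.Group    = c("P1", "P2", "P3"),
  sample_A         = c(10.5, 3.2, 7.8),
  stringsAsFactors = FALSE
)

# Keep only the rows whose protein group passed the peptide-count filter.
pg_filtered <- pg[pg$Protein.Group %in% keep, ]
print(pg_filtered)
```

Counting on the de-duplicated df rather than on pr matters here: pr has one row per precursor, so tabulating it directly would count precursors, not peptides.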