vdemichev / DiaNN

DIA-NN - a universal automated software suite for DIA proteomics data analysis.

Issues recreating the automatically generated report.pg_matrix (unsure of filters used) #1273

Open ore-sol opened 1 day ago

ore-sol commented 1 day ago

I have a search that was done with MBR on and standard MaxLFQ normalization. When I run the following code in R to recreate the report.pg_matrix, I end up with more protein group IDs than in the original, automatically generated matrix.

prot_maxLFQ <- df %>% dplyr::filter(Lib.PG.Q.Value <= 0.05 & Lib.Q.Value <= 0.05) %>% diann_matrix(id.header = "Protein.Group", quantity.header = "PG.MaxLFQ", pg.q = 0.05)

However, when I set the above cutoff values to 0.01, I end up with fewer IDs than in the automatically generated pg_matrix.

When I use these global filters instead, I again end up with fewer IDs:

prot_maxLFQ <- df %>% dplyr::filter(Global.PG.Q.Value <= 0.05 & Global.Q.Value <= 0.05) %>% diann_matrix(id.header = "Protein.Group", quantity.header = "PG.MaxLFQ", pg.q = 0.05)

(It might be helpful to mention that I don't have this issue with recreating the PR matrix.) What filters exactly are being used to generate the report.pg_matrix? Thank you!

vdemichev commented 11 hours ago

Hi,

The filters used to generate the matrices are described here: https://github.com/vdemichev/DiaNN?tab=readme-ov-file#output (for 1.9.2).

Best, Vadim
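
To narrow down which filter accounts for the difference in ID counts, one option is to compare the two sets of protein group IDs directly and look at the q-values of the IDs that appear in only one matrix. The following is a minimal sketch in base R, not the method DIA-NN itself uses. The IDs and q-values are made-up toy data. It assumes that in practice `auto_ids` would come from the Protein.Group column of the generated pg_matrix file, and `recreated_ids` from the row names of the matrix returned by diann_matrix().

```r
# Toy stand-ins (hypothetical IDs). In practice, auto_ids is assumed to come
# from the Protein.Group column of the generated pg_matrix file, and
# recreated_ids from the row names of the diann_matrix() result.
auto_ids      <- c("P001", "P002", "P003")
recreated_ids <- c("P002", "P003", "P004", "P005")

# Protein groups present in only one of the two matrices.
only_in_recreated <- setdiff(recreated_ids, auto_ids)
only_in_auto      <- setdiff(auto_ids, recreated_ids)

# Toy long-format report with two of the q-value columns named in the question.
df <- data.frame(
  Protein.Group     = c("P002", "P004", "P005"),
  Global.PG.Q.Value = c(0.001, 0.030, 0.008),
  Lib.PG.Q.Value    = c(0.002, 0.040, 0.020)
)

# Q-values of the extra protein groups; values above a candidate cutoff
# point to the filter that would have removed them.
extra <- df[df$Protein.Group %in% only_in_recreated, ]
print(extra)
```

Checking the q-value columns of the mismatching IDs against the thresholds listed in the README should show which column and cutoff the manual filter is missing.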