Open xiaoxHuang opened 3 months ago
Hi,
The docs now describe 1.9.1, so they don't match the output of 1.8.1.
I want to know which value or values are used as filters to get the pr.matrix?
For 1.8.1 with MBR: df <- df[df$Q.Value <= 0.01 & df$Lib.Q.Value <= 0.01,]. Without MBR, replace Lib with Global.
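For readers working in Python rather than R, the filter above can be sketched roughly as follows. This is only an illustration, not DIA-NN's own code: the helper name filter_and_pivot is hypothetical, the Run, Precursor.Id and Precursor.Normalised column names are assumptions about the main report, and the tiny report is made-up data. Without MBR, Lib.Q.Value in the cutoffs would be swapped for Global.Q.Value, as stated above.

```python
import csv
import io


def filter_and_pivot(rows, cutoffs, id_col="Precursor.Id", run_col="Run",
                     value_col="Precursor.Normalised"):
    """Keep rows that pass every q-value cutoff, then pivot to {id: {run: value}}.

    The default column names are assumptions about the main report layout.
    """
    matrix = {}
    for row in rows:
        # A row survives only if all listed q-value columns are within their cutoff.
        if all(float(row[col]) <= cutoff for col, cutoff in cutoffs.items()):
            matrix.setdefault(row[id_col], {})[row[run_col]] = float(row[value_col])
    return matrix


# Tiny made-up report; the numbers are illustrative only.
report_tsv = (
    "Run\tPrecursor.Id\tQ.Value\tLib.Q.Value\tPrecursor.Normalised\n"
    "run1\tPEPTIDEK2\t0.001\t0.002\t1500.0\n"
    "run2\tPEPTIDEK2\t0.020\t0.002\t1400.0\n"
    "run1\tSAMPLER3\t0.005\t0.030\t900.0\n"
)
rows = list(csv.DictReader(io.StringIO(report_tsv), delimiter="\t"))

# Cutoffs for 1.8.1 with MBR, as given in the reply above.
mbr_cutoffs = {"Q.Value": 0.01, "Lib.Q.Value": 0.01}
pr_matrix = filter_and_pivot(rows, mbr_cutoffs)
print(pr_matrix)  # {'PEPTIDEK2': {'run1': 1500.0}}
```

Passing the cutoffs as a dictionary keeps the sketch easy to adapt: a further column and threshold pair (for example a protein-group q-value) can be added without changing the function.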
Sometimes I cannot reproduce the results.
This is the most popular question here :) If you wish, I could take a look at the data (I need the full logs and the file name of the matrix you are looking at), but there's no practical reason why you'd want to reproduce the matrix. If you work in R or Python, the advice is to never use matrices.
Best, Vadim
Hi,
Thanks for your reply.
If you work in R or Python, the advice is to never use matrices.
I want to reproduce pr.matrix.tsv to make sure that I get results as reliable as DIA-NN's own (since DIA-NN outputs pr.matrix.tsv).
You recommend not using matrices; does that mean the main output without filtering can be used as the final result?
Thanks!
Best regards
You recommend not using matrices; does that mean the main output without filtering can be used as the final result?
Please do filter, see https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5581544/ for basics on how to filter data. Please also see the "How to choose the FDR/q-value threshold?" section of https://github.com/vdemichev/DiaNN?tab=readme-ov-file#frequently-asked-questions and https://github.com/vdemichev/DiaNN?tab=readme-ov-file#match-between-runs.
I guess the DIA-NN docs are missing a dedicated section on output filtering. I will add one.
Best, Vadim
Hi Vadim
Thanks for your work!
I used diann-1.8.1 on Linux. I want to reproduce pr.matrix.tsv from the main output. As you mentioned here:
using global q-values for protein groups and both global and run-specific q-values for precursors
But the main output contains Q.Value, PG.Q.Value, Global.Q.Value, Global.PG.Q.Value, Lib.Q.Value and Lib.PG.Q.Value. I want to know which value or values are used as filters to get the pr.matrix? On the command line, I tried both --qvalue 0.01 --matrix-qvalue 0.01 and --qvalue 0.03 --matrix-qvalue 0.03, then built the pr.matrix in Python to compare it with the one saved by DIA-NN. Sometimes I cannot reproduce the results. You also mentioned that
All the 'matrices' can be reproduced from the main .parquet report, if generated with precursor FDR set to 5%, using R or Python.
So I wonder: is '--qvalue 0.05' required for users to reproduce the matrices in R or Python?
Thanks!
Best wishes
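When a reproduced matrix only sometimes matches the one written by DIA-NN, it can help to list exactly which cells differ before guessing at the filter. The following is a minimal sketch under stated assumptions: both matrices are already loaded as nested dictionaries of the form {precursor id: {run: quantity}}, the helper name diff_matrices is hypothetical, and the values are made up.

```python
import math


def diff_matrices(mine, reference, rel_tol=1e-6):
    """List (id, run, mine_value, reference_value) for cells that disagree.

    A cell disagrees if it is present on only one side or if the two
    quantities differ by more than the relative tolerance.
    """
    differences = []
    for pid in sorted(set(mine) | set(reference)):
        mine_runs = mine.get(pid, {})
        ref_runs = reference.get(pid, {})
        for run in sorted(set(mine_runs) | set(ref_runs)):
            a = mine_runs.get(run)
            b = ref_runs.get(run)
            if a is None or b is None:
                differences.append((pid, run, a, b))
            elif not math.isclose(a, b, rel_tol=rel_tol):
                differences.append((pid, run, a, b))
    return differences


# Made-up example: one cell missing on each side.
mine = {"PEPTIDEK2": {"run1": 1500.0}, "SAMPLER3": {"run1": 900.0}}
reference = {"PEPTIDEK2": {"run1": 1500.0, "run2": 1400.0}}

for cell in diff_matrices(mine, reference):
    print(cell)
# ('PEPTIDEK2', 'run2', None, 1400.0)
# ('SAMPLER3', 'run1', 900.0, None)
```

Cells present only in the reference suggest the reproduction filters too strictly; cells present only in the reproduction suggest a q-value column is missing from the filter.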