PNNL-Comp-Mass-Spec / Informed-Proteomics

Top down / bottom up, MS/MS analysis tool for DDA and DIA mass spectrometry data

FDR filtering #32

Open WinkelsK opened 2 years ago

WinkelsK commented 2 years ago

Hi all, I have a question about filtering MSPathFinder results after a target/decoy search. The PrSMs in the *lcTda.tsv file are probably not filtered to a particular FDR, so I have to do that afterwards. I am now wondering which value I should use: what's the difference between QValue and PepQValue? And if I want a final FDR of 1%, do I just exclude all PrSMs with a value higher than 0.01? Thanks a lot! Cheers, Konrad

dtabb73 commented 2 years ago

Hi, Konrad. Because I routinely run MSPathFinderT with only target sequences, I filter PrSMs to require e-values below 0.01 (a similar rule exists in TopPIC) and Probability scores above 0.5. It sounds as though you are running your searches on both target and decoy sequences, so you are in better shape for hitting a particular TDA-estimated FDR. Yes, you can probably just retain the PrSMs with QValues below some maximum to achieve that FDR. I believe PepQValue differs from QValue in that it accounts for some proteoforms matching multiple spectra; if you're computing FDR by estimated erroneous proteoforms rather than PrSMs, you could go that route instead. Good luck! Dave
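
For illustration, the q-value filtering described above could be sketched in Python roughly as follows. This is only a sketch under assumptions: the column names QValue and PepQValue come from this thread, while the Scan and Sequence columns, the sample rows, and the helper names read_tsv and filter_by_qvalue are made up for the example and are not part of MSPathFinder. It also assumes the cutoff is inclusive (values equal to 0.01 are kept).

```python
import csv
import io


def read_tsv(handle):
    """Read a tab-separated file with a header row into a list of dicts."""
    return list(csv.DictReader(handle, delimiter="\t"))


def filter_by_qvalue(rows, column="QValue", max_q=0.01):
    """Keep rows whose value in `column` is at or below `max_q`.

    Rows where the column is missing or not numeric are skipped.
    """
    kept = []
    for row in rows:
        try:
            q = float(row[column])
        except (KeyError, TypeError, ValueError):
            continue
        if q <= max_q:
            kept.append(row)
    return kept


# Made-up sample data standing in for a target/decoy result file.
# Only the QValue and PepQValue column names are taken from the thread.
SAMPLE_TSV = (
    "Scan\tSequence\tQValue\tPepQValue\n"
    "101\tSEQA\t0.000\t0.000\n"
    "102\tSEQB\t0.004\t0.012\n"
    "103\tSEQC\t0.020\t0.030\n"
)

rows = read_tsv(io.StringIO(SAMPLE_TSV))

# Filter at 1% using the PrSM-level column.
prsm_level = filter_by_qvalue(rows, column="QValue", max_q=0.01)

# Filter at 1% using the proteoform-level column instead.
proteoform_level = filter_by_qvalue(rows, column="PepQValue", max_q=0.01)

print([r["Scan"] for r in prsm_level])
print([r["Scan"] for r in proteoform_level])
```

For a real result file, `read_tsv(open(path, newline=""))` would replace the in-memory sample. Choosing between the two columns then comes down to whether the 1% target is meant per PrSM or per proteoform.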

WinkelsK commented 2 years ago

Hi Dave, thanks for the insights! The way you describe it makes total sense :) I'll try both QValue and PepQValue as quality-control filters and see which one works better! Thanks again! Best, Konrad