compomics / peptide-shaker

Interpretation of proteomics identification results
http://compomics.github.io/projects/peptide-shaker.html

How to exclude doubtful PSMs before export to mzid #146

Closed: XsirdanielX closed 8 years ago

XsirdanielX commented 8 years ago

This might be a really basic question, but how can I exclude doubtful PSMs before I export the results to mzid? Thanks for your help,

Daniel

mvaudel commented 8 years ago

Hi Daniel,

Sorry, this is not possible at the moment. mzid is designed to contain the results of an experiment as comprehensively as possible; it is up to the tools working on these files to select the matches of interest. This is particularly important when sharing results for publication, where you want to provide all data transparently, in an unbiased fashion, and without threshold effects.

Imagine that 10 years from now you realize that you were the first to find a biomarker, but it did not pass the FDR threshold at the time of analysis due to the lower performance of the instrument or software. You will be able to document that you were the first to find it because it is still in the mzid file, even though it did not make it into the paper. If you filter out data arbitrarily, it is gone forever and you cannot document your discovery.

Note that the confidence levels and FDR validation statuses are indicated for every match in the mzid file according to the specifications of the format, and we are collaborating with the PSI to improve this annotation, notably for PTM scores. It is thus straightforward for a tool working on these files to filter out non-confident matches.
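As a rough illustration of such downstream filtering, here is a minimal Python sketch. It assumes that the validation status of each match is reflected in the `passThreshold` attribute that mzIdentML defines on `SpectrumIdentificationItem`; how a given PeptideShaker export maps doubtful matches onto this attribute or onto additional annotations should be checked against the actual file. The function name and the sample fragment are made up for this example.

```python
import xml.etree.ElementTree as ET


def local_name(tag):
    """Strip the '{namespace}' prefix that ElementTree puts on tag names."""
    return tag.rsplit("}", 1)[-1]


def confident_psm_ids(mzid_xml):
    """Return the ids of SpectrumIdentificationItem elements with passThreshold="true".

    Hypothetical helper: it assumes passThreshold carries the validation status.
    """
    root = ET.fromstring(mzid_xml)
    kept = []
    for element in root.iter():
        if local_name(element.tag) != "SpectrumIdentificationItem":
            continue
        if element.get("passThreshold", "").lower() == "true":
            kept.append(element.get("id"))
    return kept


# Tiny hand-written fragment for illustration, not real PeptideShaker output.
SAMPLE = """<MzIdentML xmlns="http://psidev.info/psi/pi/mzIdentML/1.1">
  <SpectrumIdentificationResult id="SIR_1" spectrumID="index=0">
    <SpectrumIdentificationItem id="SII_1" passThreshold="true"/>
    <SpectrumIdentificationItem id="SII_2" passThreshold="false"/>
  </SpectrumIdentificationResult>
</MzIdentML>"""

print(confident_psm_ids(SAMPLE))
```

The original file stays untouched, so the full set of matches remains available; only the downstream view is filtered.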

We are working on filtered exports as a workaround for tools which do not support this. However, the criteria for filtering depend heavily on the downstream application, so we cannot anticipate all use cases. We have tried to crowdsource ideas on how to filter PSMs, e.g. for targeted quantification, but only one person showed interest, and with our limited workforce we never had the time to implement it. You are very welcome to contribute :) https://groups.google.com/forum/#!topic/peptide-shaker/4nuDGQM4vt4

Hope this answers your question, feel free to write again in case anything is unclear,

Marc

XsirdanielX commented 8 years ago

Hi Marc,

Thanks a lot for your answer and also for creating this great tool, which I think is very useful.

We recently ran into a problem because of non-valid PSMs in our mzid files on PRIDE: a referee mistrusted our data analysis because of several apparently wrong PSMs, which were marked as not valid/doubtful in PeptideShaker. After manual validation of our data, these matches were of course excluded and are not present in the results of the manuscript, but they nevertheless led to the rejection of the paper. As MCP requires submission of the MS data to ProteomeXchange, what do you think would be the appropriate way to deal with this, both regarding the rejection and for future submissions?

Best regards and thanks again,

Daniel

mvaudel commented 8 years ago

Hi Daniel,

I was very sorry to read this; it is completely unacceptable in my eyes. You seem to have been very unlucky with this reviewer. If your manuscript was rejected on the basis that your file contains low-quality hits that were marked as non-validated and not used in your results, you should definitely appeal this decision. If you send me more information by email, I will be able to justify that you did everything in accordance with the good practices of the field.

There is not much I can advise for future submissions, as submitting all data to ProteomeXchange is the state of the art in proteomics. On my side, after a few issues of this kind I stopped sending manuscripts to MCP, but this is only a personal choice.

Best regards and sorry again,

Marc

hbarsnes commented 8 years ago

Hi Daniel,

As this is not really an issue with PeptideShaker we will now close the issue and rather continue the discussion via e-mail.

Best regards, Harald