lazear / sage

Proteomics search & quantification so fast that it feels like magic
https://sage-docs.vercel.app
MIT License

Protein inference #67

Closed by hannes-rezo 1 year ago

hannes-rezo commented 1 year ago

Hi,

Does Sage include an option for protein inference from the PSMs? If not, are the output files (either .tsv or .pin) compatible with any good tools that can infer proteins from the PSMs, or do you have a preferred way of doing this?

Many thanks for developing Sage - it's a great tool!

lazear commented 1 year ago

Adding protein inference is on the roadmap (likely via parsimony, i.e. a greedy set cover on the peptide-protein bipartite graph), but I have held off so far because there isn't really a consensus in the field on the "best" way to do it. I need to take some time to examine what people are currently doing.

For most quantitative applications I generally use only the information associated with unique peptides, so I tend to skip protein inference completely and rely on the protein-group-level FDRs reported by Sage or Percolator/mokapot.

That said, I imagine the .pin or .tsv files contain enough information to plug into almost any existing protein inference tool. Rolling your own bipartite graph reduction also isn't too bad!
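To give a rough idea of what such a reduction could look like, here is a minimal sketch of greedy set cover in Python. It is not Sage's implementation (Sage does not do this yet), and it assumes you have already parsed the search output into a mapping from each confidently identified peptide to the proteins it could come from. The function name `greedy_protein_inference` and the example peptides and protein IDs are made up for illustration.

```python
from collections import defaultdict


def greedy_protein_inference(peptide_to_proteins):
    """Greedy set cover over a peptide-protein bipartite graph.

    peptide_to_proteins: dict mapping a peptide -> iterable of protein IDs
        (hypothetical input; build it from whatever columns your output has).
    Returns a dict mapping each selected protein group (a sorted tuple of
    indistinguishable protein IDs) -> the set of peptides it newly explains.
    """
    # Invert the mapping: protein -> set of peptides that map to it.
    protein_to_peptides = defaultdict(set)
    for peptide, proteins in peptide_to_proteins.items():
        for protein in proteins:
            protein_to_peptides[protein].add(peptide)

    # Proteins with identical peptide sets cannot be told apart,
    # so collapse them into a single protein group.
    grouped = defaultdict(list)
    for protein, peptides in protein_to_peptides.items():
        grouped[frozenset(peptides)].append(protein)
    candidates = {
        tuple(sorted(members)): set(peptides)
        for peptides, members in grouped.items()
    }

    # Only peptides attached to at least one protein can ever be covered.
    uncovered = set().union(*candidates.values()) if candidates else set()

    selected = {}
    while uncovered:
        # Pick the group explaining the most still-uncovered peptides;
        # ties are broken by group name so the result is deterministic.
        best = min(
            candidates,
            key=lambda group: (-len(candidates[group] & uncovered), group),
        )
        newly_covered = candidates[best] & uncovered
        selected[best] = newly_covered
        uncovered -= newly_covered

    return selected


if __name__ == "__main__":
    # Toy example with made-up peptides and protein IDs.
    example = {
        "PEPA": ["P1"],
        "PEPB": ["P1", "P2"],
        "PEPC": ["P2", "P3"],
        "PEPD": ["P3"],
        "PEPE": ["P4", "P5"],
    }
    for group, peptides in greedy_protein_inference(example).items():
        print(group, sorted(peptides))
```

In the toy example, P2 is dropped because all of its peptides are already explained by P1 and P3, while P4 and P5 are reported together as one group because they share exactly the same peptides. Greedy set cover is only an approximation of the minimal protein list, and a real pipeline would still need to decide how to handle shared peptides for quantification and how to control FDR at the protein group level.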

hannes-rezo commented 1 year ago

Thank you - that's helpful!