HUPO-PSI / mzTab

mzTab Reporting MS-based Proteomics and Metabolomics Results
https://hupo-psi.github.io/mzTab

Peptide to protein mapping in quant files #32

Open andrewrobertjones opened 6 years ago

andrewrobertjones commented 6 years ago

If a peptide can be mapped to multiple proteins, the 1.0 spec recommends duplicating the row and changing only the accession. I have a strong preference to change this so that multiple accessions can be listed in a single row, separated by semicolons (or another secondary separator).

Otherwise this can cause problems for stats, visualisation, or other software that wants to work with the quant data: logic to work out the duplicates would need to be encoded in that software.
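To illustrate the problem, here is a minimal sketch of the de-duplication logic a downstream tool would need under the row-duplication convention. The dict keys (`sequence`, `charge`, `accession`, `abundance`) are simplified stand-ins chosen for this example, not the exact mzTab column names, and `collapse_duplicate_rows` is a hypothetical helper, not part of any mzTab library.

```python
from collections import OrderedDict


def collapse_duplicate_rows(rows, key_fields=("sequence", "charge"), sep=";"):
    """Merge rows that differ only in 'accession' into a single row.

    Assumption: rows sharing the same key_fields are duplicates of one
    peptide measurement, reported once per mapped protein.
    """
    merged = OrderedDict()
    for row in rows:
        key = tuple(row[f] for f in key_fields)
        if key not in merged:
            # First occurrence: copy the row, hold accessions in a list.
            merged[key] = dict(row, accession=[row["accession"]])
        elif row["accession"] not in merged[key]["accession"]:
            merged[key]["accession"].append(row["accession"])
    # Join the collected accessions with the secondary separator.
    return [dict(r, accession=sep.join(r["accession"])) for r in merged.values()]


rows = [
    {"sequence": "PEPTIDEK", "charge": 2, "accession": "P1", "abundance": 10.0},
    {"sequence": "PEPTIDEK", "charge": 2, "accession": "P2", "abundance": 10.0},
    {"sequence": "OTHERK", "charge": 2, "accession": "P1", "abundance": 5.0},
]

collapsed = collapse_duplicate_rows(rows)
naive_total = sum(r["abundance"] for r in rows)          # counts PEPTIDEK twice
collapsed_total = sum(r["abundance"] for r in collapsed)  # counts it once
```

With duplicated rows, a naive sum over abundances counts the shared peptide once per protein; with one row per peptide and `P1;P2` in the accession field, no such logic is needed.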

timosachsenberg commented 6 years ago

+1

ypriverol commented 6 years ago

@andrewrobertjones @timosachsenberg is this the same peptide mapping to different anchor proteins?

andrewrobertjones commented 5 years ago

Just looking at this again now (while at PSI 2019 and thinking about mzTab dev), I think the peptide-to-protein link generally needs improving in mzTab for quant workflows. The current specs work only if the peptides reported are those supporting quantification, i.e. if shared/conflicted peptides exist, they are not reported at all. Otherwise it is impossible to infer which peptides were used to quantify which protein. Even with the current encoding, it would not be possible to figure out how many peptides are shared with ambiguity group members unless they are duplicated across multiple rows.
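As a rough sketch of what the semicolon-separated encoding would allow, the snippet below groups peptides per protein and marks each as unique or shared. The keys `sequence` and `accession` and the helper `peptides_per_protein` are assumptions made for this example. Note that this only recovers sharing; it still cannot tell which peptides were actually used to quantify a protein, which would need extra information in the file.

```python
from collections import defaultdict


def peptides_per_protein(rows, sep=";"):
    """Map each protein accession to its unique and shared peptides.

    Assumption: each peptide appears in exactly one row, with all mapped
    accessions listed in 'accession' and separated by `sep`.
    """
    mapping = defaultdict(lambda: {"unique": [], "shared": []})
    for row in rows:
        accessions = [a for a in row["accession"].split(sep) if a]
        kind = "unique" if len(accessions) == 1 else "shared"
        for acc in accessions:
            mapping[acc][kind].append(row["sequence"])
    return dict(mapping)


example_rows = [
    {"sequence": "AAAK", "accession": "P1;P2"},
    {"sequence": "BBBK", "accession": "P1"},
]
by_protein = peptides_per_protein(example_rows)
```

Here `by_protein["P2"]` has no unique peptides, so a tool could at least see that P2 is supported only by a peptide shared with P1.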