bigbio / proteomics-sample-metadata

The Proteomics sample metadata: Standard for experimental design annotation in proteomics datasets
GNU General Public License v2.0

About annotation for peptide mixtures #452

Open daichengxin opened 3 years ago

daichengxin commented 3 years ago

Hi all, in the "LC-MS Analysis of Isoaspartylated Peptides" part of the PXD006112 project, the sample is a mixture of 12 peptides, so organism, organism part, etc. will not be applicable, and some unique information such as peptide quality may need to be added. Maybe we can borrow the key-value method used for spiked-in experiments.


daichengxin commented 3 years ago

| Source Name | Peptide | Peptide | data file |
| --- | --- | --- | --- |
| Sample1 | CT=peptide;Mass=976.448;PS=AGFAGDDAPR | CT=peptide;Mass=1265.612;PS=DGNGYISAAELR | 20140919_QE6+_LC5_CY_SA_IsoAsp_40min_12mix_01 |

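
To illustrate how such key-value cells could be consumed, here is a minimal Python sketch that splits one cell into its fields. The helper name `parse_annotation` is hypothetical and not part of any SDRF tooling; the keys (CT, Mass, PS) are simply those used in the example row.

```python
def parse_annotation(cell: str) -> dict[str, str]:
    """Split a 'KEY=value;KEY=value' cell into a dictionary (hypothetical helper)."""
    pairs = {}
    for item in cell.split(";"):
        item = item.strip()
        if not item:
            continue  # tolerate a trailing semicolon
        key, sep, value = item.partition("=")
        if not sep:
            raise ValueError(f"missing '=' in {item!r}")
        pairs[key.strip()] = value.strip()
    return pairs


# Cell copied from the example row above.
cell = "CT=peptide;Mass=976.448;PS=AGFAGDDAPR"
fields = parse_annotation(cell)
print(fields)
```

Values are kept as strings here; a consumer that needs the mass as a number would convert it explicitly.
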
levitsky commented 3 years ago

Hi, could you please specify the question or suggestion?

I see you've added peptide molecular masses in the annotation and are talking about other keys, but in the example table only masses are listed. In this case, however, the mass can be trivially calculated from the sequence, so I'm not sure I understand the benefit of adding it.
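
As a rough sketch of that calculation: the Python below sums standard monoisotopic residue masses and adds one water. It is an illustration only, with hypothetical function names, and it assumes unmodified peptides. Under these assumptions, the values in the example table look consistent with singly protonated [M+H]+ masses rather than neutral masses.

```python
# Standard monoisotopic residue masses (Da) for the 20 common amino acids.
MONO = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.010565   # H2O added for the free N- and C-termini
PROTON = 1.007276   # mass of a proton, for the [M+H]+ ion


def neutral_mass(sequence: str) -> float:
    """Monoisotopic mass of an unmodified peptide (hypothetical helper)."""
    return sum(MONO[residue] for residue in sequence) + WATER


def mh_plus(sequence: str) -> float:
    """Singly protonated monoisotopic mass, [M+H]+."""
    return neutral_mass(sequence) + PROTON


for seq in ("AGFAGDDAPR", "DGNGYISAAELR"):
    print(seq, round(neutral_mass(seq), 3), round(mh_plus(seq), 3))
```

Note that isoaspartate is an isomer of aspartate, so a mass value alone cannot distinguish the two forms; that information would have to be carried some other way.
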

daichengxin commented 3 years ago

Sorry. First of all, the experiment uses a mixture of 12 synthetic peptides, so organism, organism part, cell type, etc. will no longer be applicable, and I am not sure whether the file can pass sdrf-validation. Second, in peptide 11 and peptide 12 the isoaspartic acid residues are underlined, and I am not sure whether using underscores in the annotation is a good way to express the isomer relationship of the amino acids. Finally, I think both the Compound vendor (CV) and Compound specification URI keys should be applicable in this case. For this article, CV=Thermo Scientific.

Thanks!!

levitsky commented 3 years ago

Ah, sorry! I was thinking you were asking about the annotation scheme for spiked compounds, but if in this case the sample itself is a relatively simple peptide mixture, then it's a different story.

We have some examples with synthetic proteomes already annotated. Organism and some other things are still mandatory, but can be set to "not applicable". There is also characteristics[synthetic peptide] that should be added. Arguably this scheme applies to your case.
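
A minimal sketch of what such a row might look like, written as tab-separated text with Python's `csv` module. The exact column set and the value `synthetic` for `characteristics[synthetic peptide]` are assumptions made for illustration and should be checked against the specification and the existing annotated examples; the data file name is copied from the table earlier in this thread.

```python
import csv
import io

# Column names follow the SDRF style discussed above; the selection is an assumption.
columns = [
    "source name",
    "characteristics[organism]",
    "characteristics[organism part]",
    "characteristics[synthetic peptide]",
    "comment[data file]",
]

row = {
    "source name": "Sample1",
    "characteristics[organism]": "not applicable",
    "characteristics[organism part]": "not applicable",
    "characteristics[synthetic peptide]": "synthetic",  # assumed value
    "comment[data file]": "20140919_QE6+_LC5_CY_SA_IsoAsp_40min_12mix_01",
}

# Write header and row as tab-separated text into an in-memory buffer.
buffer = io.StringIO()
writer = csv.DictWriter(buffer, fieldnames=columns, delimiter="\t", lineterminator="\n")
writer.writeheader()
writer.writerow(row)

sdrf_text = buffer.getvalue()
print(sdrf_text)
```
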

Exhaustively annotating every component of the sample might be too much. Perhaps a comment with some specification URI is all that's needed here. Twelve peptides per file, including non-standard amino acids, seem like a lot of work to fit into the format. Do you think it's worth it?