HUPO-PSI / ProForma

HUPO-PSI Standardized peptidoform notation

Request specification clarification on sequence truncations #4

Open dtabb73 opened 3 years ago

dtabb73 commented 3 years ago

I would like to see a paragraph in the specification indicating how proteoform sequence truncations are to be specified. N-terminal truncations may be biological, as in the removal of the initial Met (perhaps with PTM), the cleavage of a signal peptide, or the action of a viral protease. The truncations may instead be related to sample treatment, such as a rare cutter like CNBr for middle-down proteomics, or be due to a "hot" ion source. I believe ProForma should specify how a proteoform sequence compares to the sequence described by the accession, such as by indicating the position of the first and last amino acids in the accession's sequence. Are amino acids preceding and succeeding the proteoform sequence expected to be included?
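To make the request concrete, here is a minimal sketch of the positional information in question: the 1-based positions of the first and last residues of a truncated proteoform within the accession's full sequence. The function name `locate_proteoform` and the example sequences are made up for illustration; nothing here is ProForma syntax, and it assumes bare, unmodified sequences and a single exact match.

```python
from typing import Optional, Tuple


def locate_proteoform(parent: str, proteoform: str) -> Optional[Tuple[int, int]]:
    """Return 1-based (first, last) positions of `proteoform` within `parent`.

    Hypothetical helper: only the first exact match is reported, and
    None is returned when the proteoform is not a substring of the parent.
    """
    idx = parent.find(proteoform)
    if idx < 0:
        return None
    # Convert the 0-based start index to 1-based inclusive coordinates.
    return idx + 1, idx + len(proteoform)


# Made-up accession sequence and a proteoform lacking the initial Met
# and the last two C-terminal residues.
parent_sequence = "MKTAYIAKQR"
observed = "KTAYIAK"
print(locate_proteoform(parent_sequence, observed))  # (2, 8)
```

A start position greater than 1 would indicate an N-terminal truncation, and an end position shorter than the parent length a C-terminal one.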

javizca commented 3 years ago

In my view, this is "metadata" information on top of the actual protein sequence. In the current version of the specification, we decided to handle those issues using the INFO tag, providing the metadata there as free text.

Standardising every single annotation at this point is unfeasible in my view.

edeutsch commented 3 years ago

I think all this is beyond the scope of ProForma 2.0. ProForma is designed to describe the molecule that (someone claims) yielded a spectrum. Information about:

Basically, ProForma is about what you think you have observed, not about what you infer about the context of that observation (with a minor exception: for I/L and a few other isobaric ambiguities, ProForma allows the user to express one alternative, but it is implied that isobaric alternatives are possible).
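The I/L case can be illustrated with a small sketch. Leucine and isoleucine have the same elemental composition and therefore the same mass, so two sequences that differ only by I/L swaps cannot be told apart by mass alone. The helper names below are hypothetical, and the sketch assumes bare sequences without modifications and covers only I/L, not the other isobaric cases mentioned.

```python
def isobaric_key(sequence: str) -> str:
    """Collapse I and L into one symbol so I/L variants compare equal."""
    return sequence.upper().replace("I", "L")


def isobaric_equivalent(seq_a: str, seq_b: str) -> bool:
    """True when two bare sequences differ at most by I/L substitutions."""
    return isobaric_key(seq_a) == isobaric_key(seq_b)


print(isobaric_equivalent("ELVIS", "ELVLS"))  # True
print(isobaric_equivalent("ELVIS", "ELVKS"))  # False
```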