topdownproteomics / ProteoformNomenclatureStandard

ProForma, a Proteoform Notation Standard
https://topdownproteomics.github.io/ProteoformNomenclatureStandard/

Global modifications (modifications for a specific AA over the whole sequence) #21

Open sgibb opened 7 years ago

sgibb commented 7 years ago

In MS proteomics, iodoacetamide is often used to alkylate cysteines after the disulfide bonds have been reduced, so that the bonds cannot re-form. This yields carbamidomethylated cysteine residues (Unimod:4).

Currently you have to annotate the sequence as follows:

[Unimod]+C[4]SEQUENC[4]EC[4]

Writing this by hand is tedious and error-prone for long sequences (although a computer could do it easily with regular expressions).
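As a rough illustration of the regular-expression route, here is a minimal Python sketch. The helper name `annotate_residue` is made up for this example and is not part of any ProForma tooling; it simply appends the accession in brackets after every occurrence of one residue.

```python
import re


def annotate_residue(sequence: str, residue: str, accession: str) -> str:
    """Append [accession] after every occurrence of `residue` in `sequence`.

    Hypothetical helper for illustration only; it assumes a plain,
    unannotated one-letter amino acid sequence as input.
    """
    return re.sub(
        re.escape(residue),
        lambda m: f"{m.group(0)}[{accession}]",
        sequence,
    )


# Reproduces the hand-written annotation above.
print("[Unimod]+" + annotate_residue("CSEQUENCEC", "C", "4"))
```

This only covers the machine side; it does not help a human who has to read or type the fully annotated string.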

It would be nice if there were a prefix tag for global modifications as well. As a basis for discussion, I would suggest the > operator:

[Unimod]+C[4]>CSEQUENCEC
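To make the proposal concrete, the following Python sketch expands the suggested prefix form into the per-residue form that is valid today. It assumes the syntax is exactly as written above (one vocabulary tag, one residue, one accession, then `>` and a plain sequence); the pattern `GLOBAL_PREFIX` and the function `expand_global_modification` are hypothetical names, and the `>` operator itself is only a suggestion in this issue, not part of the standard.

```python
import re

# Assumed shape of the proposed notation: [Vocab]+X[acc]>SEQUENCE
GLOBAL_PREFIX = re.compile(
    r"^(?P<vocab>\[[^\]]+\]\+)"   # vocabulary tag, e.g. [Unimod]+
    r"(?P<residue>[A-Z])"         # residue the modification applies to
    r"\[(?P<mod>[^\]]+)\]"        # modification accession, e.g. [4]
    r">(?P<seq>[A-Z]+)$"          # plain sequence after the > operator
)


def expand_global_modification(notation: str) -> str:
    """Rewrite the proposed prefix form as explicit per-residue tags.

    Strings that do not match the assumed prefix form are returned unchanged.
    """
    match = GLOBAL_PREFIX.match(notation)
    if match is None:
        return notation
    residue, mod = match["residue"], match["mod"]
    expanded = match["seq"].replace(residue, f"{residue}[{mod}]")
    return match["vocab"] + expanded


print(expand_global_modification("[Unimod]+C[4]>CSEQUENCEC"))
```

A converter like this would let tools accept the short form while still storing the explicit annotation.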

Or would something like this hurt human readability?

stefanks commented 7 years ago

The use of iodoacetamide implies that we are not discussing intact proteoforms as they exist in live cells, but rather the observed ones that may have sample handling artifacts.

I would argue that since this is not an issue for proteoforms in their natural form, there is no need for a special shorthand notation for "global" modifications.

More thoughts, anyone?