Nesvilab / MSFragger

Ultrafast, comprehensive peptide identification for mass spectrometry–based proteomics
https://msfragger.nesvilab.org

Question about PTM format #311

Closed JannikSchneider12 closed 8 months ago

JannikSchneider12 commented 8 months ago

Hello everyone,

I want to convert modifications from Unimod into a valid format that I can use with MSFragger, and I have some questions about how residues are specified.

If I have a modification that can occur at a terminus as well as on specific residues, is it feasible to just write the residues and e.g. an "n" inside, or do "n" and "c" affect the neighboring residues and refer only to them?

E.g. if I have this entry about Acetyl from Unimod:

[Term]
id: UNIMOD:1
name: Acetyl
def: "Acetylation." [RESID:AA0048, RESID:AA0049, RESID:AA0041, RESID:AA0052, RESID:AA0364, RESID:AA0056, RESID:AA0046, RESID:AA0051, RESID:AA0045, RESID:AA0354, RESID:AA0044, RESID:AA0043, PMID:11999733, URL:http\://www.ionsource.com/Card/acetylation/acetylation.htm, RESID:AA0055, PMID:14730666, PMID:15350136, RESID:AA0047, PMID:12175151, PMID:11857757, RESID:AA0042, RESID:AA0050, RESID:AA0053, RESID:AA0054, FindMod:ACET, UNIMODURL:http\://www.unimod.org/modifications_view.php?editid1=1]
xref: record_id "1"
xref: delta_mono_mass "42.010565"
xref: delta_avge_mass "42.0367"
xref: delta_composition "H(2) C(2) O"
xref: username_of_poster "unimod"
xref: group_of_poster "admin"
xref: date_time_posted "2002-08-19 19:17:11"
xref: date_time_modified "2017-11-08 16:08:56"
xref: approved "1"
xref: spec_1_group "1"
xref: spec_1_hidden "0"
xref: spec_1_site "K"
xref: spec_1_position "Anywhere"
xref: spec_1_classification "Multiple"
xref: spec_1_misc_notes "PT and GIST acetyl light"
xref: spec_2_group "2"
xref: spec_2_hidden "0"
xref: spec_2_site "N-term"
xref: spec_2_position "Any N-term"
xref: spec_2_classification "Multiple"
xref: spec_2_misc_notes "GIST acetyl light"
xref: spec_3_group "3"
xref: spec_3_hidden "1"
xref: spec_3_site "C"
xref: spec_3_position "Anywhere"
xref: spec_3_classification "Post-translational"
xref: spec_4_group "4"
xref: spec_4_hidden "1"
xref: spec_4_site "S"
xref: spec_4_position "Anywhere"
xref: spec_4_classification "Post-translational"
xref: spec_5_group "5"
xref: spec_5_hidden "0"
xref: spec_5_site "N-term"
xref: spec_5_position "Protein N-term"
xref: spec_5_classification "Post-translational"
xref: spec_6_group "6"
xref: spec_6_hidden "1"
xref: spec_6_site "T"
xref: spec_6_position "Anywhere"
xref: spec_6_classification "Post-translational"
xref: spec_7_group "7"
xref: spec_7_hidden "1"
xref: spec_7_site "Y"
xref: spec_7_position "Anywhere"
xref: spec_7_classification "Chemical derivative"
xref: spec_7_misc_notes "O-acetyl"
xref: spec_8_group "8"
xref: spec_8_hidden "1"
xref: spec_8_site "H"
xref: spec_8_position "Anywhere"
xref: spec_8_classification "Chemical derivative"
xref: spec_9_group "9"
xref: spec_9_hidden "1"
xref: spec_9_site "R"
xref: spec_9_position "Anywhere"
xref: spec_9_classification "Artefact"
xref: spec_9_misc_notes "glyoxal-derived hydroimidazolone"
is_a: UNIMOD:0 ! unimod root node

Would it be correct to specify the modification like this?

42.010565 CHKRSTY[n 3

Or if I had a modification that can affect a "K" residue as well as any N-term, would this be correct?

226.077598 Kn

Or would this mean that only an N-terminal K is affected, rather than "K or any N-term" as intended?

Or do I need several separate modifications, one for the terminus and one for the residues only?

Thank you very much for your time and help

fcyu commented 8 months ago

is it feasible to just write the residues and e.g. an "n" inside, or do "n" and "c" affect the neighboring residues and refer only to them?

n and c are "modifiers", not "real residues". For the peptide N-term, you should use n^.

Would it be correct to specify the modification like this? 42.010565 CHKRSTY[n 3

No, it should be 42.010565 CHKRSTYn^ 3, because the sites include the peptide N-term.

Or if I would have something like this that can affect a "K" residue as well as any N-term, would this be correct?

226.077598 Kn

No, it should be 226.077598 Kn^.

I just answered a similar question at https://github.com/Nesvilab/MSFragger/issues/308; I think it answers your questions as well.
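For anyone scripting this conversion, here is a minimal Python sketch of the mapping described in this thread: single-letter Unimod sites are concatenated, and any "N-term" site is written as n^. The function names are hypothetical and are not part of MSFragger or Unimod. The sketch assumes the xref keys look like those in the Acetyl entry quoted in the question, treats the trailing 3 as a per-peptide occurrence limit (an assumption), includes hidden sites, does not distinguish "Protein N-term" from "Any N-term", and ignores C-terminal sites.

```python
import re

def parse_unimod_stanza(stanza):
    """Extract the monoisotopic mass and (site, position) pairs from one Unimod [Term] stanza."""
    # The mass is kept as a string so it is written out exactly as given in Unimod.
    mass = re.search(r'xref: delta_mono_mass "([^"]+)"', stanza).group(1)
    sites = dict(re.findall(r'xref: spec_(\d+)_site "([^"]*)"', stanza))
    positions = dict(re.findall(r'xref: spec_(\d+)_position "([^"]*)"', stanza))
    specs = [(sites[n], positions.get(n, "")) for n in sorted(sites, key=int)]
    return mass, specs

def msfragger_sites(specs):
    """Build the site string: residues in alphabetical order, then n^ if any N-term site exists.

    Assumption: every "N-term" site maps to n^ (peptide N-term), as in the reply above.
    """
    residues = sorted({site for site, _ in specs if len(site) == 1 and site.isalpha()})
    has_nterm = any(site == "N-term" for site, _ in specs)
    return "".join(residues) + ("n^" if has_nterm else "")

def variable_mod_entry(stanza, max_occurrences=3):
    """Return '<mass> <sites> <max_occurrences>' for one stanza (max_occurrences is an assumed meaning)."""
    mass, specs = parse_unimod_stanza(stanza)
    return f"{mass} {msfragger_sites(specs)} {max_occurrences}"

# Shortened, made-up stanza that uses the same xref keys as the Acetyl entry.
stanza = (
    'xref: delta_mono_mass "42.010565" '
    'xref: spec_1_site "K" xref: spec_1_position "Anywhere" '
    'xref: spec_2_site "N-term" xref: spec_2_position "Any N-term" '
    'xref: spec_3_site "S" xref: spec_3_position "Anywhere"'
)
print(variable_mod_entry(stanza))  # 42.010565 KSn^ 3
```

Applied to the nine sites of the full Acetyl entry, msfragger_sites gives CHKRSTYn^, matching the corrected string above. Whether hidden sites should be searched at all is a separate decision that this sketch does not make.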

Best,

Fengchao