searlelab / chronologer

Retention time prediction
Apache License 2.0

Undefined and missing modifications #1

Open PigeonMark opened 1 year ago

PigeonMark commented 1 year ago

While processing your dataset, I encountered a couple of inconsistencies between the dataset and the overview of the PTMs in the README.

There are 1746 peptides with an acetylation annotated on the N-terminal methionine (e.g. M[+42.010565]APGQLALFSVSDK), which to my knowledge is not possible as a side-chain modification. There is also 1 peptide with an acetylation annotated on the N-terminal glutamine (Q[+42.010565]PAAPP). Is this a different modification on the side chain of M and Q, or is it an N-terminal acetylation instead?
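For reference, a count like the one above can be reproduced with a short script. This is only a minimal sketch: it assumes the peptides are available as plain strings in the `X[+mass]` notation shown in the examples, and the names `MOD_PATTERN`, `first_residue_mod` and `count_first_residue_shift` are hypothetical helpers, not part of Chronologer.

```python
import re
from collections import Counter

# Matches one residue letter followed by a bracketed mass shift, e.g. M[+42.010565].
# Assumption: the dataset uses exactly this notation.
MOD_PATTERN = re.compile(r"([A-Z])\[([+-]?\d+(?:\.\d+)?)\]")

def first_residue_mod(peptide):
    """Return (residue, mass_shift) if the first residue is modified, else None."""
    match = MOD_PATTERN.match(peptide)  # match() anchors at the start of the string
    if match is None:
        return None
    return match.group(1), float(match.group(2))

def count_first_residue_shift(peptides, shift=42.010565, tol=1e-4):
    """Count, per residue, peptides whose first residue carries the given mass shift."""
    counts = Counter()
    for peptide in peptides:
        hit = first_residue_mod(peptide)
        if hit is not None and abs(hit[1] - shift) < tol:
            counts[hit[0]] += 1
    return counts

example = ["M[+42.010565]APGQLALFSVSDK", "Q[+42.010565]PAAPP", "PEPTIDE"]
nterm_counts = count_first_residue_shift(example)
```

On the full dataset, the same call should show which first residues carry the +42.010565 shift and how often.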

A second issue is that Cyclized S-CAM-Cys [+39.994915] is not in the dataset. Instead, C[-17.0] is given. The -17.0 is the mass shift due to the cyclization (loss of ammonia) of carbamidomethylated cysteine, whereas +39.994915 is the net mass difference relative to unmodified cysteine.
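The relation between the two values can be checked with simple arithmetic. The sketch below uses the standard monoisotopic masses of carbamidomethylation (+57.021464) and ammonia (17.026549); the variable names are only illustrative.

```python
# Monoisotopic mass shifts in Da (standard values, not taken from the dataset).
CARBAMIDOMETHYL = 57.021464   # carbamidomethylation of cysteine (C2H3NO)
AMMONIA_LOSS = -17.026549     # loss of NH3 on cyclization

# Net shift of cyclized S-CAM-Cys relative to unmodified cysteine.
net_shift = round(CARBAMIDOMETHYL + AMMONIA_LOSS, 6)
print(net_shift)
```

So C[-17.0] appears to be relative to carbamidomethylated cysteine, while the README value is relative to unmodified cysteine.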

Finally, the dataset contains phosphorylation only on serine (S), and not on threonine (T) or tyrosine (Y) as listed in the README. Are these expected to be missing?
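A per-residue tally of a given mass shift makes this kind of check easy to repeat. The following is a rough sketch under two assumptions: that modifications are written as `X[+mass]` as in the examples above, and that phosphorylation is annotated with its standard monoisotopic shift of +79.966331 (the exact notation in the dataset may differ). `residues_with_shift` is a hypothetical helper name.

```python
import re
from collections import Counter

# Residue letter followed by a bracketed mass shift (assumed dataset notation).
MOD_PATTERN = re.compile(r"([A-Z])\[([+-]?\d+(?:\.\d+)?)\]")

def residues_with_shift(peptides, shift, tol=1e-3):
    """Count how often each residue carries a modification with the given mass shift."""
    counts = Counter()
    for peptide in peptides:
        for residue, mass in MOD_PATTERN.findall(peptide):
            if abs(float(mass) - shift) < tol:
                counts[residue] += 1
    return counts

# Toy input for illustration only; these are not peptides from the dataset.
toy = ["AS[+79.966331]K", "T[+79.966331]PEPS[+79.966331]R", "M[+42.010565]A"]
phospho_counts = residues_with_shift(toy, 79.966331)
```

Run on the full dataset, a result containing only `S` would confirm the observation above.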

Some more small inconsistencies: