smith-chem-wisc / MetaMorpheus

Proteomics search software with integrated calibration, PTM discovery, bottom-up, top-down and LFQ capabilities
MIT License

Checkbox or similar to (dis)allow mods on peptide termini (site of protease activity) #1611

Open trishorts opened 5 years ago

trishorts commented 5 years ago

For example: https://twitter.com/NorSivaeb/status/1113123527492866048?s=09 and https://twitter.com/NorSivaeb/status/1113124343037464577?s=09

acesnik commented 5 years ago

Shouldn't these be easy to validate? Can you do a group FDR for these types of peptides and show that they're actually false at a higher rate?
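A rough sketch of what such a group-wise check could look like, assuming a simple decoys/targets estimate per group. The group labels and the `(group, is_decoy)` input shape are hypothetical and not taken from MetaMorpheus, whose own FDR calculation may differ.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple


def group_fdr(psms: Iterable[Tuple[str, bool]]) -> Dict[str, float]:
    """Estimate FDR per group as (#decoy hits) / (#target hits).

    psms: iterable of (group_label, is_decoy) pairs. The labels are
    hypothetical, e.g. "terminal_mod" for peptides carrying a mod on the
    residue at the protease cleavage site and "other" for everything else.
    """
    targets: Dict[str, int] = defaultdict(int)
    decoys: Dict[str, int] = defaultdict(int)
    for group, is_decoy in psms:
        if is_decoy:
            decoys[group] += 1
        else:
            targets[group] += 1

    result: Dict[str, float] = {}
    for group in set(targets) | set(decoys):
        n_targets = targets[group]
        # With no target hits the ratio is undefined, so report NaN.
        result[group] = decoys[group] / n_targets if n_targets else float("nan")
    return result


# Made-up counts purely to show the call shape.
example = (
    [("terminal_mod", False)] * 4
    + [("terminal_mod", True)] * 2
    + [("other", False)] * 10
    + [("other", True)] * 1
)
rates = group_fdr(example)
```

If the terminal-mod group shows a clearly higher estimated rate than the rest on real data, that would support treating those assignments as suspect.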

trishorts commented 5 years ago

It's less a desire to filter out incorrectly placed mods than a desire to prevent erroneous assignment of mods to locations that are impossible. As I see it, this issue needs more discussion prior to implementation. Currently we don't do anything, and such mods, if found through GPTMD, are reported as target hits. If, as I believe, no such peptide exists, then that is a knowingly wrong output, which is bad for the unsuspecting user and bad for the suspecting user because they might think we're ignorant. Furthermore, we might get a better assignment if we prevent a bad one. We already saw that happen when we added notches instead of a totally open search.

trishorts commented 5 years ago

Found this on the X!Tandem page (unsurprisingly): "It is possible to specify that a variable modification NOT occur at the C-terminus of a peptide. For example, previously "42.010565@K" would have been used to test for K acetylation. Using the new notation, "42.010565@]K" can be used, which will not test C-terminal lysines for acetylation (which are chemically impossible for tryptic peptides). This notation is useful for most lysine post-translational modifications, as well as dimethyl-arginine. Note: monomethyl-arginine and -lysine are both susceptible to trypsin cleavage, so this notation is not recommended for monomethyl variable modifications. It is also not recommended for use with carbamylation — a urea artifact that can occur during tryptic digestion — although reducing the number of carbamylations allowed per peptide, e.g., "43.005814@1K", can be quite useful."
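To make the quoted notation concrete, here is a minimal sketch that parses only the three forms mentioned in the quote ("mass@K", "mass@]K", "mass@1K") and lists the sites that would still be tested. The names `ModSpec`, `parse_mod_spec` and `candidate_sites` are made up for illustration; this is neither X!Tandem's full grammar nor MetaMorpheus code.

```python
import re
from dataclasses import dataclass
from typing import List, Optional

# Covers only the forms quoted above: "<mass>@K", "<mass>@]K", "<mass>@<n>K".
# Anything else in the real X!Tandem syntax is deliberately not handled.
_SPEC = re.compile(
    r"^(?P<mass>-?\d+(?:\.\d+)?)@(?P<flag>\]|\d+)?(?P<residue>[A-Z])$"
)


@dataclass
class ModSpec:
    mass: float
    residue: str
    exclude_c_term: bool = False           # set by the "]" prefix
    max_per_peptide: Optional[int] = None  # set by a numeric prefix


def parse_mod_spec(text: str) -> ModSpec:
    match = _SPEC.match(text)
    if match is None:
        raise ValueError(f"unsupported modification spec: {text!r}")
    flag = match.group("flag")
    return ModSpec(
        mass=float(match.group("mass")),
        residue=match.group("residue"),
        exclude_c_term=(flag == "]"),
        max_per_peptide=int(flag) if flag and flag.isdigit() else None,
    )


def candidate_sites(peptide: str, spec: ModSpec) -> List[int]:
    """Return 0-based indices of residues that may carry the modification."""
    last = len(peptide) - 1
    return [
        i
        for i, aa in enumerate(peptide)
        if aa == spec.residue and not (spec.exclude_c_term and i == last)
    ]


# A tryptic-looking peptide with an internal and a C-terminal lysine.
unrestricted = candidate_sites("AKDEK", parse_mod_spec("42.010565@K"))
no_c_term = candidate_sites("AKDEK", parse_mod_spec("42.010565@]K"))
```

A checkbox in the GUI could map onto something like the `exclude_c_term` flag here, applied per modification so that cases such as monomethylation and carbamylation, which the quote warns about, can be left unrestricted.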