levitsky / pyteomics

Pyteomics is a collection of lightweight and handy tools for Python that help to handle various sorts of proteomics data. Pyteomics provides a growing set of modules to facilitate the most common tasks in proteomics data analysis.
http://pyteomics.readthedocs.io
Apache License 2.0

Proforma modification hashing and other fixes #63

Closed by mobiusklein 2 years ago

mobiusklein commented 2 years ago

This PR introduces an approach to #62 and fixes the following issues:

  1. GenericModification incorrectly defaulted to just using UnimodResolver instead of the cascade of different resolvers. It now properly cascades.
  2. When a tag is assigned a group_id and also has an extra tag that specifies the same group, the main tag no longer renders its group_id, since the extra tag already renders it (along with other information).
  3. A few more alternative spellings of monosaccharides are now disambiguated, at the cost of a slightly more complicated tokenizer that has to deal with the interspersed digits.
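
To illustrate what "cascading" means in item 1, here is a minimal, self-contained sketch of trying several resolvers in order and using the first one that recognizes a name. It is not the pyteomics implementation: `DictResolver`, `resolve_with_cascade`, the resolver labels, and the `ExampleMod` entry are all hypothetical stand-ins invented for this example, and failure is assumed to be signalled with `KeyError`.

```python
class DictResolver:
    """Hypothetical resolver backed by a plain dict (stand-in for a real database lookup)."""

    def __init__(self, name, table):
        self.name = name
        self.table = table

    def resolve(self, key):
        # Raises KeyError when this resolver does not know the modification.
        return self.table[key]


def resolve_with_cascade(key, resolvers):
    """Try each resolver in order and return (resolver name, definition) from the first hit."""
    for resolver in resolvers:
        try:
            return resolver.name, resolver.resolve(key)
        except KeyError:
            continue  # fall through to the next resolver in the cascade
    raise KeyError(key)


# Illustrative tables only; "ExampleMod" and its mass are made up for the demo.
unimod_like = DictResolver("unimod", {"Oxidation": {"mass": 15.994915}})
fallback = DictResolver("other", {"ExampleMod": {"mass": 1.0}})
cascade = [unimod_like, fallback]

print(resolve_with_cascade("Oxidation", cascade))
print(resolve_with_cascade("ExampleMod", cascade))
```

With only the first resolver in the list, the second lookup would fail, which is the kind of behavior item 1 describes fixing.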
levitsky commented 2 years ago

Thank you. I can't really assess and audit these changes but I'm happy to accept them if you're done.

mobiusklein commented 2 years ago

I added a bit more documentation to describe the change. I may come back and add more later but this is good to go functionally.

Thank you