forlilab / Meeko

Interface for AutoDock, molecule parameterization
https://meeko.readthedocs.io/
GNU Lesser General Public License v2.1

Add residue templates to disambiguate Amber resname and CCD names #216

Closed rwxayheee closed 3 weeks ago

rwxayheee commented 3 weeks ago

This is for #210. It only includes the addition of chemical templates, and it does not change the matching of existing residue names. In this PR, 107 new ambiguous residue names and 567 unique templates are added to the default chemical template file, residue_chem_templates.json. It's also possible to distribute the new templates by library, or as a separate file.

The technical details of the additional templates are as follows:

diogomart commented 3 weeks ago

This is awesome

rwxayheee commented 3 weeks ago

I will merge a little later today, after some visual inspection. Thanks for the approval! ^^

rwxayheee commented 3 weeks ago

Some additional notes for future reference:

Since bond orders and formal charges can't be parsed from the Amber OFF lib, it's not guaranteed that the SMILES strings in the chemical templates are in the preferred/intended resonance form. For example, Amber residue 1MA: https://github.com/forlilab/Meeko/blob/ae23f949c642a17291972e70f779ea3d4ebfe8c3/meeko/data/residue_chem_templates.json#L1304-L1307

When the conjugated bond system is created (guessed), a connected graph of atoms that still need valence (with their formal charges left unchanged) is considered, and the double bonds are first placed on the longest Eulerian path with an even number of edges. This is impossible when there are more than 2 odd (1-degree) nodes. 1MA is an example that contains a subgraph with 3 odd nodes. The current compromise strategy is to remove the leaf node closest to a high-degree node. In 1MA, the valence of the removed node is compensated by increasing the order of its bond to the nitrogen in -NH2 and upcharging that nitrogen. This process doesn't pick a particular nitrogen.
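To illustrate the path condition described above, here is a minimal sketch in plain Python. It is not Meeko's code: the function names (`degrees`, `leaf_nodes`, `has_eulerian_path`) and the two toy graphs are hypothetical, and the branched graph is only a stand-in for a subgraph with three 1-degree nodes, not the actual 1MA atoms. It uses the standard rule that a connected undirected graph has an Eulerian path only if it has 0 or 2 odd-degree nodes.

```python
from collections import defaultdict


def degrees(edges):
    """Return {node: degree} for an undirected graph given as (u, v) pairs."""
    deg = defaultdict(int)
    for u, v in edges:
        deg[u] += 1
        deg[v] += 1
    return dict(deg)


def leaf_nodes(edges):
    """Return the sorted 1-degree (leaf) nodes of the graph."""
    return sorted(n for n, d in degrees(edges).items() if d == 1)


def has_eulerian_path(edges):
    """Check the odd-degree condition for an Eulerian path.

    Assumes the graph is connected; connectivity is not verified here.
    """
    n_odd = sum(1 for d in degrees(edges).values() if d % 2 == 1)
    return n_odd in (0, 2)


# Hypothetical linear chain: two leaves, so one path covers every edge.
chain = [("a", "b"), ("b", "c"), ("c", "d"), ("d", "e")]

# Hypothetical branched graph: three leaves around one high-degree node.
branched = [("a", "b"), ("b", "c"), ("b", "d")]

print(leaf_nodes(chain), has_eulerian_path(chain))
print(leaf_nodes(branched), has_eulerian_path(branched))
```

In the branched case, no single path can cover all edges, which is the situation where a leaf node has to be removed before the double bonds can be placed.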

In short, it's understood that the SMILES could sometimes be written in a different resonance form. But this doesn't seem to really affect the matching we do in Meeko.