marrink-lab / vermouth-martinize

Describe and apply transformation on molecular structures and topologies
Apache License 2.0

Autogenerate modification auto-mappings #604

Open pckroon opened 1 week ago

fgrunewald commented 1 week ago

@pckroon this seems to cause a flood of warnings:

 WARNING - unmapped-atom - Can't find modification mappings for the modifications ['N-ter']. The following modification mappings are known: {'C-ter': <vermouth.map_parser.Mapping object at 0x164a85bb0>, 'COOH-ter': <vermouth.map_parser.Mapping object at 0x164a85d00>, 'N-ter': <vermouth.map_parser.Mapping object at 0x164a85e50>, 'NH2-ter': <vermouth.map_parser.Mapping object at 0x164a85fa0>, 'CCAP-ter': <vermouth.map_parser.Mapping object at 0x164b14130>, 'NCAP-ter': <vermouth.map_parser.Mapping object at 0x164b14280>, 'GLU-HE1': <vermouth.map_parser.Mapping object at 0x164b143d0>, 'GLU-HE2': <vermouth.map_parser.Mapping object at 0x164b14520>, 'ASP-HD2': <vermouth.map_parser.Mapping object at 0x164b14670>, 'ASP-HD1': <vermouth.map_parser.Mapping object at 0x164b147c0>, 'LYS-HZ3': <vermouth.map_parser.Mapping object at 0x164b14910>, 'LYS-LSN': <vermouth.map_parser.Mapping object at 0x164b14a60>, 'HIS-HP': <vermouth.map_parser.Mapping object at 0x164b14bb0>, 'HIS-HD': <vermouth.map_parser.Mapping object at 0x164b14d00>, 'HIS-HE': <vermouth.map_parser.Mapping object at 0x164b14e50>, 'TYRPHOS': <vermouth.map_parser.Mapping object at 0x164b14fa0>}
 WARNING - unmapped-atom - Can't find modification mappings for the modifications ['N-ter']. The following modification mappings are known: {'C-ter': <vermouth.map_parser.Mapping object at 0x164a85bb0>, 'COOH-ter': <vermouth.map_parser.Mapping object at 0x164a85d00>, 'N-ter': <vermouth.map_parser.Mapping object at 0x164a85e50>, 'NH2-ter': <vermouth.map_parser.Mapping object at 0x164a85fa0>, 'CCAP-ter': <vermouth.map_parser.Mapping object at 0x164b14130>, 'NCAP-ter': <vermouth.map_parser.Mapping object at 0x164b14280>, 'GLU-HE1': <vermouth.map_parser.Mapping object at 0x164b143d0>, 'GLU-HE2': <vermouth.map_parser.Mapping object at 0x164b14520>, 'ASP-HD2': <vermouth.map_parser.Mapping object at 0x164b14670>, 'ASP-HD1': <vermouth.map_parser.Mapping object at 0x164b147c0>, 'LYS-HZ3': <vermouth.map_parser.Mapping object at 0x164b14910>, 'LYS-LSN': <vermouth.map_parser.Mapping object at 0x164b14a60>, 'HIS-HP': <vermouth.map_parser.Mapping object at 0x164b14bb0>, 'HIS-HD': <vermouth.map_parser.Mapping object at 0x164b14d00>, 'HIS-HE': <vermouth.map_parser.Mapping object at 0x164b14e50>, 'TYRPHOS': <vermouth.map_parser.Mapping object at 0x164b14fa0>}
 WARNING - unmapped-atom - Can't find modification mappings for the modifications ['C-ter']. The following modification mappings are known: {'C-ter': <vermouth.map_parser.Mapping object at 0x164a85bb0>, 'COOH-ter': <vermouth.map_parser.Mapping object at 0x164a85d00>, 'N-ter': <vermouth.map_parser.Mapping object at 0x164a85e50>, 'NH2-ter': <vermouth.map_parser.Mapping object at 0x164a85fa0>, 'CCAP-ter': <vermouth.map_parser.Mapping object at 0x164b14130>, 'NCAP-ter': <vermouth.map_parser.Mapping object at 0x164b14280>, 'GLU-HE1': <vermouth.map_parser.Mapping object at 0x164b143d0>, 'GLU-HE2': <vermouth.map_parser.Mapping object at 0x164b14520>, 'ASP-HD2': <vermouth.map_parser.Mapping object at 0x164b14670>, 'ASP-HD1': <vermouth.map_parser.Mapping object at 0x164b147c0>, 'LYS-HZ3': <vermouth.map_parser.Mapping object at 0x164b14910>, 'LYS-LSN': <vermouth.map_parser.Mapping object at 0x164b14a60>, 'HIS-HP': <vermouth.map_parser.Mapping object at 0x164b14bb0>, 'HIS-HD': <vermouth.map_parser.Mapping object at 0x164b14d00>, 'HIS-HE': <vermouth.map_parser.Mapping object at 0x164b14e50>, 'TYRPHOS': <vermouth.map_parser.Mapping object at 0x164b14fa0>}
 WARNING - unmapped-atom - Can't find modification mappings for the modifications ['C-ter']. The following modification mappings are known: {'C-ter': <vermouth.map_parser.Mapping object at 0x164a85bb0>, 'COOH-ter': <vermouth.map_parser.Mapping object at 0x164a85d00>, 'N-ter': <vermouth.map_parser.Mapping object at 0x164a85e50>, 'NH2-ter': <vermouth.map_parser.Mapping object at 0x164a85fa0>, 'CCAP-ter': <vermouth.map_parser.Mapping object at 0x164b14130>, 'NCAP-ter': <vermouth.map_parser.Mapping object at 0x164b14280>, 'GLU-HE1': <vermouth.map_parser.Mapping object at 0x164b143d0>, 'GLU-HE2': <vermouth.map_parser.Mapping object at 0x164b14520>, 'ASP-HD2': <vermouth.map_parser.Mapping object at 0x164b14670>, 'ASP-HD1': <vermouth.map_parser.Mapping object at 0x164b147c0>, 'LYS-HZ3': <vermouth.map_parser.Mapping object at 0x164b14910>, 'LYS-LSN': <vermouth.map_parser.Mapping object at 0x164b14a60>, 'HIS-HP': <vermouth.map_parser.Mapping object at 0x164b14bb0>, 'HIS-HD': <vermouth.map_parser.Mapping object at 0x164b14d00>, 'HIS-HE': <vermouth.map_parser.Mapping object at 0x164b14e50>, 'TYRPHOS': <vermouth.map_parser.Mapping object at 0x164b14fa0>}
 WARNING - unmapped-atom - Can't find modification mappings for the modifications ['C-ter']. The following modification mappings are known: {'C-ter': <vermouth.map_parser.Mapping object at 0x164a85bb0>, 'COOH-ter': <vermouth.map_parser.Mapping object at 0x164a85d00>, 'N-ter': <vermouth.map_parser.Mapping object at 0x164a85e50>, 'NH2-ter': <vermouth.map_parser.Mapping object at 0x164a85fa0>, 'CCAP-ter': <vermouth.map_parser.Mapping object at 0x164b14130>, 'NCAP-ter': <vermouth.map_parser.Mapping object at 0x164b14280>, 'GLU-HE1': <vermouth.map_parser.Mapping object at 0x164b143d0>, 'GLU-HE2': <vermouth.map_parser.Mapping object at 0x164b14520>, 'ASP-HD2': <vermouth.map_parser.Mapping object at 0x164b14670>, 'ASP-HD1': <vermouth.map_parser.Mapping object at 0x164b147c0>, 'LYS-HZ3': <vermouth.map_parser.Mapping object at 0x164b14910>, 'LYS-LSN': <vermouth.map_parser.Mapping object at 0x164b14a60>, 'HIS-HP': <vermouth.map_parser.Mapping object at 0x164b14bb0>, 'HIS-HD': <vermouth.map_parser.Mapping object at 0x164b14d00>, 'HIS-HE': <vermouth.map_parser.Mapping object at 0x164b14e50>, 'TYRPHOS': <vermouth.map_parser.Mapping object at 0x164b14fa0>}
 WARNING - unmapped-atom - These atoms are not covered by a mapping. Either your mappings don't describe all atoms (bad idea), or, there's no mapping available for all residues. ['171A-ASN21:OXT', '412B-ALA30:OXT'

pckroon commented 1 week ago

Alright, that is a problem, since the modification mappings are known. I'll dig into it a bit.
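
For illustration only: the warnings above list 'N-ter' among the known mappings while also saying it can't be found. That pattern typically shows up when the lookup key differs from the dict key in type or shape, for example a tuple versus a string. Whether that is the actual cause here is an assumption. The sketch below uses plain dicts and hypothetical names (`known_mappings`, `find_mapping`); it is not vermouth's API. It also reports only the sorted names instead of the whole dict, which keeps the message short.

```python
# Hypothetical stand-in for the known modification mappings (not vermouth's data).
known_mappings = {"C-ter": object(), "N-ter": object()}


def find_mapping(known, key):
    """Return (mapping, message); message is None when the lookup succeeds."""
    if key in known:
        return known[key], None
    # Report names only, so the message does not dump every object repr.
    message = f"Can't find modification mappings for {key!r}. Known: {sorted(known)}"
    return None, message


# A string key matches; a 1-tuple holding the same name does not.
_, ok_message = find_mapping(known_mappings, "N-ter")
_, tuple_message = find_mapping(known_mappings, ("N-ter",))
print(tuple_message)
```

Here the tuple lookup fails even though a mapping with that name exists, which matches the shape of the warning; the string lookup succeeds.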

pckroon commented 1 week ago

It currently produces the following errors:

   ERROR - general - The following atoms do not have a atype: [
{'PTM_atom': True, 'element': 'H', 'replace': {'atomname': 'HN2'}, 'order': 0, 'atomname': 'HN2', 'modifications': [<Modification "N-ter" at 0x7fc99e8a1910>, <Modification "('N-ter',)" at 0x7fc99880fe10>], 'graph': <vermouth.molecule.Molecule object at 0x7fc997a91890>, 'mapping_weights': {607: 1}, 'resname': 'MET', 'chain': 'A', 'resid': 1, '_old_resid': 1, 'position': array([nan, nan, nan])}, 
{'PTM_atom': True, 'element': 'H', 'replace': {'atomname': 'HN3'}, 'order': 0, 'atomname': 'HN3', 'modifications': [<Modification "N-ter" at 0x7fc99e8a1910>, <Modification "('N-ter',)" at 0x7fc99880fe10>], 'graph': <vermouth.molecule.Molecule object at 0x7fc997a91b10>, 'mapping_weights': {610: 1}, 'resname': 'MET', 'chain': 'A', 'resid': 1, '_old_resid': 1, 'position': array([nan, nan, nan])}, 
{'PTM_atom': True, 'element': 'O', 'order': 0, 'atomname': 'OXT', 'modifications': [<Modification "C-ter" at 0x7fc99e898850>, <Modification "('C-ter',)" at 0x7fc99880e710>], 'graph': <vermouth.molecule.Molecule object at 0x7fc997a91650>, 'mapping_weights': {601: 1}, 'resname': 'GLY', 'chain': 'A', 'resid': 76, '_old_resid': 76, 'position': array([4.0862, 3.9575, 3.6251])}]

   ERROR - general - The following atoms do not have a charge_group: [
{'PTM_atom': True, 'element': 'H', 'replace': {'atomname': 'HN2'}, 'order': 0, 'atomname': 'HN2', 'modifications': [<Modification "N-ter" at 0x7fc99e8a1910>, <Modification "('N-ter',)" at 0x7fc99880fe10>], 'graph': <vermouth.molecule.Molecule object at 0x7fc997a91890>, 'mapping_weights': {607: 1}, 'resname': 'MET', 'chain': 'A', 'resid': 1, '_old_resid': 1, 'position': array([nan, nan, nan])}, 
{'PTM_atom': True, 'element': 'H', 'replace': {'atomname': 'HN3'}, 'order': 0, 'atomname': 'HN3', 'modifications': [<Modification "N-ter" at 0x7fc99e8a1910>, <Modification "('N-ter',)" at 0x7fc99880fe10>], 'graph': <vermouth.molecule.Molecule object at 0x7fc997a91b10>, 'mapping_weights': {610: 1}, 'resname': 'MET', 'chain': 'A', 'resid': 1, '_old_resid': 1, 'position': array([nan, nan, nan])}, 
{'PTM_atom': True, 'element': 'O', 'order': 0, 'atomname': 'OXT', 'modifications': [<Modification "C-ter" at 0x7fc99e898850>, <Modification "('C-ter',)" at 0x7fc99880e710>], 'graph': <vermouth.molecule.Molecule object at 0x7fc997a91650>, 'mapping_weights': {601: 1}, 'resname': 'GLY', 'chain': 'A', 'resid': 76, '_old_resid': 76, 'position': array([4.0862, 3.9575, 3.6251])}]

The missing atype is a data issue in the modifications, and the missing charge_group is probably an issue with the mapping processor (which makes me very sad). Also note that some of these atoms have NaN positions.
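
A minimal sketch of the kind of check that produces reports like the ones above: walk over atom attribute dicts and collect the atoms that lack a required attribute or have NaN coordinates. The atoms are plain dicts loosely modelled on the log output; the function name `find_incomplete_atoms` and the atom type value `"CT1"` are made up for the example, and this is not the code vermouth runs.

```python
import math


def find_incomplete_atoms(atoms, required=("atype", "charge_group")):
    """Map each required attribute to the names of atoms lacking it.

    Atoms with a missing position or NaN coordinates are listed under 'position'.
    """
    problems = {attr: [] for attr in required}
    problems["position"] = []
    for atom in atoms:
        name = atom.get("atomname", "?")
        for attr in required:
            if attr not in atom:
                problems[attr].append(name)
        position = atom.get("position")
        if position is None or any(math.isnan(x) for x in position):
            problems["position"].append(name)
    return problems


nan = float("nan")
atoms = [
    {"atomname": "HN2", "resname": "MET", "position": (nan, nan, nan)},
    {"atomname": "OXT", "resname": "GLY", "position": (4.0862, 3.9575, 3.6251)},
    # Hypothetical complete atom, for contrast.
    {"atomname": "CA", "resname": "GLY", "atype": "CT1", "charge_group": 1,
     "position": (0.0, 0.0, 0.0)},
]
report = find_incomplete_atoms(atoms)
print(report)
```

Reporting positions separately from the attribute checks makes it easier to tell a data problem in the modification definitions apart from atoms that were added without coordinates.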

pckroon commented 1 week ago

Alright, this should fix most of the mess, at least regarding charge groups. The AA modifications are still missing some critical data, such as charge and atom types. I don't have the time to fix the patch test coverage.

pckroon commented 22 hours ago

> But I assume you tested this? Perhaps it's worth adding an integration test instead of patching the code coverage with unit tests.

I tested this roughly (make an atomistic 1ubq, see if the resulting itp looks mostly reasonable by staring at it). If/when you have modifications with appropriate parameters I'd be happy to run it again, check it more thoroughly, and add that as an integration test.

fgrunewald commented 21 hours ago

@pckroon we're almost there, just one ingredient is missing: we need to patch the rtp parser to generate all missing dihedrals and pairs.

Do you want to take a stab at it? I'm not asking for a complete rewrite, just a patch.

pckroon commented 21 hours ago

Doesn't https://github.com/marrink-lab/vermouth-martinize/blob/master/vermouth/gmx/rtp.py#L257 already generate all the dihedrals? (Should, anyway). About the pairs, are those the ones meant by "TODO: generate 1-4 interactions between pairs of hydrogen atoms"?

fgrunewald commented 21 hours ago

For lysozyme I'm missing something like 4000 dihedrals, all angles, and a bunch of bonds. I think Jon just makes the dihedrals for links? Also, for CHARMM the cmaps are missing.
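
For reference, a generic sketch of what "generate all angles, dihedrals and pairs" means when only the bonds are given: enumerate them from bond connectivity. This is not the code in vermouth's rtp.py; all function names are made up for the example, atoms are plain integer indices, and impropers and cmaps are not covered. Dropping 1-4 pairs whose atoms are also bonded or share a neighbour (as happens in small rings) is an assumption of this sketch.

```python
from collections import defaultdict


def build_adjacency(bonds):
    """Turn a list of (i, j) bonds into a symmetric adjacency dict."""
    adjacency = defaultdict(set)
    for a, b in bonds:
        adjacency[a].add(b)
        adjacency[b].add(a)
    return dict(adjacency)


def enumerate_angles(adjacency):
    """All (a, center, c) triples with a < c, both bonded to center."""
    angles = set()
    for center, neighbours in adjacency.items():
        ordered = sorted(neighbours)
        for i, a in enumerate(ordered):
            for c in ordered[i + 1:]:
                angles.add((a, center, c))
    return angles


def enumerate_dihedrals(adjacency):
    """All proper dihedrals (a, b, c, d) around each central bond b-c."""
    dihedrals = set()
    for b, neighbours in adjacency.items():
        for c in neighbours:
            if b >= c:
                continue  # visit each central bond once
            for a in adjacency[b] - {c}:
                for d in adjacency[c] - {b, a}:
                    dihedrals.add((a, b, c, d))
    return dihedrals


def enumerate_pairs(dihedrals, adjacency):
    """1-4 pairs from dihedral end atoms, skipping 1-2 and 1-3 neighbours."""
    pairs = set()
    for a, _, _, d in dihedrals:
        if d in adjacency[a] or adjacency[a] & adjacency[d]:
            continue
        pairs.add((min(a, d), max(a, d)))
    return pairs


# Chain 1-2-3-4 with a branch atom 5 on atom 2.
bonds = [(1, 2), (2, 3), (3, 4), (2, 5)]
adjacency = build_adjacency(bonds)
angles = enumerate_angles(adjacency)
dihedrals = enumerate_dihedrals(adjacency)
pairs = enumerate_pairs(dihedrals, adjacency)
print(sorted(angles), sorted(dihedrals), sorted(pairs))
```

Running the same enumeration over a whole molecule rather than only over links would give a count to compare against the missing angles and dihedrals.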