marrink-lab / vermouth-martinize

Describe and apply transformation on molecular structures and topologies
Apache License 2.0

isomorphism propagation #601

Open csbrasnett opened 3 weeks ago

csbrasnett commented 3 weeks ago

I have a protein where I'm mutating several identical residues to the same target. One of the major slowdowns in doing this is the graph repair and, judging by the debug output, running ISMAGS in particular. From my understanding of how the mutations/modifications are dealt with in the graph_repair, I think this could be sped up if the isomorphism were recorded, and we checked whether it had already been established?

pckroon commented 3 weeks ago

We need the isomorphism between the residue in the PDB file (and those don't need to be the same ALA every time), and the block specified by the residue name. All that mutate does is change the assigned residue name. For the isomorphism we also need symmetry information, and this we do cache. You could argue that you can make the assumption that your PDB file is internally consistent. Under that assumption, you could use the last-found isomorphism (per residue type) to reorder nodes in the residue to make the isomorphism more "happy path". This is somewhere around and before repair_graph.py:270. Currently we sort the nodes by matching atom names so we can guarantee that the isomorphism we find is the one where most atom names match between the PDB residue and the reference Block. I'm not sure how bad it is to give up on that promise :) Input welcome!
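The "reuse the last-found isomorphism per residue type" idea can be sketched in plain Python. This is a minimal illustration under stated assumptions, not vermouth's code: nodes are keyed by atom name (so a cached mapping is meaningful across residues of an internally consistent PDB file), a brute-force search stands in for ISMAGS, and `make_graph`, `is_valid_isomorphism`, `brute_force_isomorphism` and `IsomorphismCache` are all hypothetical names. A cached mapping is only reused after a cheap check that it still preserves elements and bonds; otherwise the full search runs.

```python
from itertools import permutations


def make_graph(elements, bonds):
    """Toy residue graph: ({atom name: element}, {frozenset of bonded atom names})."""
    return dict(elements), {frozenset(bond) for bond in bonds}


def is_valid_isomorphism(mapping, found, reference):
    """Cheap check that `mapping` (found atom -> reference atom) is a bijection
    that preserves elements and bonds."""
    f_elems, f_bonds = found
    r_elems, r_bonds = reference
    if len(f_elems) != len(r_elems):
        return False
    if set(mapping) != set(f_elems) or set(mapping.values()) != set(r_elems):
        return False
    if any(f_elems[n] != r_elems[mapping[n]] for n in f_elems):
        return False
    mapped = {frozenset(mapping[n] for n in bond) for bond in f_bonds}
    return mapped == r_bonds


def brute_force_isomorphism(found, reference):
    """Stand-in for the expensive search (ISMAGS in the real code)."""
    f_nodes = sorted(found[0])
    for perm in permutations(sorted(reference[0])):
        mapping = dict(zip(f_nodes, perm))
        if is_valid_isomorphism(mapping, found, reference):
            return mapping
    return None


class IsomorphismCache:
    """Remembers the last isomorphism found for each residue name."""

    def __init__(self):
        self._last = {}
        self.full_searches = 0

    def match(self, resname, found, reference):
        candidate = self._last.get(resname)
        if candidate is not None and is_valid_isomorphism(candidate, found, reference):
            return candidate  # happy path: no search needed
        self.full_searches += 1
        mapping = brute_force_isomorphism(found, reference)
        if mapping is not None:
            self._last[resname] = mapping
        return mapping


# Two identical backbone fragments: only the first one triggers a full search.
reference = make_graph(
    {"N": "N", "CA": "C", "C": "C", "O": "O"},
    [("N", "CA"), ("CA", "C"), ("C", "O")],
)
cache = IsomorphismCache()
first = cache.match("GLY", reference, reference)
second = cache.match("GLY", reference, reference)
```

Note that this sketch makes no attempt to return the isomorphism with the most matching atom names; whether that guarantee can be relaxed is exactly the open question above.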

csbrasnett commented 3 weeks ago

Thanks! I'll need to spend some more time trying to understand the code better before anything more substantial.

For context, this arose because I specified -mutate A/B-ARGX:LYS for 6 residues on each unit of a 200 residue dimer. I think in the end it took martinize2 about 2/3 hours to complete, which I don't think is ideal! From my understanding of the code atm, I don't get why this isn't stored in the symmetry_cache as the residues get looped over?

pckroon commented 3 weeks ago

It does sound like there's room for improvement there ;) symmetry_cache doesn't store the isomorphism, just the internal symmetries of residue (types): the automorphisms, if you will.
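To make the distinction concrete, here is a toy sketch of automorphisms versus isomorphisms on a three-atom carboxylate-like fragment. It uses a brute-force search rather than ISMAGS, and the function name `mappings` and the atom names are assumptions chosen for illustration. The automorphisms depend only on the reference graph, so they can be cached per residue type; the isomorphism also depends on the residue found in the PDB file.

```python
from itertools import permutations


def mappings(elems_a, bonds_a, elems_b, bonds_b):
    """All element- and bond-preserving bijections from graph a to graph b (brute force)."""
    if len(elems_a) != len(elems_b):
        return []
    nodes_a = sorted(elems_a)
    target = {frozenset(bond) for bond in bonds_b}
    result = []
    for perm in permutations(sorted(elems_b)):
        m = dict(zip(nodes_a, perm))
        if any(elems_a[n] != elems_b[m[n]] for n in nodes_a):
            continue
        if {frozenset((m[a], m[b])) for a, b in bonds_a} == target:
            result.append(m)
    return result


# Reference fragment: one carbon bonded to two equivalent oxygens.
ref_elems = {"C": "C", "O1": "O", "O2": "O"}
ref_bonds = [("C", "O1"), ("C", "O2")]

# Fragment as found in an input file, with different atom names.
found_elems = {"C": "C", "O": "O", "OXT": "O"}
found_bonds = [("C", "O"), ("C", "OXT")]

# Automorphisms: the reference mapped onto itself (its internal symmetry).
automorphisms = mappings(ref_elems, ref_bonds, ref_elems, ref_bonds)

# Isomorphisms: the found fragment mapped onto the reference.
isomorphisms = mappings(found_elems, found_bonds, ref_elems, ref_bonds)
```

Here both lists have two entries: the two oxygens are interchangeable, so every isomorphism is another one composed with an automorphism of the reference. Knowing the symmetry lets a search skip equivalent answers, but it does not by itself say which found atom corresponds to which reference atom.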