danielparton closed this issue 9 years ago
Doesn't simtk.openmm.app.Modeller automatically add disulfide bonds by distance? Do we use that in the pipeline?
I thought that was done by the addHydrogens routine (a member function of simtk.openmm.app.Modeller). That accepts a variants= argument, which I was using to keep the residue variants consistent with a reference model.
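A minimal sketch of that variants= approach, for illustration only. The helper name add_hydrogens_consistently is hypothetical, and it assumes every model has the same residues in the same order as the reference. It relies on addHydrogens returning the variant it selected for each residue, in the same format the variants= argument accepts.

```python
def add_hydrogens_consistently(reference_modeller, model_modellers,
                               forcefield, pH=7.0):
    """Protonate the reference first, then reuse its variants for each model.

    Hypothetical helper. The arguments are expected to behave like
    simtk.openmm.app.Modeller objects and a ForceField; all models are
    assumed to share the reference's residue order.
    """
    # addHydrogens returns the variant chosen for each residue.
    reference_variants = reference_modeller.addHydrogens(forcefield, pH=pH)
    for modeller in model_modellers:
        # Pass the reference variants so every model gets the same states.
        modeller.addHydrogens(forcefield, pH=pH, variants=reference_variants)
    return reference_variants
```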
Unless the disulfide bond is determined when first making the Modeller object, and the addHydrogens routine simply adds hydrogens based on whether or not a disulfide bond is already present in the Modeller object?
Checking now..
Ok, so the disulfide bond is indeed defined by distance when initializing the Modeller object. The list of bonds in the topology then determines which protonation states are assigned by the addHydrogens member function.
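To illustrate why distance-based detection can give two models of the same protein different topologies, here is a rough standalone sketch. It is not OpenMM's code: the function name find_disulfide_pairs, the input format, and the 0.3 nm SG-SG cutoff are all assumptions made for the example.

```python
import math

SG_CUTOFF_NM = 0.3  # assumed SG-SG cutoff, for illustration only


def find_disulfide_pairs(sg_positions, cutoff=SG_CUTOFF_NM):
    """Pair cysteines whose SG atoms lie within `cutoff` of each other.

    sg_positions maps a residue id to the (x, y, z) position of its SG
    atom in nanometers. Each cysteine joins at most one pair, and the
    closest unpaired partner wins.
    """
    residue_ids = sorted(sg_positions)
    paired = set()
    pairs = []
    for i, first in enumerate(residue_ids):
        if first in paired:
            continue
        best, best_distance = None, cutoff
        for second in residue_ids[i + 1:]:
            if second in paired:
                continue
            distance = math.dist(sg_positions[first], sg_positions[second])
            if distance < best_distance:
                best, best_distance = second, distance
        if best is not None:
            pairs.append((first, best))
            paired.update((first, best))
    return pairs


# Two models of the same sequence with slightly different geometry.
model_a = {10: (0.0, 0.0, 0.0), 25: (0.2, 0.0, 0.0), 40: (1.0, 1.0, 1.0)}
model_b = {10: (0.0, 0.0, 0.0), 25: (0.35, 0.0, 0.0), 40: (1.0, 1.0, 1.0)}
pairs_a = find_disulfide_pairs(model_a)
pairs_b = find_disulfide_pairs(model_b)
```

In this toy case model_a gets a bond between residues 10 and 25 while model_b gets none, so the two would be protonated differently even though the sequence is identical.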
So I'll have to change the code to use the bonds data to keep disulfide bonds consistent across models.
I think there should only be a few TK targets affected by this, but I'll need to redo implicit solvent MD for them.
I'm hoping I can just copy the ._bonds list from the reference topology to all models.
Ok, so the disulfide bond is indeed defined by distance when initializing the Modeller object.
Can we also report this to the OpenMM issue tracker as a behavior we would like some way to control?
Will do. I've implemented a workaround in Ensembler for now, which seems to be working.
Actually, turns out the disulfide bond is first defined when making the app.PDBFile object, which is then used to build the app.Modeller object.
So there is a simple and non-hacky way to tackle this: store the app.PDBFile.topology object for the reference structure, and use that to make the app.Modeller object for each model.
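Roughly, that could look like the sketch below. This is not the actual Ensembler code: the function name build_modellers and the optional app parameter (there so the OpenMM app module can be swapped out) are hypothetical, and it assumes every model has exactly the same atoms in the same order as the reference structure, so that the reference topology matches each model's positions.

```python
def build_modellers(reference_pdb_path, model_pdb_paths, app=None):
    """Build one Modeller per model, all sharing the reference topology.

    Hypothetical helper. Assumes each model PDB has the same atoms, in
    the same order, as the reference PDB.
    """
    if app is None:
        # Legacy namespace, as used in this thread.
        from simtk.openmm import app

    # Bonds (including disulfides) are defined when the PDBFile is read,
    # so read the reference once and reuse its topology.
    reference_topology = app.PDBFile(reference_pdb_path).topology

    modellers = {}
    for path in model_pdb_paths:
        model_pdb = app.PDBFile(path)
        # Take positions from the model, bonds from the reference.
        modellers[path] = app.Modeller(reference_topology, model_pdb.positions)
    return modellers
```

Each resulting Modeller then sees the same bond list, so addHydrogens should assign the same cysteine states to every model.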
Awesome!
How do we choose the reference structure, and are we sure auto-detecting disulfide bonds is the right thing to do?
I'm not actually sure if intracellular kinase domains would ever have disulfide bonds due to the reducing intracellular environment.
Maybe we want to have two options? Either "automatic" (use reference structure to determine which disulfide bonds to preserve) or "reduce" (no disulfide bonds)?
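A toy sketch of how those two options might behave, using plain tuples rather than OpenMM objects. The function name apply_disulfide_policy and the bond representation (pairs of (residue_index, atom_name) tuples) are made up for the example.

```python
def apply_disulfide_policy(model_bonds, reference_disulfides, mode="automatic"):
    """Return a bond list whose disulfides follow the chosen policy.

    model_bonds: list of ((res_index, atom_name), (res_index, atom_name)).
    reference_disulfides: iterable of (res_index, res_index) pairs taken
    from the reference structure.
    mode: "automatic" keeps exactly the reference disulfides;
          "reduce" keeps none.
    """
    if mode == "automatic":
        keep = sorted(tuple(sorted(pair)) for pair in reference_disulfides)
    elif mode == "reduce":
        keep = []
    else:
        raise ValueError("mode must be 'automatic' or 'reduce'")

    # Drop whatever SG-SG bonds the model's own geometry produced.
    other_bonds = [
        (a, b) for a, b in model_bonds
        if not (a[1] == "SG" and b[1] == "SG")
    ]
    # Add back only the disulfides allowed by the policy.
    return other_bonds + [((i, "SG"), (j, "SG")) for i, j in keep]
```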
During the implicit solvent MD stage, models should be given the same protonation states as a reference model (the model with the highest template-target sequence identity).
However, the implicit solvent MD models for the TK target FAK2_HUMAN_D0 have different topologies:
I'm looking into this now, but it's not immediately clear why this happened. They were generated during the same ensembler run, so they really should have been using the same residue variants.