openmm / openmm-ml

High level API for using machine learning models in OpenMM simulations

Generate Protein - Ligand System #81

Open entropybit opened 1 month ago

entropybit commented 1 month ago

Hi,

What would be a clean way to create a protein-ligand system in which the ligand is parametrized with one of the MLFFs?

The example shows how to create a mixed system, but this requires already having a topology and a system parametrized without the MLFF. Currently I do this by first creating a conventional system that combines one of the small-molecule FFs with, e.g., Amber, so the protein is parametrized with Amber and the ligand with the small-molecule FF. I then pass this system and topology, together with a list of the ligand's atom indices, to the mixed-system functionality in openmm-ml, but this seems very messy.

Is there a better way, or am I maybe doing something horribly wrong in this approach to start with?

JMorado commented 1 month ago

@entropybit, you need a topology plus an OpenMM system at the MM level in order to create a mixed ML/MM system using openmm-ml. Also, since openmm-ml currently only supports mechanical embedding schemes, remember that the small molecule FF ought to have non-bonded parameters consistent with whatever protein FF you are using. Unless I'm missing something here, I think what you're doing is logical.
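For reference, the workflow described above can be sketched roughly as follows. This is a minimal sketch, not the only way to do it: the residue name `LIG`, the model name `ani2x`, and the helper names `ligand_atom_indices` and `build_mixed_system` are assumptions made for illustration. The `topology` and `mm_system` arguments are the solvated topology and the conventional MM system that already includes small-molecule FF parameters for the ligand.

```python
def ligand_atom_indices(topology, ligand_resname="LIG"):
    """Collect the indices of all atoms in residues named `ligand_resname`.

    `topology` is expected to behave like an openmm.app.Topology.
    The residue name "LIG" is an assumption; use whatever name your input uses.
    """
    return [
        atom.index
        for atom in topology.atoms()
        if atom.residue.name == ligand_resname
    ]


def build_mixed_system(topology, mm_system, ml_atoms, model="ani2x"):
    """Wrap an existing MM system so that `ml_atoms` are treated with an ML potential.

    `mm_system` must already be fully parametrized at the MM level,
    ligand included. The model name "ani2x" is only an example.
    """
    # Imported here so the index helper above can be used without openmm-ml.
    from openmmml import MLPotential

    potential = MLPotential(model)
    return potential.createMixedSystem(topology, mm_system, ml_atoms)
```

With the solvated topology and system from a standard setup, usage would look like `ml_atoms = ligand_atom_indices(solvated_topology)` followed by `build_mixed_system(solvated_topology, solvated_system, ml_atoms)`.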

When you say this procedure seems very messy, what exactly do you mean? Is there any way you envision it could be made better?

entropybit commented 1 month ago

Well, it would be nice to be able to create the mixed system directly with any protein FF and the MLFF, similar to the way a small-molecule force field is added:

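from openmm import app
from openmm.app import ForceField
from openmmforcefields.generators import EspalomaTemplateGenerator

# Template generator that parametrizes the ligand with Espaloma
espaloma = EspalomaTemplateGenerator(molecules=lig_mol, forcefield='espaloma-0.3.2')
...
forcefield = ForceField('amber/protein.ff14SB.xml', 'amber/tip3p_standard.xml', 'amber/tip3p_HFE_multivalent.xml')
forcefield.registerTemplateGenerator(espaloma)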

modeller = app.Modeller(complex_topology, complex_positions)
modeller.addSolvent(
    forcefield, model=config.water_model, 
    padding=config.solvent_padding, 
    ionicStrength=config.ionic_strength,
    residueTemplates=forcefield._templates
)
# Get topology and position
solvated_topology = modeller.getTopology()
solvated_positions = modeller.getPositions()
# Create system
solvated_system = forcefield.createSystem(
    solvated_topology,
    removeCMMotion = True, 
    ewaldErrorTolerance = config.pme_tol, 
    constraints = config.constraints, 
    rigidWater = True, 
    hydrogenMass = config.hmass,
    nonbondedMethod = config.nonbonded_method
)

I guess it's not easily possible to also define a TemplateGenerator for the MLFF?

peastman commented 1 month ago

In the mixed system, the internal energy of the ligand is computed with the ML potential, but the interaction between the ligand and everything else is computed with the conventional force field. That's why you need to parameterize everything, including the ligand, with the force field. Or am I misunderstanding what you're asking?
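Written out, the energy partition described here looks roughly like the following. This is a sketch of the general mechanical-embedding idea, not a statement of how openmm-ml implements it internally; the symbols (L for the ligand, P for the protein and solvent environment) are introduced here for illustration only.

```latex
E_{\text{mixed}}
  = E_{\text{MM}}(P) + E_{\text{MM}}(P \leftrightarrow L) + E_{\text{ML}}(L)
  = E_{\text{MM}}(P \cup L) - E_{\text{MM}}(L) + E_{\text{ML}}(L)
```

The middle term, the protein-ligand interaction, only exists at the MM level, which is why the ligand still needs conventional force field parameters.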

entropybit commented 1 month ago

@peastman No, you are right; I just did not think about this. Does it even make sense to do this?

It would be better to also use the MLFF for the interactions between ligand and protein. But I can see that this is not easily done, unless the ML part is used to generate parameters in the language of a classical FF, as Espaloma does. Mixing rules cannot really be applied to the MLFF part, right?

peastman commented 1 month ago

Right. Different ML potentials work in different ways, but most often they just throw all the atom positions into a neural network and produce an energy for each atom at the end. There's no way to separate that energy into different components, such as the part that came from the interaction between two molecules.
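To illustrate the point, here is a toy sketch in plain Python. The function `atomic_energies` is a made-up stand-in for an ML potential (it is not any real model): each atom's energy is a nonlinear function of a descriptor built from the distances to all other atoms. Adding a "protein" atom shifts the per-atom energies of the ligand atoms themselves, so there is no separate term that can be labelled as the ligand-protein interaction.

```python
import math


def atomic_energies(positions):
    """Toy stand-in for an ML potential: returns one energy per atom.

    Each atom's energy is tanh of a descriptor summed over all other atoms.
    This is purely illustrative and not a real model.
    """
    energies = []
    for i, p in enumerate(positions):
        descriptor = sum(
            math.exp(-math.dist(p, q))
            for j, q in enumerate(positions)
            if j != i
        )
        energies.append(math.tanh(descriptor))
    return energies


# Hypothetical coordinates in arbitrary units
ligand = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
protein = [(3.0, 0.0, 0.0)]

# Per-atom energies of the ligand atoms, alone and inside the complex
isolated = atomic_energies(ligand)
in_complex = atomic_energies(ligand + protein)[: len(ligand)]

# The ligand's own atomic energies change when the protein atom is present
shifts = [c - i for c, i in zip(in_complex, isolated)]
print(shifts)
```

Because the shift is spread nonlinearly over every atom's energy, classical constructs such as pairwise mixing rules have nothing to act on.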