MobleyLab / benchmarksets

Benchmark sets for binding free energy calculations: Perpetual review paper, discussion, datasets, and standards
BSD 3-Clause "New" or "Revised" License
42 stars 16 forks source link

Errors using cyclic host `mol2` files with non-unique atom names #64

Closed slochower closed 6 years ago

slochower commented 6 years ago

I'm dropping a note here to mention that reading in cyclic molecules with non-unique atom names can cause an error with ff.createSystem as detailed here. I'll update this issue once we decide on a robust solution.

andrrizzi commented 6 years ago

Just adding a note that this issue is related to #60.

slochower commented 6 years ago

Right, thanks. Meant to tag that issue as well.

davidlmobley commented 6 years ago

Note that I think if you have such a molecule, you should be able to generate unique atom names by running OETriposAtomNames on it. This is not really a solution (though perhaps it's a work-around).

Probably the best solution is the new Topology object we're starting to implement, which will allow us to not depend on atom/residue names in constructing the topology and the System (cc @jchodera ) I believe.

slochower commented 6 years ago

Note that I think if you have such a molecule, you should be able to generate unique atom names by running OETriposAtomNames on it.

No such luck. I had actually omitted OETriposAtomNames(mol) previously and only ran OEDeterminConnectivity(mol) and OEPerceiveBondOrders(mol), but I just checked with OETriposAtomNames(mol) and the error persists. HEre is the snippet:

ifs = oechem.oemolistream(path + 'bcd-no-conect.xyz')
for mol in ifs.GetOEGraphMols():
    oechem.OEDetermineConnectivity(mol)
    oechem.OEPerceiveBondOrders(mol)
    # Replace existing cyclodextrin
    oechem.OETriposAtomNames(mol)
    mols[0] = oechem.OEMol(mol)

the best solution is the new Topology object we're starting to implement

Based on OpenMM topology or something else?

davidlmobley commented 6 years ago

Check out #86 ; feel free to comment. It would be a new Topology object. (The OpenMM one is limited.) This would fix several issues:

slochower commented 6 years ago

I want to drop a note to say that I can build a host from SMILES and then write AMBER topology. There are still a few other thorns, but before I forgot, I wanted to say that ParmEd can successfully write prmtop files for smirnoff99Frosst from an OpenMM system built from SMILES for cyclic molecules whereas it fails when the starting point is a mol2 file. I can provide additional details.

davidlmobley commented 6 years ago

@slochower - it's excellent that you can get it to work, though odd that it fails from a mol2 file.

Have you checked what happens if you write your molecule built from SMILES to a mol2 file, read it in again, and then try? If this works, it would shed light on what the problem is (as you'd be able to compare the mol2 file that works with the one that doesn't) and if it doesn't work, it would highlight that there's an issue with something that happens going to mol2 format.

Sorry you had so much headache here. It's obvious that cyclic hosts are a challenging test case for some of our machinery (and by "our" I don't mean just ours, but the field's, including ours).

slochower commented 6 years ago

Have you checked what happens if you write your molecule built from SMILES to a mol2 file, read it in again, and then try?

That's a great question. First, if I write out a mol2 file after using OE to generate AM1-BCC charges, I get multiple conformations in the mol2. I haven't had time to look into it yet -- I got 13 conformations in the mol2, which is odd because I generated 200 conformations for the charges. But if I just build from SMILES, write out a mol2 and try to use those mol2 files to setup the OpenMM system [...] and try to export a pmrtop from ParmEd, I do run into the same exact ParmEd error.

slochower commented 6 years ago

Gist (and PDB file is same as previous attempts, which can be found here)

davidlmobley commented 6 years ago

Regarding number of conformations -- did you generate 200 conformations for the charges, or ask it to generate up to 200 conformations?

REgarding mol2 though, it sounds like it's some sort of processing that happens going to mol2 format (in order to conform to format specs, typically) which is causing problems here. Interesting. Probably I should bring this up with OpenEye support as they'd likely be able to help track down what's different.

slochower commented 6 years ago

I'm going to close this issue here because I think we'll address this in a more central location when we discuss other potential force field conversion issues and/or deal with an updated Topology.