Open pschmidtke opened 6 years ago
Morning Peter,
I think I know why you get an exception with CCCCn1c(Cc2cc(OC)c(OC)c(OC)c2Cl)nc2c(N)ncnc12 and the new offxml file. I think it actually has to do with the aromaticity model.
Quoting the problematic region you saw:
Topological atom sets not assigned parameters:
(4, 5) : 0 mymol 4 0 mymol 5
(4, 27) : 0 mymol 4 0 mymol 27
and this figure:
This is the hetero-ring which the OpenEye MDL aromaticity model does not recognize as aromatic, while RDKit (I guess you are using 2017.03 or earlier?) does treat it as aromatic. Sifting through the new offxml file, there do not seem to be any bond SMIRKS for tertiary aromatic nitrogens, hence the error.
I have been looking at the newest RDKit version, which implements the MDL aromaticity model, but I had some issues which I reported here. Hopefully these have been resolved in the most recent beta version, RDKit 2018.03. A conda-installable build came out last week if you wish to give it a try (conda install -c rdkit/label/beta rdkit), although I don't think it will directly solve the problem. I have a WIP version on my laptop which I can try to work on a bit more. Unfortunately the timing is bad for me, as I am leaving on holiday later today for two weeks. I can work on it between travels but I cannot promise anything, sorry.
@davidlmobley
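For anyone following along, the atom sets in the quoted message can be extracted with a few lines of Python to see which atom they have in common. This is only a sketch: the helper name `parse_unassigned_atom_sets` is hypothetical, and the regular expression assumes each unassigned set appears as a parenthesized index tuple followed by a colon, as in the message quoted above.

```python
import re
from typing import List, Tuple

# Assumption: every unassigned set is printed as "(i, j, ...) :" in the message.
_ATOM_SET = re.compile(r"\((\d+(?:\s*,\s*\d+)*)\)\s*:")


def parse_unassigned_atom_sets(message: str) -> List[Tuple[int, ...]]:
    """Return the tuples of topology atom indices named in the message."""
    return [
        tuple(int(index) for index in match.group(1).split(","))
        for match in _ATOM_SET.finditer(message)
    ]


# The message quoted in this thread, joined onto one line.
message = (
    "Topological atom sets not assigned parameters: "
    "(4, 5) : 0 mymol 4 0 mymol 5 "
    "(4, 27) : 0 mymol 4 0 mymol 27"
)

atom_sets = parse_unassigned_atom_sets(message)

# Atoms that occur in every unassigned set are the likely culprits.
shared = set.intersection(*(set(s) for s in atom_sets)) if atom_sets else set()

print(atom_sets)       # [(4, 5), (4, 27)]
print(sorted(shared))  # [4]
```

If the topology indices follow the SMILES atom order (0-based), atom 4 is the ring nitrogen carrying the butyl chain, which matches the tertiary aromatic nitrogen discussed above.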
Thanks @hjuinj . Seems logical, as I observed the weird geometries (with the previous ffxml) mostly on tertiary aromatic nitrogens, but also on a few other atoms (usually nitrogens). No hurry, I also have real work ;) I can continue to set up my things on systems without these types of atoms.
Thanks for this, @pschmidtke . I'll revisit in more detail soon, but just to respond on the SMILES/aromaticity model: if you're using the older version of RDKit, the aromaticity model definitely means something very different and you'll get a lot of discrepancies in substructure matches. Shuzhe (@hjuinj ) put a great deal of work into (a) tracking these down, and (b) working with the RDKit developers to get a comparable model put into place for the 2018.03 beta. I think ultimately he was able to get all the energies (cross-comparing between OE and RDKit implementations) to agree, but that presumably requires the WIP stuff he has on his laptop. :)
We're glad it looks like this will be able to be useful to you.
One thing which might or might not be useful to you is that we've actually done all the hydration free energy calculations in the FreeSolv database again with SMIRNOFF, using water, obviously (see our preprint on biorxiv), and scripts for this are available online. You may find this one useful: https://github.com/MobleyLab/SMIRNOFF_paper_code/blob/master/FreeSolv/scripts/create_input_files.py -- at least, it shows one particular workflow that will successfully get you OpenMM input files for a set of "solute in water" systems. It DOES rely on (open source) packmol for adding water molecules, which may not be as good as adding things to a pre-equilibrated box. But it does also work reliably/robustly (642 hydration free energy calculations run, a couple of times!). :)
Hey, thanks for the info. As I said, no pressure @hjuinj ;) For the time being I can continue my integration work with the ligands I can parametrize, so that should be good for a proof of concept. The solvation example will definitely be of interest; I'll check that out and ping you if I run into issues. Thanks again for your guidance.
@pschmidtke : Is this issue still relevant now that we have the toolkit 0.2 release featuring RDKit support?
Hi,
as requested by @davidlmobley here (https://github.com/openforcefield/openforcefield/issues/28#issuecomment-382541238), I'll stop hijacking other issues and create a dedicated one to outline what I'm trying to achieve using openforcefield.
As a start I want to set up a proper ligand protein simulation using only the rdkit integration started by @hjuinj . So the current short term things I'm testing are:
In the mid-term, if this first test integration works, I'll also integrate protein chunk creation + harmonic restraints into the simulation as described here: https://www.nature.com/articles/nchem.2660. Most of this is already done but not yet tested with parmed/openforcefield prepared systems.
In the long run, I'd be happy to simulate systems prepared only via openforcefield (including the protein), so this would require a translation of the protein force fields to offxml.
I'll push my work for now here (https://github.com/pschmidtke/openforcefield) into the pschmidtke_rdk branch, as I don't have access to this repo here. Note that I also do not have an openeye license and that the whole point of this integration is to provide a fully free version of our dynamic undocking approach (I currently rely on MOE to parametrize the ligands using.....parm@frosst ;) ).
Also, I'm a noob in openmm, am still discovering a lot (and I like it a lot) and will probably sometimes ask noob questions, like where is what or how to do the most basic thing in the world, sorry for that :)
I'll post issues related to the offxml or other openforcefield related things here.
Thanks in advance for your help!