rafwiewiora closed 8 years ago
What does the script need to run and how is it invoked? I can try to add a travis branch to enable testing of this code.
It needs the files/ folder as committed, plus AmberTools or Amber for the FF files and tleap. $AMBERHOME must be set, and tleap must be available from the OS command line (i.e. on PATH).
Executable by python conversion_script.py.
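A minimal sketch of a preflight check for the requirements listed above, assuming only what is stated in this thread (AMBERHOME set, tleap on PATH). The function name check_amber_environment and its messages are hypothetical and not part of conversion_script.py.

```python
import os
import shutil


def check_amber_environment(env=None, which=shutil.which):
    """Return a list of problems; an empty list means the environment looks ready.

    `env` and `which` are injectable so the check can be exercised without Amber.
    """
    env = os.environ if env is None else env
    problems = []

    amberhome = env.get("AMBERHOME")
    if not amberhome:
        problems.append("AMBERHOME is not set")
    elif not os.path.isdir(amberhome):
        problems.append(f"AMBERHOME is not a directory: {amberhome}")

    # tleap must be callable from the command line, i.e. discoverable on PATH.
    if which("tleap") is None:
        problems.append("tleap was not found on PATH")

    return problems
```

Calling check_amber_environment() with no arguments inspects the real environment; a CI job could fail early if the returned list is non-empty.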
FWIW, I use ambermini with the ParmEd testing suite and go ahead and just set AMBERHOME to the miniconda prefix -- see here
I like this approach because this is how Amber behaves "in the wild" (which means it's more closely testing the scenario where Amber is installed by the user). But ParmEd is designed to work alongside Amber, and it interacts more closely with more components of Amber.
Provenance layout is now updated to its final form. I also added a Test key to the YAML to indicate which tests should be performed on each leaprc conversion.
Can we have ParmEd check if ambermini is installed and get AMBERHOME from there? There is no reason this wouldn't be sufficient for our purposes.
> Can we have ParmEd check if ambermini is installed and get AMBERHOME from there? There is no reason this wouldn't be sufficient for our purposes.
Sorry, I deleted that comment the moment I wrote it and realized I was wrong! I simply added parmed.amber.AMBERHOME = AMBERHOME and we're all good.
Ok, we have most forcefields converted at this point.
Note that I needed to add try/except AssertionError handling to allow the impropers tolerance to go up to 2e-2; this was needed for ff03ua to pass.
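The fallback described above could look roughly like the following. This is a sketch, not the code in the conversion script: the helper names are hypothetical, and the strict default of 1e-5 is an assumed placeholder (only the relaxed value 2e-2 comes from this thread).

```python
def assert_close(reference, value, rel_tol):
    """Raise AssertionError if value deviates from reference by more than rel_tol."""
    scale = abs(reference) or 1.0  # avoid dividing by zero for a zero reference
    rel_err = abs(value - reference) / scale
    assert rel_err <= rel_tol, f"relative error {rel_err:.2e} exceeds {rel_tol:.0e}"


def check_impropers(reference, value, strict_tol=1e-5, relaxed_tol=2e-2):
    """Try the strict tolerance first, then fall back to the relaxed one.

    Returns the tolerance that passed so the caller can record which was needed.
    strict_tol is an assumed placeholder; relaxed_tol is the 2e-2 from the thread.
    """
    try:
        assert_close(reference, value, strict_tol)
        return strict_tol
    except AssertionError:
        # Second attempt; if this also fails, the AssertionError propagates.
        assert_close(reference, value, relaxed_tol)
        return relaxed_tol
```

Returning the tolerance that passed makes it easy to note in the validation output which force fields only pass under the relaxed limit.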
Working on nucleic acid tests now.
Can we capture the errors from the testing into an output file? This could be very useful in documenting the validation procedure.
Definitely! I'm going to add a log file functionality.
Something computer-readable might also be good. We could make a table of the validation.
I like that. Maybe write out to a YAML then?
Whatever is convenient. CSV, YAML, XML, pickle...
Alright!
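One way to get a computer-readable validation table, sketched with the standard-library csv module so it has no extra dependencies. The column names and function names here are hypothetical choices for illustration, not the format the script ended up using; YAML would work the same way with a YAML writer.

```python
import csv
import io

# Hypothetical column layout for one validation result per row.
FIELDS = ["leaprc", "system", "term", "rel_error", "tolerance", "passed"]


def write_validation_table(rows, handle):
    """Write an iterable of result dicts (keyed by FIELDS) to an open text handle."""
    writer = csv.DictWriter(handle, fieldnames=FIELDS)
    writer.writeheader()
    for row in rows:
        writer.writerow(row)


def render_validation_table(rows):
    """Return the CSV text for the given rows, e.g. for logging or quick inspection."""
    buffer = io.StringIO(newline="")
    write_validation_table(rows, buffer)
    return buffer.getvalue()
```

To write to disk, open the file with newline="" (as the csv documentation recommends) and pass the handle to write_validation_table.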
Quick question: which is preferable for energy validation:
For now I have gone with the second option, because it seems more reproducible. But an advantage of the first option is not having to worry about getting the hydrogens right for all FFs, although the only case where this has mattered is the united-atom forcefield. With option one I can use the same PDB for explicit-atom and united-atom FFs; with option two I need separate PDBs for united-atom.
What do you think?
You're interested in UA FFs from Amber? Nobody has worked on those in >a decade...
I'd personally go with hydrogens and then strip them out using ParmEd for UA FFs where they're not needed.
> You're interested in UA FFs from Amber? Nobody has worked on those in >a decade...
Honestly it takes me less time to add a few lines to the script and convert everything than to worry about what's being used and what is not at this stage. You guys can decide later which FFs to PR into OpenMM, but I want the capability to convert everything (within reason, I'm not touching those GLYCAMs).
> I'd personally go with hydrogens and then strip them out using ParmEd for UA FFs where they're not needed.
Thanks!
Fair enough :smiley:
Another quick question! I'm using 4RZN for DNA validation and cleaned it up with PDBFixer. It works fine except for older FFs using all_nucleic94.lib, whose atom naming differs slightly from the newer nucleic libs, from what comes with the downloaded PDB, and from what PDBFixer and OpenMM output.
Is there a tool out there to do the new names ---> old names conversion?
> Is there a tool out there to do the new names ---> old names conversion?
Not that I know of... But if you download 4RZN from the PDB, it has PDB 3 naming (which will work with nucleic10.lib, but not all_nucleic94.lib).
You could probably generate a mapping yourself just by comparing those two lib files, actually...
So I just reversed the old --> new mapping from leaprc.ff14SB to new --> old and added it:
addPdbResMap {
{ 0 "DG" "DG5" } { 1 "DG" "DG3" }
{ 0 "DA" "DA5" } { 1 "DA" "DA3" }
{ 0 "DC" "DC5" } { 1 "DC" "DC3" }
{ 0 "DT" "DT5" } { 1 "DT" "DT3" }
}
addPdbAtomMap {
{ "H1'" "H1*" }
{ "H2'" "H2'1" }
{ "H2''" "H2'2" }
{ "H3'" "H3*" }
{ "H4'" "H4*" }
{ "H5'" "H5'1" }
{ "H5''" "H5'2" }
{ "HO2'" "HO'2" }
{ "HO5'" "H5T" }
{ "HO3'" "H3T" }
{ "OP1" "O1P" }
{ "OP2" "O2P" }
}
to the LEaP script before the source call, so this can get overwritten by the map calls in the new leaprcs. Works well!
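The reversal itself is mechanical, so it can be generated rather than typed by hand. A small sketch, assuming the old --> new pairs have already been copied out of the leaprc; the three pairs below are just an excerpt inferred from the map above, and the function names are hypothetical.

```python
# Excerpt of old --> new atom name pairs (inferred from the reversed map above).
OLD_TO_NEW = [
    ("H1*", "H1'"),
    ("H2'1", "H2'"),
    ("O1P", "OP1"),
]


def reverse_map(pairs):
    """Swap each (old, new) pair into (new, old)."""
    return [(new, old) for old, new in pairs]


def to_leap_atom_map(pairs):
    """Format (from_name, to_name) pairs as a LEaP addPdbAtomMap command."""
    lines = ["addPdbAtomMap {"]
    lines += [f'  {{ "{src}" "{dst}" }}' for src, dst in pairs]
    lines.append("}")
    return "\n".join(lines)


if __name__ == "__main__":
    print(to_leap_atom_map(reverse_map(OLD_TO_NEW)))
```

The printed text can be pasted into the LEaP input ahead of the source call, in the same position as the hand-written map.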
Interestingly, some of these leaprcs add the terminal mappings (e.g. 0 G --> DG5, 1 G --> DG3) on the assumption that any unspecified nucleotide is DNA, but they forget about 0 DG --> DG5, 1 DG --> DG3, etc.
DNA and RNA testing added; improper testing for those is turned off for now, pending https://github.com/choderalab/openmm/issues/9
Closing to open a fresh one.
Tagging @jchodera @swails @peastman
Here's what I've got so far: the conversion script, with testing on protein only, on two systems. It works on a selection of leaprcs (the uncommented ones in the YAML), and the output is in ffxml/ for you to check out.
Working on:
How does it look?