Becksteinlab / MDPOW

Calculation of water/solvent partition coefficients with Gromacs.
https://mdpow.readthedocs.io
GNU General Public License v3.0
25 stars 10 forks source link

add M24 water model for testing #46

Closed orbeckst closed 8 years ago

orbeckst commented 8 years ago

Aim

For testing purposes we should add the M24 water model, introduced by Shirts and Pande1.

Background

Shirts1 introduced a range of proof-of-concept TIP3P-like water models. He changed the epsilon and sigma parameters and obtained much better hydration free energies for amino acid sidechain analogs than with any of the regular models. The one termed M24 performed best. These M* models are not meant for production simulations but are proof-of-concept that fixed charge models and in particular water models still have a lot of room in the parametrization.

M24 water model

Need to add various entries to the bundled force field files

iorga commented 8 years ago

Without noticing the header of this issue (thanks @orbeckst for the detailed description) I did the same modifications to my local install of Gromacs.

iorga@p54: > cat /sge/packs/packs_centos/opt/gromacs-5.1.3-openmpi-2.0.0-fftw-auto-centos6-pipin/share/gromacs/top/oplsaa.ff/watermodels.dat 
tip4p   TIP4P  TIP 4-point, recommended
tip4pew TIP4PEW TIP 4-point with Ewald
tip3p   TIP3P  TIP 3-point
tip3p-m24   TIP3P-M24  TIP 3-point with M24 modification
tip5p   TIP5P  TIP 5-point (see http://redmine.gromacs.org/issues/1348 for issues)
tip5pe  TIP5P  TIP 5-point improved for Ewald sums
spc     SPC    simple point charge
spce    SPC/E  extended simple point charge
iorga@p54: > cat /sge/packs/packs_centos/opt/gromacs-5.1.3-openmpi-2.0.0-fftw-auto-centos6-pipin/share/gromacs/top/oplsaa.ff/tip3p-m24.itp 
[ atomtypes ]
; name  bond_type    mass    charge   ptype          sigma      epsilon
 opls_111   OW  8      9.95140    -0.834       A    3.11100e-01  1.00440e-00

[ moleculetype ]
; molname       nrexcl
SOL             2

[ atoms ]
; id    at type res nr  residu name     at name         cg nr   charge
1     opls_111  1       SOL              OW             1       -0.834
2     opls_112  1       SOL             HW1             1        0.417
3     opls_112  1       SOL             HW2             1        0.417

#ifndef FLEXIBLE
[ settles ]
; i     j       funct   length
1       1       0.09572 0.15139

[ exclusions ]
1       2       3
2       1       3
3       1       2
#else
[ bonds ]
; i     j       funct   length  force.c.
1       2       1       0.09572 502416.0 0.09572        502416.0 
1       3       1       0.09572 502416.0 0.09572        502416.0 

[ angles ]
; i     j       k       funct   angle   force.c.
2       1       3       1       104.52  628.02  104.52  628.02  
#endif

However, when executing the command gmx pdb2gmx -f spc900.gro -o tip3p-m24-900.gro -p tip3p-m24-900.top -ff oplsaa -water tip3p-m24 I've got the following error:

-------------------------------------------------------
Program:     gmx pdb2gmx, VERSION 5.1.3
Source file: src/gromacs/commandline/cmdlineparser.cpp (line 234)
Function:    void gmx::CommandLineParser::parse(int*, char**)

Error in user input:
Invalid command-line options
  In command-line option -water
    Invalid value: tip3p-m24

So it seems that the accepted values of the -water option are not taken from watermodels.dat, but are hard coded. I couldn't find explicitly in the code these values, so probably they are taken from watermodels.dat at the compilation time.

We can conclude that adding a new water model is not easy, it is not generalizable (it needs a specific compilation). I've made a number of trials and the easiest was to temporarily replace the content of tip3p.itp with those of tip3p-m24.itp (but keeping the same name that is recognized by Gromacs). Now the sims with the TIP3P-M24 water model on the dataset92 are running and I will keep you updated with the results.

If this water model gives good results, we might want to use it for our free energy calculations for ligands, but in this case it should be implemented not as a water model, but as a new solvent like octanol and cyclohexane. I know that this is not convenient, but I don't see another solution due to the above-mentioned limitation of Gromacs.

orbeckst commented 8 years ago

According to the Gromacs 5.1.2 code in pdb2top.h (and pdb2top.cpp), it should be possible to select the water model via watermodels.dat:

void choose_watermodel(const char *wmsel, const char *ffdir,
                       char **watermodel);
/* Choose, possibly interactively, which water model to include,
 * based on the wmsel command line option choice and watermodels.dat
 * in ffdir.
 */

It definitely reads the "watermodels.dat" file.

Not sure why it's not working for you.

orbeckst commented 8 years ago

Actually, the choose_watermodel() function is not really important: it just makes the model available for manual selection (now reading pdb2gmx.c more carefully...).

Resname selection

ITP file

orbeckst commented 8 years ago

@iorga I think the problem in your https://github.com/Becksteinlab/MDPOW/issues/46#issuecomment-239363478 is rather that you have a dash in your name and the stupid Gromacs commandline parser cannot deal with it. Try replacing with an underscore... or better, just name the water model "m24" (my preferred solution).

iorga commented 8 years ago

No, I've tried this just after posting the comment, by removing the dash in the .itp file name and in 'watermodels.dat' ("tip3pm24") and the result was the same. I will keep trying...

orbeckst commented 8 years ago

On 11 Aug, 2016, at 23:42, Bogdan I. Iorga notifications@github.com wrote:

No, I've tried this just after posting the comment, by removing the dash in the .itp file name and in 'watermodels.dat' ("tip3pm24") and the result was the same. I will keep trying...

Weird. Will need to look at the Gromacs code.

Btw, I don't think that MDPOW actually uses pdb2gmx anywhere so it shouldn't be a problem for us.

Oliver Beckstein * orbeckst@gmx.net skype: orbeckst * orbeckst@gmail.com

iorga commented 8 years ago

I have also tried with -water tip4pew, which is natively in watermodels.dat and I've got the same error.

When added the m24 option in share/top/oplsaa.ff/watermodels.dat, src/programs/completion/gmx-completion.bash and src/gromacs/gmxpreprocess/pdb2gmx.c, and m24.itp in share/gromacs/top/oplsaa.ff/ then recompiled Gromacs, the command gmx pdb2gmx -water m24 works without error, as expected.

Another argument for the necessity of compilation with the new option is in the file src/gromacs/gmxpreprocess/pdb2gmx.c:

    /* Command line arguments must be static */
...
    static const char *watstr[]       = { NULL, "select", "none", "spc", "spce", "tip3p", "tip4p", "tip5p", NULL };

@orbeckst, I think that now this part is quite clear. I will try now to implement m24 directly in MDPOW, as you suggested.

iorga commented 8 years ago

I made the modifications of the code on a new branch (TIP3P-M24) and committed (aeda040). I am waiting for the M24 sims of the dataset92 to be finished before to install the new version and test these modifications.

orbeckst commented 8 years ago

You can make it a pull request and then it can be easily evaluated and amended, if necessary.

Switch to the new TIP3P-M24 branch, then select New pull request (against ''develop''), add some explanations, reference this issue #46, and submit.

iorga commented 8 years ago

It seems that we were too quick in adding the m24 water without a proper validation in MDPOW. The m24.itp file works for me with pdb2gmx, but in MDPOW it gives an error about the order of atomtypes. Probably the order of include statements in MDPOW's .top file is not the same. I will check.

On the other hand, the simulations that were intended to be with m24 are in reality with tip3p. When I started them I didn't know about the MDPOW's mechanism of including .itp files. Anyway, the reproducibility is extremely good for the whole dataset92:

iorga commented 8 years ago

The issue was fixed by inverting the order of includes for compound and solvent .itp files in the system.top template file. Now it is confirmed that the m24 water model works with the compounds from the dataset92.

iorga commented 8 years ago

Initial validation of the TIP3P-M24 water model (trying to reproduce the density values reported by Shirts and Pande)

gmx solvate -cs spc216.gro -box 3.1 3.1 3.1 -maxsol 900 -o spc900.gro
gmx pdb2gmx -f spc900.gro -o tip3p900.gro -p tip3p900.top -ff oplsaa -water tip3p
gmx grompp -f tip3p900_NPT_opls.mdp -c tip3p900.gro -p tip3p900.top -o tip3p900.tpr
gmx mdrun -deffnm tip3p900 -ntomp 16 &
echo 15 | gmx energy -f tip3p900.edr -o tip3p900_density.xvg > density_tip3p_5ns.txt
echo 15 | gmx energy -f tip3p900.edr -o tip3p900_density.xvg -b 1000 > density_tip3p_last4ns.txt

The box dimensions are the smallest that can accommodate 900 water molecules. The initial density is about 0.9 g/ml, which converges quickly (7 ps) to about 0.98 g/ml.

The average density for the whole simulation is:

cat density_tip3p_5ns.txt

Statistics over 2500001 steps [ 0.0000 through 5000.0000 ps ], 1 data sets
All statistics are over 50001 points

Energy                      Average   Err.Est.       RMSD  Tot-Drift
-------------------------------------------------------------------------------
Density                     984.617       0.14    5.81243  -0.110855  (kg/m^3)

And for the last 4 ns:

cat density_tip3p_5ns.txt

Statistics over 2500001 steps [ 0.0000 through 5000.0000 ps ], 1 data sets
All statistics are over 50001 points

Energy                      Average   Err.Est.       RMSD  Tot-Drift
-------------------------------------------------------------------------------
Density                     984.617       0.14    5.81243  -0.110855  (kg/m^3)
iorga@pipin:/gem/iorga/projects/sampl5_sims/TIP3P-M24/TIP3P > cat density_tip3p_last4ns.txt

Statistics over 2000001 steps [ 1000.0000 through 5000.0000 ps ], 1 data sets
All statistics are over 40001 points

Energy                      Average   Err.Est.       RMSD  Tot-Drift
-------------------------------------------------------------------------------
Density                     984.682       0.12    5.34347   -0.60855  (kg/m^3)

These values compare well with those reported by Shirts and Pande for the same system, 0.9859 g/ml.

gmx solvate -cs BLK216.gro -box 3.1 3.1 3.1 -maxsol 900 -o BLK900.gro
# generate manually BLK900.top from tip3p900.top
gmx grompp -f BLK900_NPT_opls.mdp -c BLK900.gro -p BLK900.top -o BLK900.tpr -maxwarn 1
gmx mdrun -deffnm BLK900 -ntomp 16 &
echo 15 | gmx energy -f BLK900.edr -o BLK900_density.xvg > density_m24_5ns.txt
echo 15 | gmx energy -f BLK900.edr -o BLK900_density.xvg -b 1000 > density_m24_last4ns.txt

The average density for the whole simulation is:

cat density_m24_5ns.txt 

Statistics over 2500001 steps [ 0.0000 through 5000.0000 ps ], 1 data sets
All statistics are over 50001 points

Energy                      Average   Err.Est.       RMSD  Tot-Drift
-------------------------------------------------------------------------------
Density                     995.963       0.17     5.6356  0.0244999  (kg/m^3)

And for the last 4 ns:

cat density_m24_last4ns.txt 

Statistics over 2000001 steps [ 1000.0000 through 5000.0000 ps ], 1 data sets
All statistics are over 40001 points

Energy                      Average   Err.Est.       RMSD  Tot-Drift
-------------------------------------------------------------------------------
Density                     996.066       0.16    4.69872  -0.826938  (kg/m^3)

These values also compare quite well with those reported by Shirts and Pande for the same system, 0.9976 g/ml.

iorga commented 8 years ago

Issue #46 solved via 1af086e and closed.

orbeckst commented 8 years ago

See my comments:

We should move [ atomtypes ] section out of the m24.itp files and integrate properly in the forcefield. Then leave the canonical "compound / water model / ions" order in the template file.

orbeckst commented 8 years ago

Basically, add a new atomtype for the oxygen, as in my original check list at the top of this issue.

iorga commented 8 years ago

I made the changes as suggested (see commit d42b26a). I will wait until I can validate this new modification on at least one compound, before making a new pull request.

iorga commented 8 years ago

I can confirm now that the new modification works well in a standard simulation with the m24 water.

@orbeckst, we can merge this in the develop branch whenever you want.

orbeckst commented 8 years ago

Which PR is this? Just assign the PR to me, I review, and then merge.

iorga commented 8 years ago

I will wait for PR #56 (revert previous changes) to be completed before introducing a new pull request with the latest changes from here (atom name opls_111_m24).