ParmEd / ParmEd

Parameter/topology editor and molecular simulator
https://parmed.github.io/ParmEd
394 stars 148 forks source link

Gromacs to Amber #214

Closed gchevrot closed 9 years ago

gchevrot commented 9 years ago

I tried to convert a Gromacs topology in the Amber format. (see convert_gmx2amber.ipynb ) The file _amberparm.py raises an exception (line 1935) because rref is not equal to pair.type.rmin. I cannot figure out how I can resolve this problem?

swails commented 9 years ago

Amber and Gromacs handle nonbonded exceptions (i.e., particular pairs of atoms that behave differently than "normal" pairs of atoms with the same two types would behave). In the Amber force fields, the exceptions are very simple and occur only between 1-4 atom pairs (i.e., atoms separated by 3 bonds). The van der Waals interactions are divided by 2 and the electrostatic interactions are divided by 1.2.

Gromacs, on the other hand, supports arbitrary exception rules for van der Waals (and supports scaling of the electrostatic interactions between the pair exception). So the Amber prmtop file can only identically represent exception pair parameters in which the combined epsilons are divided by some constant -- the sigma (or Rmin) must be the same for the "normal" pair and the exception pair. If this is not the case (as it's not here), then it's impossible for the Amber prmtop file to define the same potential energy function as the Gromacs topology file. So an exception is thrown (as you've seen).

Now Amber does actually support this level of freedom in nonbonded exception parameters, but only through the addition of CHARMM force field support (where 1-4 van der Waals parameters can be explicitly different). But this is done through the ChamberParm class rather than the AmberParm class (since it's really a different kind of Amber topology file). However, it is quite complicated to correctly implement this feature and keep the atom type tables sufficiently small enough to fit in the shared memory of GPUs (which will destroy pmemd.cuda performance). So I haven't tried adding support for truly arbitrary exclusions (through the chamber-style topology file) yet. If you send me your topology and GRO files, I will see what I can do about adding support for this conversion.

gchevrot commented 9 years ago

Thank you very much for this really clear answer. You can find all the necessary file here

swails commented 9 years ago

@gchevrot -- this actually turned out to be a little easier than I expected. PR #218 implements this functionality, and I was able to use it to generate a topology file that sander accepted and computed an energy for.

It appears to give the same energy as the Gromacs topology file (admittedly, this was tested with OpenMM, but both the Amber and Gromacs file objects have been rather extensively tested between OpenMM and Amber and OpenMM and Gromacs so the Amber<->Gromacs comparison should be satisfied).

The trick is that you need to use ChamberParm instead of AmberParm as the base Amber object, since the former implements the "tricks" necessary to get Amber to recognize the CHARMM-specific parameter types (like urey-bradleys, impropers, CMAPs, and unique 1-4 vdW parameters).

Feel free to test that PR (I'm not going to merge it until it passes the continuous integration tests), and let me know of any problems you encounter.

hainm commented 9 years ago

@swails I've tried to load https://github.com/gchevrot/Kinesin/blob/master/nvt.gro file from @gchevrot dir with parmed. There is no atomic_number info. Do you intend to introduce option to guess it?

In [12]: In [11]: p = pmd.load_file("./nvt.gro")

In [13]: p.atoms[0].atomic_number
Out[13]: 0

In [14]: p.atoms[1].atomic_number
Out[14]: 0
In [15]: p
Out[15]: <Structure 88812 atoms; 28161 residues; 0 bonds; PBC; NOT parametrized>
hainm commented 9 years ago
In [7]: p
Out[7]: <Structure 88812 atoms; 28161 residues; 0 bonds; PBC; NOT parametrized>

In [8]: p.write_psf("test.psf")
---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
<ipython-input-8-e24e1ff1b1cc> in <module>()
----> 1 p.write_psf("test.psf")

/home/haichit/amber_git/amber/lib/python3.4/site-packages/parmed/formats/psf.py in write(struct, dest, vmd)
    210         dest.write('\n')
    211         # Group section
--> 212         dest.write((intfmt*2) % (len(struct.groups), struct.groups.nst2))
    213         dest.write(' !NGRP NST2\n')
    214         for i, gp in enumerate(struct.groups):

AttributeError: 'TrackedList' object has no attribute 'nst2'

should have another error message? Like don't support write to psf?

hainm commented 9 years ago

and do you intend to add 'p.write_parmwhen reading fromgro` file? currently I only see this

In [13]: p.w
p.write_cif  p.write_pdb  p.write_psf
swails commented 9 years ago

and do you intend to add p.write_parm when reading from gro file?

No. write_parm will only ever be attached to AmberParm and subclasses. I do, however, plan on adding a save attribute to Structure that will pick the file format based on filename extension (or take a format argument) and have that function cast to Amber prmtop if that's the format users want to write to.

PSF, PDB, CIF, and GRO writing all require only that information present in any Structure instance. Amber topology files require quite a bit more information (which is why the from_structure methods on those classes are so complex).

Also, GRO files don't contain nearly enough information to make an Amber prmtop. And there is little use in generating a prmtop strictly for analysis (i.e., devoid of any parameters or valence terms), since any Amber analysis programs that read prmtop files as topologies read almost any other format equally well for such purposes.

swails commented 9 years ago

should have another error message? Like don't support write to psf?

This is a bug that should be easily fixed. Thanks for the report!

swails commented 9 years ago

There is no atomic_number info. Do you intend to introduce option to guess it?

I don't think so for the GRO file. Most other file formats do set the atomic number correctly, or at the very least give a reasonable guess. But those file formats usually contain very good hints about what the elements are. The PDB and CIF files give the element directly (according to the spec -- the atom name is used to guess it if those columns are not used). Other formats typically give either the atomic number directly or the atomic mass, which can identify the element with very good precision.

By contrast, the GRO file would only provide element hints within the atom name, which I personally don't trust. Also, GRO files are not that frequently used on their own, in my experience. They're loaded alongside Gromacs topology files, for which element information is readily available. So as long as you load Gromacs topology files in concert with GRO files, you will have the element information you need.

hainm commented 9 years ago

thanks. I asked about atomic number to make a guess about mass (like COM). "gro" file above seems like a pdb file, no mass, no charge.

hainm commented 9 years ago

I mean like a simple pdb file.

swails commented 9 years ago

That's reasonable. I'll try to add atomic number guessing to GRO files.

It's still better to use the Gromacs topology file for that kind of thing. But sometimes I admit just using the GRO file is convenient.

swails commented 9 years ago

@gchevrot -- this should be fixed in the master branch now. Instead of using AmberParm.load_from_structure, use ChamberParm.from_structure. You can also load the Gromacs topology and GRO files in the same object to make sure that the unit cell information gets transferred properly.

So your Python script will look something like this:

import parmed as pmd

gro = pmd.load_file('topol.top', xyz='nvt.gro')
amb = pmd.amber.ChamberParm.from_structure(gro)
amb.write_parm('amber_system.parm7')
amb.write_rst7('amber_system.rst7')

This produced Amber-compatible files for me. Let me know if you have any problems (it did take a minute or two given the size of the system). Let me know if you have problems.

gchevrot commented 9 years ago

@swails -- Thank you very much, it works. I have just noticed that VMD complains about the 2nd line of the parm7 file because it does not know "CTITLE". Then the solution is to replace "CTITLE" with "TITLE".

swails commented 9 years ago

Then the solution is to replace "CTITLE" with "TITLE".

This won't work. Chamber topology files are known not to work with VMD, since VMD doesn't actually respect the topology file specification. In particular, Chamber topology files specify charges to a different precision than the standard Amber topology files (in order to ensure agreement with CHARMM results to machine precision). So even if you change CTITLE to TITLE, VMD will still fail to read it. And changing CTITLE to TITLE will actually break the topology file with sander/pmemd (but it won't actually fail! It will still give you reasonable-looking results, I think).

CTITLE was introduced specifically to differentiate between Amber-style and CHARMM-style topology files. So if you make sander and pmemd think that the topology file is an Amber-style topology file, it will ignore the impropers, cmaps, and special 1-4 van der Waals parameters, which would completely invalidate your simulations.

There are two solutions to this. First is to just use the Gromacs topology file (or GRO file) as the topology file with VMD (this is my personal preference). An alternative is to use cpptraj to remove the chamber information from the topology file. Just note that if you use that new topology file for simulations, the above caveats apply.

To use cpptraj, the following script should work:

parm amber_system.parm7
parmwrite out amber_system_vmd.parm7 nochamber