requesting meaningful OPLSAA atom types in output files

jewettaij commented 5 years ago

[ ] I believe this to be a bug with LigParGen
[x] This is a feature request

Issue Information

Software name & Version : http://zarbi.chem.yale.edu/ligpargen/ Method: Not relevant (1.14*CM1A for charges, if it matters)

Expected Behavior

We would like LigParGen to report the atom type strings that it uses for force-field lookup in the files that it creates. I am specifically asking for the atom type names that are used for looking up force-field parameters in the official OPLSAA files (distributed here and here). These atoms have names like "C135", "C137", "C283", "H140" (not "CA" or "C01"; those are PDB atom names, which are not what we need).

I explain why this would be very useful below.

It would be great if all of the files generated by LigParGen included these atom type names, but we would be happy to settle for CHARMM RTF and PRM files for now. (OpenMM/FF XML and LAMMPS would be great too.)

I'd also request that the output files include a time and date. I ask because it is important to know which version of the OPLSAA force field was used, and this can (hopefully) be inferred from the date. (I suspect that these atom type strings might vary depending on the version of OPLSAA you are using. If so, perhaps you could include the date somewhere in the headers of the files you create? Or in the file names?)

Actual Behavior

Currently each atom is assigned to a unique (automatically generated) type name even if there are multiple copies of the same atom in the molecule.

Also, if you run LigParGen on two different molecules, there is a high chance that both outputs will contain an atom of type "C801" (for example), even though (I suspect) that atom type refers to something different in each molecule. This makes it hard to combine two different molecules in the same simulation.
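To make the collision problem concrete, here is a minimal Python sketch of the workaround users currently need: detect type names shared by two outputs and give each molecule its own prefix. The type names and both function names are hypothetical, chosen only for illustration; they are not taken from real LigParGen output.

```python
def find_collisions(types_a, types_b):
    """Return the sorted type names that appear in both molecules."""
    return sorted(set(types_a) & set(types_b))


def prefix_types(types, prefix):
    """Build a rename map that makes every type name unique to one molecule."""
    return {name: f"{prefix}_{name}" for name in types}


# Hypothetical type names, as two separate runs might emit them.
mol_a = ["C800", "C801", "H802"]
mol_b = ["C800", "O801", "C801"]

print(find_collisions(mol_a, mol_b))  # names used by both molecules
print(prefix_types(mol_a, "molA"))    # collision-free names for molecule A
```

Prefixing avoids clashes, but it also hides the fact that two atoms may share the same underlying OPLSAA type, which is why reporting the real type names would be the better fix.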

Benefits of revealing atom types

Allow users to combine multiple different molecules together in the same simulation without having to worry about atom type name collisions.
Reduce the number of questions you receive from users wondering why LigParGen generated a molecule with unexpected force field parameters. (I see many issues like this on the issue-tracker.) LigParGen is currently a black box. Let users look inside the box (a little bit) and answer their own questions.
Make it easier for third-party developers (such as myself) to write tools which can convert LigParGen output files into new formats.

I'm currently trying to release a tool which will convert LigParGen output into moltemplate format. This will make it significantly easier for LAMMPS users to benefit from LigParGen. (Moltemplate is a molecule-builder for LAMMPS. I wrote moltemplate.) Hiding the atom type makes this task much harder.

Is there a reason atom types are hidden? Are there some atom types generated by LigParGen that do not correspond to anything in the OPLSAA force field files? (If that is the case, would you be willing to consider reporting the atom type names for the subset of atoms which are defined in the official OPLSAA files?)

Thanks in advance for your time and consideration.

SaraMosalla commented 5 years ago

I completely agree with what you mentioned. I hope they provide the information in more detail.

gandhiforbes commented 5 years ago

I agree with the above-mentioned logic. It will lead to the development of many more open-source codes.

SaraMosalla commented 5 years ago

Hi Andrew,

May I ask why the nonbonded parameters on the http://zarbi.chem.yale.edu/oplsaam.html site have 6 columns of numbers? They are at the very bottom of the page, in the OPLS-AA/M Parameter file tab.

Which of those are the corrected sigma and epsilon?

Thanks in advance, Sara

SaraMosalla commented 5 years ago

Hi,

Continuing my question above, why do the epsilons have a minus sign? Also, what do the 180 and 0 in the last column of the dihedrals and impropers mean? Would you please explain?

I really appreciate your help.

jewettaij commented 4 years ago

It looks like the person who created this repository has left the lab and abandoned the project. (I can't really blame them. This happens in academia.) I'm disappointed. The server is a black box. The source code is not open. So there is no way to know if the output files created by the server are reasonable. I'd be willing to help with this project, but there is no way to do that either. I'm hoping all this changes in the future.

Alternatives:

For what it's worth, here are two promising looking alternatives (for users who are willing to abandon using the OPLSAA force field):

1) OpenFF and the SMIRNOFF force field format. At first glance, this seems like a potentially good alternative way to generate force field parameters for a broad range of ligands.
2) The ATB database also seems like an excellent resource. (Crude summary: The ATB has fine-tuned charges and custom force field parameters for each molecule in the database, generated using DFT.)

Please pardon the ranting. Perhaps the list is helpful. (Let me know if I omitted anything.)

jewettaij commented 4 years ago

> Hi,
>
> continuing my question above, why epsilons have a minus sign? Also, what is 180 and 0 in the last column of dihedrals and impropers? Would you please explain for me.
>
> I really appreciate your help.

I have not yet taken the time to figure out the format of the .INP files used by CHARMM, and I agree that the 6-column files are confusing. I found the GROMACS-format version of these files easier to read. Try downloading the file, opening it, and viewing the "ffnonbonded.itp" file. The epsilon parameters are clearly labelled and they are positive. By comparing the numbers in the CHARMM files with the GROMACS files, perhaps one can figure out the file formats for both GROMACS and CHARMM. (Keep in mind that the epsilon values in the GROMACS files are probably larger by a factor of 4.184, because GROMACS measures energy in kJ/mol instead of kcal/mol.)
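When comparing the two sets of files, a small conversion helper may save some arithmetic. The sketch below assumes the CHARMM-side values are in kcal/mol and Angstroms and converts them to the kJ/mol and nm that GROMACS uses. The function name and the example numbers are illustrative only, not values read from either file.

```python
KCAL_TO_KJ = 4.184     # 1 kcal = 4.184 kJ (thermochemical calorie)
ANGSTROM_TO_NM = 0.1   # 1 Angstrom = 0.1 nm


def to_gromacs_units(epsilon_kcal, sigma_angstrom):
    """Convert (epsilon in kcal/mol, sigma in Angstrom) to (kJ/mol, nm)."""
    return epsilon_kcal * KCAL_TO_KJ, sigma_angstrom * ANGSTROM_TO_NM


# Illustrative values only.
eps_kj, sigma_nm = to_gromacs_units(0.066, 3.5)
print(f"epsilon = {eps_kj:.6f} kJ/mol, sigma = {sigma_nm:.3f} nm")
```

If a converted value matches an entry in "ffnonbonded.itp", that is good evidence for which column holds which quantity.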

Hope this helps.

vasi786 commented 4 years ago

That is why I prefer CGenFF for CHARMM. It reports a penalty if anything is off, and it has a unique atom type for each atom.

jewettaij commented 4 years ago

For what it's worth, I think I finally figured out what the extra columns are used for in the par_opls_aam.inp file. The last 2 columns are probably the Lennard-Jones parameters for "1-4" interactions.

I conclude this because, in the OPLSAA force field, the non-bonded interactions between the first ("1") and last ("4") atoms participating in a dihedral interaction are reduced in strength by a factor of 2, and the epsilon parameters in the 6th column are half the size of the epsilon parameters in the 3rd column. The only exceptions are the last 4 atoms in the list (OT, HT, SOD, CLA), which I am guessing are solvent atoms and cannot participate in dihedrals. (This is a guess. I do not know.)

OT    0.00  -0.152100   1.768200  0.00  -0.152100   1.768200
HT    0.00  -0.046000   0.224500  0.00  -0.046000   0.22450
SOD   0.00  -0.000500   2.9969737 0.00  -0.000500   2.9969737
CLA   0.00  -0.710000   2.2561487 0.00  -0.710000   2.2561487

As for units, I suspect the epsilon parameters are in kcal/mol, and the sigma parameters are in Angstroms. However, the sigma parameters seem to be reduced in size by a factor of 2. I could be wrong about everything in this post. Please check this for yourself before proceeding.
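One way to test these guesses against the four rows above is sketched below. It assumes the file follows the usual CHARMM parameter-file convention for nonbonded entries, in which the third column holds epsilon with a negative sign and the fourth holds Rmin/2 (half the distance at the Lennard-Jones minimum) rather than sigma, with the last two columns giving the same quantities for 1-4 pairs. Whether par_opls_aam.inp really follows that convention is an assumption here, and the function names are made up for illustration.

```python
def parse_nonbonded(line):
    """Split one nonbonded row into named fields (assumed CHARMM layout)."""
    name, _, eps, rmin_half, _, eps14, rmin_half14 = line.split()
    return {
        "type": name,
        "epsilon": abs(float(eps)),        # stored as a negative number
        "rmin_half": float(rmin_half),
        "epsilon_14": abs(float(eps14)),   # value used for 1-4 pairs
        "rmin_half_14": float(rmin_half14),
    }


def rmin_half_to_sigma(rmin_half):
    """Convert Rmin/2 to the Lennard-Jones sigma: sigma = Rmin / 2**(1/6)."""
    return 2.0 * rmin_half / 2.0 ** (1.0 / 6.0)


row = "OT    0.00  -0.152100   1.768200  0.00  -0.152100   1.768200"
rec = parse_nonbonded(row)
ratio = rec["epsilon_14"] / rec["epsilon"]   # 0.5 would indicate 1-4 scaling
sigma = rmin_half_to_sigma(rec["rmin_half"])
print(f"{rec['type']}: eps14/eps = {ratio:.2f}, sigma = {sigma:.4f} Angstrom")
```

Under that assumption, the OT row converts to a sigma of roughly 3.15 Angstroms, which would explain why the fourth column looks like sigma "reduced by about a factor of 2": it would be sigma multiplied by 2^(1/6)/2, or about 0.56.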

Thanks, vasi786 for the endorsement of CGenFF. (I will take a look at that again eventually.)

jagreathouse commented 3 years ago

In addition to adding atom types as comments, please also add:

comments to indicate the potential form (e.g., in LAMMPS, 'Bond Coeffs # harmonic'), and atom types for each set of coefficients (e.g., 1 268.0000 1.5290 # CT-CT)
grouped types for atoms, bonds, etc., rather than listing each atom or bond as a unique type. For methane in LAMMPS, instead of '5 atom types' and '4 bond types' it should be '2 atom types' and '1 bond type' (all four bonds are C-H)
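
The grouping requested in the second item could be done along these lines. This is only a sketch of the idea, not how LigParGen works internally: records with identical parameters are mapped to one shared type index. The function name is hypothetical and the methane parameters are illustrative placeholders.

```python
def group_types(records):
    """Assign a shared 1-based type index to records with identical parameters."""
    type_of = {}      # parameter tuple -> type index
    assignment = []   # type index for each input record, in order
    for rec in records:
        idx = type_of.setdefault(rec, len(type_of) + 1)
        assignment.append(idx)
    return assignment, type_of


# Methane: one carbon and four equivalent hydrogens (illustrative parameters).
atoms = [("CT", 0.066, 3.5)] + [("HC", 0.030, 2.5)] * 4
bonds = [("CT", "HC", 340.0, 1.09)] * 4

atom_assignment, atom_types = group_types(atoms)
bond_assignment, bond_types = group_types(bonds)
print(len(atom_types), "atom types;", len(bond_types), "bond types")
print(atom_assignment)
```

In this sketch the four C-H bonds share one parameter set, so they collapse into a single bond type, and the five atoms collapse into two atom types.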

Thanks

bkpgh commented 3 years ago

Just another vote for the request to include the OPLS atom type designations in at least some output file from LigParGen. The OPLS_800 type atom descriptors are confusing, especially to newbies, since they have exactly the same form as the standard OPLS atom types.

leelasd / ligpargen