forlilab / waterkit

Tool to predict water molecules placement and energy in ligand binding sites
GNU General Public License v3.0
24 stars 7 forks source link

Working directly with parm7 and coordinate files #18

Open amin-sagar opened 1 month ago

amin-sagar commented 1 month ago

Hello.

Thanks for this awesome work.

I have a protein-ligand system parameterized with openff. I can save it as amber format parm7 or prmtop files. But, it's not very straightforward to save frcmod and lib files https://github.com/openforcefield/openff-toolkit/issues/304

Do you think there could be a way to work directly with parm7/prmtop files?

I would be really grateful for any suggestions.

Best, Amin.

rwxayheee commented 1 month ago

Hi @amin-sagar Using parmed it might be possible to write Amber OFF library and frcmod files (with some restrictions). In theory it should be possible to extract (at least recognize) modified parameters by comparing a prmtop to these default force field, although this may not be the most straightforward way to prepre receptor with waterkit.. It seems possible to skip the leap part in receptor preparation and jump to here if you already made the prmtop file.

amin-sagar commented 3 weeks ago

Thanks @rwxayheee And sorry for the delay. I can also use MDAnalysis to export the parm and coordinate files to pdbqt file. So, I can skip to the step

run_waterkit.py -i protein_prepared_amber.pdbqt -c X Y Z -s SX SY SZ -n 10000 -j 16 -o traj

However, at this point autogrid doesn't identify the atomtypes. Is there a way to guess/assign atomtypes which are suitable for autogrid?

Best, Amin.

diogomart commented 3 weeks ago

@amin-sagar based on your issue on the espaloma repository I see you want to use parameters saved in the prmtop file. I think this would be a bit of work to do. A few notes:

amin-sagar commented 3 weeks ago

Thanks @diogomart If I am understanding the way waterkit works, once I have the pdbqt file with the espaloma charges, the only thing I need is the correct atom types. For example, my current pdbqt file looks like this

TITLE     FRAME 0 FROM system.gro
CRYST1   75.480   75.480   75.480  60.00  60.00  90.00 P 1           1
ATOM      1  C1x UNK M 411      12.600 -16.930 -15.800  1.00  0.00    -0.169 C3
ATOM      2  H1x UNK M 411      12.470 -17.620 -15.090  1.00  0.00     0.068 H9
ATOM      3  H2x UNK M 411      13.560 -16.960 -16.080  1.00  0.00     0.068 H9
ATOM      4  H3x UNK M 411      12.050 -17.190 -16.590  1.00  0.00     0.077 H9
ATOM      5  C2x UNK M 411      12.220 -15.540 -15.290  1.00  0.00     0.652 C4
ATOM      6  C3x UNK M 411      11.980 -13.060 -15.760  1.00  0.00     0.044 C3
ATOM      7  H4x UNK M 411      12.190 -12.970 -14.790  1.00  0.00     0.099 H1

If I could make espaloma or another tool to give atom types that the enhanced version of autogrid likes, then things should work. Am I understanding this correctly?

Best, Amin.

jeeberhardt commented 3 weeks ago

Hi @amin-sagar,

Yes, that's right! Here is the list of all the atom types supported in waterkit (ff14SB + GAFF2): https://github.com/forlilab/waterkit/blob/master/data/ff14SB_parameters.dat

Not sure how important is this for you to use the partial charges from espaloma, but I added a script to generate the GAFF2 parameters (frcmod and lib files) from a mol/sdf file:

python scripts/wk_generate_gaff2_parameters.py -i mol.sdf -n LIG

The input molecule must have explicit hydrogen atoms, and the desired protonation state. A 3-letters code must also be provided, and corresponding to the residue name (resname) of the molecule in the input PDB file (receptor + small molecule).

amin-sagar commented 3 weeks ago

Thanks @jeeberhardt The script that you have added would be great for running waterkit with small molecule ligands. The problem I have and the reason I wanted to use something like espaloma or grappa to assign atom types was that I am working with modified peptides. These can be up to 20 residues long and running antechamber on them can take a very long time. I will try to use your script anyway and see how long it takes.

Best, Amin.

jeeberhardt commented 3 weeks ago

Then one (quick and dirty) option would be to first get the GAFF2 atom types using the gasteiger charge method (which is almost instantaneous) instead of bcc, and then replace them by the partial charges obtained from espaloma.

python scripts/wk_generate_gaff2_parameters.py -i mol.sdf -n LIG -c gas

I added the -c option so you can define the charge method. See here for more details: https://ambermd.org/antechamber/ac.html#antechamber