forlilab / waterkit

Tool to predict water molecules placement and energy in ligand binding sites
GNU General Public License v3.0
24 stars 7 forks source link

Issue creating pdbqt file with ligand bound system #5

Closed MKCarter closed 10 months ago

MKCarter commented 1 year ago

Hi,

I am trying to generate the waterkit receptor using the updated wk_prepare_receptor.py script.

I have generated my Amber lib and frcmod files for my ligand using the following procedure:

antechamber -i ligand.pdb -fi pdb -o ligand.prepi -fo prepi -c bcc -nc 1 -at gaff2
parmchk2 -i ligand.prepi -f prepi -o ligand.frcmod -s gaff2 

Then to generate the lib file I have run:

tleap
loadamberprep ligand.prepi
saveoff UNK ligand.lib 

However, when running the following command the script fails to fun:

wk_prepare_receptor.py -i protein_ligand.pdb --lib ligand.lib --frcmod ligand.frcmod -o protein_prepared_ligand_test --pdb --amber_pdbqt

The error I obtain is as follows:

Warning: importing 'simtk.openmm' is deprecated.  Import 'openmm' instead.
INFO:WaterKit receptor preparation:Amber lib parameter files for residue(s): UNK
INFO:WaterKit receptor preparation:Removed all hydrogen atoms
INFO:WaterKit receptor preparation:Removed all water molecules
WARNING:WaterKit receptor preparation:Histidine protonation will be automatically assigned to HIE: HIS - 11, HIS - 31, HIS - 73, HIS - 173, HIS - 183, HIS - 187, HIS - 192
ERROR:WaterKit receptor preparation:Could not generate topology/coordinates files with tleap
Traceback (most recent call last):
  File "/Users/michaelcarter/opt/miniconda3/envs/waterkit_corr/lib/python3.9/site-packages/pdb4amber/utils.py", line 10, in easy_call
    output = subprocess.check_output(
  File "/Users/michaelcarter/opt/miniconda3/envs/waterkit_corr/lib/python3.9/subprocess.py", line 424, in check_output
    return run(*popenargs, stdout=PIPE, timeout=timeout, check=True,
  File "/Users/michaelcarter/opt/miniconda3/envs/waterkit_corr/lib/python3.9/subprocess.py", line 528, in run
    raise CalledProcessError(retcode, process.args,
subprocess.CalledProcessError: Command 'tleap -s -f leap.template.in > leap.template.out' returned non-zero exit status 31.

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/Users/michaelcarter/DD_tools/waterkit/scripts/wk_prepare_receptor.py", line 761, in prepare
    easy_call('tleap -s -f %s > %s' % (tleap_input, tleap_output), shell=True)
  File "/Users/michaelcarter/opt/miniconda3/envs/waterkit_corr/lib/python3.9/site-packages/pdb4amber/utils.py", line 14, in easy_call
    raise RuntimeError(e.output.decode())
RuntimeError

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/Users/michaelcarter/opt/miniconda3/envs/waterkit_corr/bin/wk_prepare_receptor.py", line 7, in <module>
    exec(compile(f.read(), __file__, 'exec'))
  File "/Users/michaelcarter/DD_tools/waterkit/scripts/wk_prepare_receptor.py", line 862, in <module>
    main()
  File "/Users/michaelcarter/DD_tools/waterkit/scripts/wk_prepare_receptor.py", line 846, in main
    pr.prepare(pdb_filename, lib_files, frcmod_files, clean)
  File "/Users/michaelcarter/DD_tools/waterkit/scripts/wk_prepare_receptor.py", line 765, in prepare
    raise RuntimeError(error_msg)
RuntimeError: Could not generate topology/coordinates files with tleap

I think this error stems from the lib file as I can run the command fine without it. Do you have any suggestions on how to get this working?

jeeberhardt commented 1 year ago

Hi,

Thanks for reporting this issue!

Would it possible to share the protein_ligand.pdb file with me? Also, one way to see what is going on would be to re-run the preparation with the --no-clean argument. That way you will have access to the temporary and log files.

Best,

MKCarter commented 1 year ago

Thanks for the suggestion. I have solved it now! There was a small error in the PDB file.

Many Thanks,

MKCarter commented 1 year ago

Hi,

I am trying to prepare a system with two ligands, I am running the following command:

wk_prepare_receptor.py -i protein.pdb --lib X_ligand.lib Y_ligand.lib --frcmod X_ligand.frcmod Y_ligand.frcmod -o protein_prepared --pdb --amber_pdbqt

But seeing this error:

Warning: importing 'simtk.openmm' is deprecated.  Import 'openmm' instead.
Traceback (most recent call last):
  File "/Users/michaelcarter/opt/miniconda3/envs/waterkit_corr/bin/wk_prepare_receptor.py", line 7, in <module>
    exec(compile(f.read(), __file__, 'exec'))
  File "/Users/michaelcarter/DD_tools/waterkit/scripts/wk_prepare_receptor.py", line 865, in <module>
    main()
  File "/Users/michaelcarter/DD_tools/waterkit/scripts/wk_prepare_receptor.py", line 849, in main
    pr.prepare(pdb_filename, lib_files, frcmod_files, clean)
  File "/Users/michaelcarter/DD_tools/waterkit/scripts/wk_prepare_receptor.py", line 630, in prepare
    logger.info('Amber lib parameter files for residue(s): %s' % nonstandard_resnames)
TypeError: not all arguments converted during string formatting

I have tried to comma separate the multiple lib and frcmod files but with no luck. Is there a way to do this that I am missing?

jeeberhardt commented 1 year ago

Hi,

That looks like a silly mistake I did in the code. Let me check, will come back to you asap.

Best,

MKCarter commented 1 year ago

Hi @jeeberhardt - thanks for looking into this 👍

jeeberhardt commented 1 year ago

Hi @MKCarter,

The issue should be fixed now. It was just a string formatting issue. Thanks for the feedback!

Best,

MKCarter commented 1 year ago

Hi @jeeberhardt - I have updated the code and it now does recongize the two bound ligands, but it appears to fail for another reason now. The code appears to not handle the terminal residue when running with two lib and frcmod files:

Warning: importing 'simtk.openmm' is deprecated.  Import 'openmm' instead.
INFO:WaterKit receptor preparation:Amber lib parameter files for residue(s): EGU, FAD
INFO:WaterKit receptor preparation:Removed all hydrogen atoms
INFO:WaterKit receptor preparation:Removed all water molecules
ERROR:WaterKit receptor preparation: Gap(s) were found between some residues. You can fix that issue by adding the missing residues or manually add TER records to indicate that the residues/chains are not physically connected to each other. If you know what you are doing, you can use the --ignore_gaps option to ignore them (automatically add TER records).
Gap(s) found between the following residues: 
 - gap of 9999.999 A between ILE 440 and HIS 441

Traceback (most recent call last):
  File "/Users/michaelcarter/opt/miniconda3/envs/waterkit_corr/bin/wk_prepare_receptor.py", line 7, in <module>
    exec(compile(f.read(), __file__, 'exec'))
  File "/Users/michaelcarter/DD_tools/waterkit/scripts/wk_prepare_receptor.py", line 873, in <module>
    main()
  File "/Users/michaelcarter/DD_tools/waterkit/scripts/wk_prepare_receptor.py", line 857, in main
    pr.prepare(pdb_filename, lib_files, frcmod_files, clean)
  File "/Users/michaelcarter/DD_tools/waterkit/scripts/wk_prepare_receptor.py", line 697, in prepare
    raise RuntimeError(error_msg)
RuntimeError:  Gap(s) were found between some residues. You can fix that issue by adding the missing residues or manually add TER records to indicate that the residues/chains are not physically connected to each other. If you know what you are doing, you can use the --ignore_gaps option to ignore them (automatically add TER records).
Gap(s) found between the following residues: 
 - gap of 9999.999 A between ILE 440 and HIS 441

I have re-run this with the same protein, but only considering 1 bound ligand and the receptor is prepared with no errors. Any ideas why this happens? Thanks

MKCarter commented 1 year ago

This is the successful output from running with only one lib and frcmod

Warning: importing 'simtk.openmm' is deprecated.  Import 'openmm' instead.
INFO:WaterKit receptor preparation:Amber lib parameter files for residue(s): EGU
INFO:WaterKit receptor preparation:Removed all hydrogen atoms
INFO:WaterKit receptor preparation:Removed all water molecules
INFO:WaterKit receptor preparation:Removed all non-standard Amber residues: FAD
WARNING:WaterKit receptor preparation:Histidine protonation will be automatically assigned to HIE: HIE - 60, HIE - 88, HIE - 131, HIE - 134, HIE - 176, HIE - 192, HIE - 214, HIE - 294, HIE - 298, HIE - 307, HIE - 354, HIE - 370, HIE - 387, HIE - 409, HIE - 416, HIE - 443
jeeberhardt commented 1 year ago

HI @MKCarter,

This usually happens when two residues are not physically connected to each other (last residue of chain A and first residue of chain B for example). In that case you will need to add the TER keyword in between to explicitly mark the separation of the two chains.

One could also use the --ignore_gaps argument to ignore that gap, but I would STRONGLY advice you to check that there is truly a gap between ILE 440 and HIS441.

Did you renumber the residues? Or could you share the PDB where you have these two residues? It is quite hard to know what is truly happening without looking at the structure.

MKCarter commented 1 year ago

Hi @jeeberhardt

I think this problem is something related to the changes made. The same pdb prepares fine with one of the ligands defined, but fails when I try to define both ligands. I think the protein structure is fine. I have attached a google drive link to the PDB/lib/frcmod files to help with the debugging

https://drive.google.com/drive/folders/1RkbEOztxH2zd_Yus02pHFp7RKIewWiHe?usp=share_link

Many thanks 👍

jeeberhardt commented 1 year ago

Hi @MKCarter,

I don't have access to a computer right now to try out, but if you add the TER keyword between each water molecules (500, 501, 502 and 503), between HOH 503 and FAD and between FAD and EGU, it should work.

MKCarter commented 1 year ago

@jeeberhardt - Thanks, that did the trick. It is working as expected now!

jeeberhardt commented 1 year ago

Perfect then!

Although the error message was not correct. The issue was not related to the two residues reported...