forlilab / waterkit

Tool to predict water molecules placement and energy in ligand binding sites
GNU General Public License v3.0

failed steps in receptor preparation; documentation on receptor prep? #13

Closed blakemertz closed 5 months ago

blakemertz commented 5 months ago

tleap has failed to generate the topology/coords for my input pdb:

python /media/bak11/binaries/git/waterkit/scripts/wk_prepare_receptor.py -i MDH-258_molrep7_refmac3-coot-0.pdb -o MDH-258-xtal_prepared --pdb --amber_pdbqt
WARNING:WaterKit receptor preparation:Found residue(s) with missing heavy atoms: [<Residue LYS[32]; chain=A; segid=A--->, 4], [<Residue ASN[34]; chain=A; segid=A--->, 3], [<Residue LYS[38]; chain=A; segid=A--->, 3], [<Residue LYS[46]; chain=A; segid=A--->, 2], [<Residue LYS[78]; chain=A; segid=A--->, 4], [<Residue LYS[87]; chain=A; segid=A--->, 3], [<Residue LYS[120]; chain=A; segid=A--->, 2], [<Residue LYS[159]; chain=A; segid=A--->, 3], [<Residue LYS[162]; chain=A; segid=A--->, 4], [<Residue ARG[181]; chain=A; segid=A--->, 6], [<Residue GLN[204]; chain=A; segid=A--->, 4], [<Residue LYS[207]; chain=A; segid=A--->, 1], [<Residue ASP[211]; chain=A; segid=A--->, 3], [<Residue ASP[212]; chain=A; segid=A--->, 3], [<Residue ASP[213]; chain=A; segid=A--->, 3], [<Residue ASN[214]; chain=A; segid=A--->, 3], [<Residue LYS[227]; chain=A; segid=A--->, 4], [<Residue LYS[239]; chain=A; segid=A--->, 2], [<Residue SER[244]; chain=A; segid=A--->, 1], [<Residue LYS[247]; chain=A; segid=A--->, 4], [<Residue LYS[259]; chain=A; segid=A--->, 3], [<Residue LYS[272]; chain=A; segid=A--->, 2], [<Residue GLN[289]; chain=A; segid=A--->, 3], [<Residue GLN[298]; chain=A; segid=A--->, 3], [<Residue GLU[322]; chain=A; segid=A--->, 3], [<Residue LYS[331]; chain=A; segid=A--->, 4]
INFO:WaterKit receptor preparation:Removed all hydrogen atoms
INFO:WaterKit receptor preparation:Removed all water molecules
INFO:WaterKit receptor preparation:Removed all non-standard Amber residues: LIG
INFO:WaterKit receptor preparation:Histidine protonation states were automatically set to: HIE - 44, HIE - 86, HIE - 128, HIE - 170, HIE - 178, HIE - 255, HIE - 300, HIE - 310
INFO:WaterKit receptor preparation:Lysine protonation states were automatically set to: LYN - 32, LYN - 38, LYN - 46, LYN - 52, LYN - 67, LYN - 70, LYN - 78, LYN - 81, LYN - 87, LYN - 109, LYN - 112, LYN - 120, LYN - 123, LYN - 126, LYN - 159, LYN - 162, LYN - 165, LYN - 207, LYN - 221, LYN - 227, LYN - 239, LYN - 247, LYN - 250, LYN - 256, LYN - 259, LYN - 272, LYN - 291, LYN - 296, LYN - 325, LYN - 328, LYN - 331
INFO:WaterKit receptor preparation:Cysteine protonation states were automatically set to: CYM - 121, CYM - 134, CYM - 135, CYM - 163, CYM - 195, CYM - 205, CYM - 248, CYM - 261, CYM - 309, CYM - 334
ERROR:WaterKit receptor preparation:Could not generate topology/coordinates files with tleap
Traceback (most recent call last):
  File "/media/bak11/binaries/miniconda3/envs/waterkit/lib/python3.11/site-packages/pdb4amber/utils.py", line 10, in easy_call
    output = subprocess.check_output(
             ^^^^^^^^^^^^^^^^^^^^^^^^
  File "/media/bak11/binaries/miniconda3/envs/waterkit/lib/python3.11/subprocess.py", line 466, in check_output
    return run(*popenargs, stdout=PIPE, timeout=timeout, check=True,
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/media/bak11/binaries/miniconda3/envs/waterkit/lib/python3.11/subprocess.py", line 571, in run
    raise CalledProcessError(retcode, process.args,
subprocess.CalledProcessError: Command 'tleap -s -f leap.template.in > leap.template.out' returned non-zero exit status 31.

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/media/bak11/binaries/git/waterkit/scripts/wk_prepare_receptor.py", line 892, in prepare
    easy_call('tleap -s -f %s > %s' % (tleap_input, tleap_output), shell=True)
  File "/media/bak11/binaries/miniconda3/envs/waterkit/lib/python3.11/site-packages/pdb4amber/utils.py", line 14, in easy_call
    raise RuntimeError(e.output.decode())
RuntimeError

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/media/bak11/binaries/git/waterkit/scripts/wk_prepare_receptor.py", line 997, in <module>
    main()
  File "/media/bak11/binaries/git/waterkit/scripts/wk_prepare_receptor.py", line 981, in main
    pr.prepare(pdb_filename, lib_files, frcmod_files, clean)
  File "/media/bak11/binaries/git/waterkit/scripts/wk_prepare_receptor.py", line 896, in prepare
    raise RuntimeError(error_msg)
RuntimeError: Could not generate topology/coordinates files with tleap

I was able to get waterkit to work on a test case on a crystal structure of rhodopsin, but only after I 1) ran the protein preparation wizard in Maestro and 2) exported the pdb in PyMOL. For some reason parmed isn't recognizing the format of the pdb produced by Maestro. Below is the error message I got from waterkit after putting it through Maestro (reproduced on two different protein structures):

 python /media/bak11/binaries/git/waterkit/scripts/wk_prepare_receptor.py -i 3pqr_prepared.pd b -o 3pqr_wk_prepared --pdb --amber_pdbqt
Traceback (most recent call last):
  File "/media/bak11/binaries/git/waterkit/scripts/wk_prepare_receptor.py", line 997, in <module>
    main()
  File "/media/bak11/binaries/git/waterkit/scripts/wk_prepare_receptor.py", line 981, in main
    pr.prepare(pdb_filename, lib_files, frcmod_files, clean)
  File "/media/bak11/binaries/git/waterkit/scripts/wk_prepare_receptor.py", line 728, in prepare
    receptor = pmd.load_file(pdb_filename)
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/media/bak11/binaries/miniconda3/envs/waterkit/lib/python3.11/site-packages/parmed/formats/registry.py", line 180, in load_file
    raise FormatNotFound('Could not identify file format')
parmed.exceptions.FormatNotFound: Could not identify file format

For the protein I am actually working on, I put it through both Maestro and PyMOL, but got a different error message:

python /media/bak11/binaries/git/waterkit/scripts/wk_prepare_receptor.py -i  mdh-258-pymol.pdb -o mdh-258-xtal_prepared --pdb --amber_pdbqt
WARNING:WaterKit receptor preparation:Found residue(s) with missing heavy atoms: [<Residue LYS[32]; chain=A>, 4], [<Residue PRO[33]; chain=A>, 7], [< Residue ASN[34]; chain=A>, 4], [<Residue TYR[35]; chain=A>, 12], [<Residue ALA[36]; chain=A>, 5], [<Residue LEU[37]; chain=A>, 8], [<Residue LYS[38];  chain=A>, 4], [<Residue PHE[39]; chain=A>, 11], [<Residue THR[40]; chain=A>, 7], [<Residue LEU[41]; chain=A>, 8], [<Residue ALA[42]; chain=A>, 5], [ <Residue GLY[43]; chain=A>, 4], [<Residue HIS[44]; chain=A>, 10], [<Residue THR[45]; chain=A>, 7], [<Residue LYS[46]; chain=A>, 4], [<Residue ALA[47] ; chain=A>, 5], [<Residue VAL[48]; chain=A>, 7], [<Residue SER[49]; chain=A>, 6], [<Residue SER[50]; chain=A>, 6], [<Residue VAL[51]; chain=A>, 7], [ <Residue LYS[52]; chain=A>, 9], [<Residue PHE[53]; chain=A>, 11], [<Residue SER[54]; chain=A>, 6], [<Residue PRO[55]; chain=A>, 7], [<Residue ASN[56] ; chain=A>, 8], [<Residue GLY[57]; chain=A>, 4], [<Residue GLU[58]; chain=A>, 9], [<Residue TRP[59]; chain=A>, 14], [<Residue LEU[60]; chain=A>, 8],  [<Residue ALA[61]; chain=A>, 5], [<Residue SER[62]; chain=A>, 6], [<Residue SER[63]; chain=A>, 6], [<Residue SER[64]; chain=A>, 6], [<Residue ALA[65] ; chain=A>, 5], [<Residue ASP[66]; chain=A>, 8], [<Residue LYS[67]; chain=A>, 9], [<Residue LEU[68]; chain=A>, 8], [<Residue ILE[69]; chain=A>, 8], [ <Residue LYS[70]; chain=A>, 9], [<Residue ILE[71]; chain=A>, 8], [<Residue TRP[72]; chain=A>, 14], [<Residue GLY[73]; chain=A>, 4], [<Residue ALA[74] ; chain=A>, 5], [<Residue TYR[75]; chain=A>, 12], [<Residue ASP[76]; chain=A>, 8], [<Residue GLY[77]; chain=A>, 4], [<Residue LYS[78]; chain=A>, 4],  [<Residue PHE[79]; chain=A>, 11], [<Residue GLU[80]; chain=A>, 9], [<Residue LYS[81]; chain=A>, 9], [<Residue THR[82]; chain=A>, 7], [<Residue ILE[83 ]; chain=A>, 8], [<Residue SER[84]; chain=A>, 6], [<Residue GLY[85]; chain=A>, 4], [<Residue HIS[86]; chain=A>, 10], [<Residue LYS[87]; chain=A>, 4],  [<Residue LEU[88]; chain=A>, 8], [<Residue 
GLY[89]; chain=A>, 4], [<Residue ILE[90]; chain=A>, 8], [<Residue SER[91]; chain=A>, 6], [<Residue ASP[92 ]; chain=A>, 8], [<Residue VAL[93]; chain=A>, 7], [<Residue ALA[94]; chain=A>, 5], [<Residue TRP[95]; chain=A>, 14], [<Residue SER[96]; chain=A>, 6],  [<Residue SER[97]; chain=A>, 6], [<Residue ASP[98]; chain=A>, 8], [<Residue SER[99]; chain=A>, 6], [<Residue ASN[100]; chain=A>, 8], [<Residue LEU[1 01]; chain=A>, 8], [<Residue LEU[102]; chain=A>, 8], [<Residue VAL[103]; chain=A>, 7], [<Residue SER[104]; chain=A>, 6], [<Residue ALA[105]; chain=A> , 5], [<Residue SER[106]; chain=A>, 6], [<Residue ASP[107]; chain=A>, 8], [<Residue ASP[108]; chain=A>, 8], [<Residue LYS[109]; chain=A>, 9], [<Resid ue THR[110]; chain=A>, 7], [<Residue LEU[111]; chain=A>, 8], [<Residue LYS[112]; chain=A>, 9], [<Residue ILE[113]; chain=A>, 8], [<Residue TRP[114];  chain=A>, 14], [<Residue ASP[115]; chain=A>, 8], [<Residue VAL[116]; chain=A>, 7], [<Residue SER[117]; chain=A>, 6], [<Residue SER[118]; chain=A>, 6] , [<Residue GLY[119]; chain=A>, 4], [<Residue LYS[120]; chain=A>, 4], [<Residue CYS[121]; chain=A>, 6], [<Residue LEU[122]; chain=A>, 8], [<Residue L YS[123]; chain=A>, 9], [<Residue THR[124]; chain=A>, 7], [<Residue LEU[125]; chain=A>, 8], [<Residue LYS[126]; chain=A>, 9], [<Residue GLY[127]; chai n=A>, 4], [<Residue HIS[128]; chain=A>, 10], [<Residue SER[129]; chain=A>, 6], [<Residue ASN[130]; chain=A>, 8], [<Residue TYR[131]; chain=A>, 12], [ <Residue VAL[132]; chain=A>, 7], [<Residue PHE[133]; chain=A>, 11], [<Residue CYS[134]; chain=A>, 6], [<Residue CYS[135]; chain=A>, 6], [<Residue ASN [136]; chain=A>, 8], [<Residue PHE[137]; chain=A>, 11], [<Residue ASN[138]; chain=A>, 8], [<Residue PRO[139]; chain=A>, 7], [<Residue GLN[140]; chain =A>, 9], [<Residue SER[141]; chain=A>, 6], [<Residue ASN[142]; chain=A>, 8], [<Residue LEU[143]; chain=A>, 8], [<Residue ILE[144]; chain=A>, 8], [<Re sidue VAL[145]; chain=A>, 7], [<Residue SER[146]; chain=A>, 6], [<Residue GLY[147]; chain=A>, 
4], [<Residue SER[148]; chain=A>, 6], [<Residue PHE[149 ]; chain=A>, 11], [<Residue ASP[150]; chain=A>, 8], [<Residue GLU[151]; chain=A>, 9], [<Residue SER[152]; chain=A>, 6], [<Residue VAL[153]; chain=A>,  7], [<Residue ARG[154]; chain=A>, 11], [<Residue ILE[155]; chain=A>, 8], [<Residue TRP[156]; chain=A>, 14], [<Residue ASP[157]; chain=A>, 8], [<Resi due VAL[158]; chain=A>, 7], [<Residue LYS[159]; chain=A>, 4], [<Residue THR[160]; chain=A>, 7], [<Residue GLY[161]; chain=A>, 4], [<Residue LYS[162];  chain=A>, 4], [<Residue CYS[163]; chain=A>, 6], [<Residue LEU[164]; chain=A>, 8], [<Residue LYS[165]; chain=A>, 9], [<Residue THR[166]; chain=A>, 7] , [<Residue LEU[167]; chain=A>, 8], [<Residue PRO[168]; chain=A>, 7], [<Residue ALA[169]; chain=A>, 5], [<Residue HIS[170]; chain=A>, 10], [<Residue  SER[171]; chain=A>, 6], [<Residue ASP[172]; chain=A>, 8], [<Residue PRO[173]; chain=A>, 7], [<Residue VAL[174]; chain=A>, 7], [<Residue SER[175]; cha in=A>, 6], [<Residue ALA[176]; chain=A>, 5], [<Residue VAL[177]; chain=A>, 7], [<Residue HIS[178]; chain=A>, 10], [<Residue PHE[179]; chain=A>, 11],  [<Residue ASN[180]; chain=A>, 8], [<Residue ARG[181]; chain=A>, 4], [<Residue ASP[182]; chain=A>, 8], [<Residue GLY[183]; chain=A>, 4], [<Residue SER [184]; chain=A>, 6], [<Residue LEU[185]; chain=A>, 8], [<Residue ILE[186]; chain=A>, 8], [<Residue VAL[187]; chain=A>, 7], [<Residue SER[188]; chain= A>, 6], [<Residue SER[189]; chain=A>, 6], [<Residue SER[190]; chain=A>, 6], [<Residue TYR[191]; chain=A>, 12], [<Residue ASP[192]; chain=A>, 8], [<Re sidue GLY[193]; chain=A>, 4], [<Residue LEU[194]; chain=A>, 8], [<Residue CYS[195]; chain=A>, 6], [<Residue ARG[196]; chain=A>, 11], [<Residue ILE[19 7]; chain=A>, 8], [<Residue TRP[198]; chain=A>, 14], [<Residue ASP[199]; chain=A>, 8], [<Residue THR[200]; chain=A>, 7], [<Residue ALA[201]; chain=A> , 5], [<Residue SER[202]; chain=A>, 6], [<Residue GLY[203]; chain=A>, 4], [<Residue GLN[204]; chain=A>, 4], [<Residue CYS[205]; chain=A>, 6], 
[<Resid ue LEU[206]; chain=A>, 8], [<Residue LYS[207]; chain=A>, 4], [<Residue THR[208]; chain=A>, 7], [<Residue LEU[209]; chain=A>, 8], [<Residue ILE[210];  chain=A>, 8], [<Residue ASP[211]; chain=A>, 4], [<Residue ASP[212]; chain=A>, 4], [<Residue ASP[213]; chain=A>, 4], [<Residue ASN[214]; chain=A>, 4],  [<Residue PRO[215]; chain=A>, 7], [<Residue PRO[216]; chain=A>, 7], [<Residue VAL[217]; chain=A>, 7], [<Residue SER[218]; chain=A>, 6], [<Residue PH E[219]; chain=A>, 11], [<Residue VAL[220]; chain=A>, 7], [<Residue LYS[221]; chain=A>, 9], [<Residue PHE[222]; chain=A>, 11], [<Residue SER[223]; cha in=A>, 6], [<Residue PRO[224]; chain=A>, 7], [<Residue ASN[225]; chain=A>, 8], [<Residue GLY[226]; chain=A>, 4], [<Residue LYS[227]; chain=A>, 4], [< Residue TYR[228]; chain=A>, 12], [<Residue ILE[229]; chain=A>, 8], [<Residue LEU[230]; chain=A>, 8], [<Residue ALA[231]; chain=A>, 5], [<Residue ALA[ 232]; chain=A>, 5], [<Residue THR[233]; chain=A>, 7], [<Residue LEU[234]; chain=A>, 8], [<Residue ASP[235]; chain=A>, 8], [<Residue ASN[236]; chain=A >, 8], [<Residue THR[237]; chain=A>, 7], [<Residue LEU[238]; chain=A>, 8], [<Residue LYS[239]; chain=A>, 4], [<Residue LEU[240]; chain=A>, 8], [<Resi due TRP[241]; chain=A>, 14], [<Residue ASP[242]; chain=A>, 8], [<Residue TYR[243]; chain=A>, 12], [<Residue SER[244]; chain=A>, 4], [<Residue SER[245 ]; chain=A>, 6], [<Residue GLY[246]; chain=A>, 4], [<Residue LYS[247]; chain=A>, 4], [<Residue CYS[248]; chain=A>, 6], [<Residue LEU[249]; chain=A>,  8], [<Residue LYS[250]; chain=A>, 9], [<Residue THR[251]; chain=A>, 7], [<Residue TYR[252]; chain=A>, 12], [<Residue THR[253]; chain=A>, 7], [<Residu e GLY[254]; chain=A>, 4], [<Residue HIS[255]; chain=A>, 10], [<Residue LYS[256]; chain=A>, 9], [<Residue ASN[257]; chain=A>, 8], [<Residue GLU[258];  chain=A>, 9], [<Residue LYS[259]; chain=A>, 4], [<Residue TYR[260]; chain=A>, 12], [<Residue CYS[261]; chain=A>, 6], [<Residue ILE[262]; chain=A>, 8] , [<Residue PHE[263]; chain=A>, 11], 
[<Residue ALA[264]; chain=A>, 5], [<Residue ASN[265]; chain=A>, 8], [<Residue PHE[266]; chain=A>, 11], [<Residue  SER[267]; chain=A>, 6], [<Residue VAL[268]; chain=A>, 7], [<Residue THR[269]; chain=A>, 7], [<Residue GLY[270]; chain=A>, 4], [<Residue GLY[271]; ch ain=A>, 4], [<Residue LYS[272]; chain=A>, 4], [<Residue TRP[273]; chain=A>, 14], [<Residue ILE[274]; chain=A>, 8], [<Residue VAL[275]; chain=A>, 7],  [<Residue SER[276]; chain=A>, 6], [<Residue GLY[277]; chain=A>, 4], [<Residue SER[278]; chain=A>, 6], [<Residue GLU[279]; chain=A>, 9], [<Residue ASP [280]; chain=A>, 8], [<Residue ASN[281]; chain=A>, 8], [<Residue LEU[282]; chain=A>, 8], [<Residue VAL[283]; chain=A>, 7], [<Residue TYR[284]; chain= A>, 12], [<Residue ILE[285]; chain=A>, 8], [<Residue TRP[286]; chain=A>, 14], [<Residue ASN[287]; chain=A>, 8], [<Residue LEU[288]; chain=A>, 8], [<R esidue GLN[289]; chain=A>, 4], [<Residue THR[290]; chain=A>, 7], [<Residue LYS[291]; chain=A>, 9], [<Residue GLU[292]; chain=A>, 9], [<Residue ILE[29 3]; chain=A>, 8], [<Residue VAL[294]; chain=A>, 7], [<Residue GLN[295]; chain=A>, 9], [<Residue LYS[296]; chain=A>, 9], [<Residue LEU[297]; chain=A>,  8], [<Residue GLN[298]; chain=A>, 4], [<Residue GLY[299]; chain=A>, 4], [<Residue HIS[300]; chain=A>, 10], [<Residue THR[301]; chain=A>, 7], [<Resid ue ASP[302]; chain=A>, 8], [<Residue VAL[303]; chain=A>, 7], [<Residue VAL[304]; chain=A>, 7], [<Residue ILE[305]; chain=A>, 8], [<Residue SER[306];  chain=A>, 6], [<Residue THR[307]; chain=A>, 7], [<Residue ALA[308]; chain=A>, 5], [<Residue CYS[309]; chain=A>, 6], [<Residue HIS[310]; chain=A>, 10] , [<Residue PRO[311]; chain=A>, 7], [<Residue THR[312]; chain=A>, 7], [<Residue GLU[313]; chain=A>, 9], [<Residue ASN[314]; chain=A>, 8], [<Residue I LE[315]; chain=A>, 8], [<Residue ILE[316]; chain=A>, 8], [<Residue ALA[317]; chain=A>, 5], [<Residue SER[318]; chain=A>, 6], [<Residue ALA[319]; chai n=A>, 5], [<Residue ALA[320]; chain=A>, 5], [<Residue LEU[321]; chain=A>, 8], 
[<Residue GLU[322]; chain=A>, 4], [<Residue ASN[323]; chain=A>, 8], [<R esidue ASP[324]; chain=A>, 8], [<Residue LYS[325]; chain=A>, 9], [<Residue THR[326]; chain=A>, 7], [<Residue ILE[327]; chain=A>, 8], [<Residue LYS[32 8]; chain=A>, 9], [<Residue LEU[329]; chain=A>, 8], [<Residue TRP[330]; chain=A>, 14], [<Residue LYS[331]; chain=A>, 4], [<Residue SER[332]; chain=A> , 6], [<Residue ASP[333]; chain=A>, 8], [<Residue CYS[334]; chain=A>, 6], [<Residue LYS[32]; chain=A; segid=A--->, 5], [<Residue ASN[34]; chain=A; se gid=A--->, 4], [<Residue LYS[38]; chain=A; segid=A--->, 5], [<Residue LYS[46]; chain=A; segid=A--->, 5], [<Residue LYS[78]; chain=A; segid=A--->, 5],  [<Residue LYS[87]; chain=A; segid=A--->, 5], [<Residue LYS[120]; chain=A; segid=A--->, 5], [<Residue LYS[159]; chain=A; segid=A--->, 5], [<Residue L YS[162]; chain=A; segid=A--->, 5], [<Residue ARG[181]; chain=A; segid=A--->, 7], [<Residue GLN[204]; chain=A; segid=A--->, 5], [<Residue LYS[207]; ch ain=A; segid=A--->, 5], [<Residue ASP[211]; chain=A; segid=A--->, 4], [<Residue ASP[212]; chain=A; segid=A--->, 4], [<Residue ASP[213]; chain=A; segi d=A--->, 4], [<Residue ASN[214]; chain=A; segid=A--->, 4], [<Residue LYS[227]; chain=A; segid=A--->, 5], [<Residue LYS[239]; chain=A; segid=A--->, 5] , [<Residue SER[244]; chain=A; segid=A--->, 2], [<Residue LYS[247]; chain=A; segid=A--->, 5], [<Residue LYS[259]; chain=A; segid=A--->, 5], [<Residue  LYS[272]; chain=A; segid=A--->, 5], [<Residue GLN[289]; chain=A; segid=A--->, 5], [<Residue GLN[298]; chain=A; segid=A--->, 5], [<Residue GLU[322];  chain=A; segid=A--->, 5], [<Residue LYS[331]; chain=A; segid=A--->, 5]
INFO:WaterKit receptor preparation:Removed all hydrogen atoms
INFO:WaterKit receptor preparation:Removed all water molecules
INFO:WaterKit receptor preparation:Removed all non-standard Amber residues: LIG, NMA
Traceback (most recent call last):
  File "/media/bak11/binaries/git/waterkit/scripts/wk_prepare_receptor.py", line 494, in _find_gaps
    n_atom = [atom for atom in next_residue.atoms if atom.name == 'N'][0]
             ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^^^
IndexError: list index out of range

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/media/bak11/binaries/git/waterkit/scripts/wk_prepare_receptor.py", line 997, in <module>
    main()
  File "/media/bak11/binaries/git/waterkit/scripts/wk_prepare_receptor.py", line 981, in main
    pr.prepare(pdb_filename, lib_files, frcmod_files, clean)
  File "/media/bak11/binaries/git/waterkit/scripts/wk_prepare_receptor.py", line 777, in prepare
    gaplist = _find_gaps(pdbfixer.parm, RESPROT, self._fill_gaps)
              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/media/bak11/binaries/git/waterkit/scripts/wk_prepare_receptor.py", line 496, in _find_gaps
    gaprecord = (9999.999, c_atom.residue.name, i + 1, n_atom.residue.name, i + 2)
                                                       ^^^^^^
UnboundLocalError: cannot access local variable 'n_atom' where it is not associated with a value

It is not clear to me why waterkit/ambertools failed to properly assign n_atom. Thanks for the help with this.
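
For reference, the last traceback is consistent with a common Python pattern: the lookup for an atom named 'N' raises IndexError before n_atom is ever bound, and the error-handling path then reads n_atom anyway. The sketch below reproduces only that pattern. The Atom class and both function names are hypothetical stand-ins, not WaterKit code.

```python
class Atom:
    """Hypothetical stand-in for an atom object with a name attribute."""
    def __init__(self, name):
        self.name = name


def nitrogen_name_buggy(atoms):
    """Mimics the failing pattern: the handler reads a never-bound name."""
    try:
        n_atom = [atom for atom in atoms if atom.name == 'N'][0]
    except IndexError:
        # The indexing raised before the assignment completed, so n_atom
        # is still unbound here and this line raises UnboundLocalError.
        return n_atom.name
    return n_atom.name


def nitrogen_name_safe(atoms):
    """Returns the name of the first 'N' atom, or None if there is none."""
    n_atom = next((atom for atom in atoms if atom.name == 'N'), None)
    return None if n_atom is None else n_atom.name
```

A residue without a backbone nitrogen (for example a ligand that directly follows the protein) takes the failing path in the first function, while the second function reports the missing atom as None.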

jeeberhardt commented 5 months ago

Thanks for the feedback!

For the first issue, it might be related to the hydrogen atoms. What happens if you add the --keep_hydrogen argument to the command line? (Also, I already see a stupid bug that I will need to fix once I have access to a computer.)

For the second issue, isn't it related to the space between pd and b in the input pdb filename?

For the third and last issue, if you add a TER record between the protein and the ligand that follows it, and between each subsequent ligand, you should not have this issue anymore. FYI, the script expects the input to (kind of) follow the PDB format specification, so the end of a chain (ATOM/HETATM) must be marked by a TER record. Otherwise it is a bit hard to know whether a gap observed between two residues is "normal" or not.
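
A minimal sketch of adding those TER records automatically, in case it is useful. It assumes that every HETATM residue is a separate ligand (so a modified residue stored as HETATM inside the protein chain would be split off incorrectly) and that the file uses fixed-width PDB columns. The function name and the file names in the usage note are hypothetical.

```python
def add_ter_records(pdb_lines):
    """Return a copy of pdb_lines with bare TER lines inserted between the
    protein and the first ligand, and between consecutive ligands.

    Lines are expected without trailing newlines.
    """
    out = []
    prev_record = None   # record type of the previous coordinate line
    prev_residue = None  # (resName, chainID, resSeq + iCode) of that line
    for line in pdb_lines:
        record = line[0:6].strip()
        if record in ("ATOM", "HETATM"):
            residue = (line[17:20], line[21:22], line[22:27])
            switched_type = prev_record is not None and record != prev_record
            new_ligand = (
                prev_record == "HETATM"
                and record == "HETATM"
                and residue != prev_residue
            )
            if switched_type or new_ligand:
                out.append("TER")
            prev_record, prev_residue = record, residue
        elif record in ("TER", "END", "ENDMDL"):
            # The chain is already terminated, so no extra TER is needed.
            prev_record = prev_residue = None
        out.append(line)
    return out
```

Usage would look something like reading input.pdb with `open("input.pdb").read().splitlines()`, passing the list to `add_ter_records`, and writing `"\n".join(result) + "\n"` to a new file before running wk_prepare_receptor.py on it.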

blakemertz commented 5 months ago

@jeeberhardt thanks for the quick response -- apologies for not replying sooner.

I needed to start digging through my pdb files to get them to process correctly in waterkit. Thank you for catching the improper file extension on that pdb in my 2nd issue.

The first issue I'm not sure will get resolved anytime soon -- there is something a little funky in how Maestro is exporting files into pdb format that ambertools doesn't handle properly, so for now I will just stick with the PyMOL workaround. I was able to resolve the 3rd issue with my pdb that was exported in PyMOL. Maestro was inserting "A---" in columns 73-76 for a noticeable fraction of my atom entries for some reason. To get around that, I had to do the following:

  1. Export to pdb in Maestro
  2. Remove the "A---" from all ATOM entries
  3. Load into PyMOL and export to pdb.
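
Step 2 can be scripted. This is a minimal sketch, assuming the unwanted text is exactly "A---" and sits in columns 73-76 (the segment identifier field) of fixed-width ATOM/HETATM lines. The function name and file names are hypothetical.

```python
def blank_segid(line, marker="A---"):
    """Replace the marker in columns 73-76 of an ATOM/HETATM line with spaces.

    Any other line, or a coordinate line without the marker, is returned
    unchanged. Columns after 76 (element, charge) are preserved.
    """
    if line.startswith(("ATOM", "HETATM")) and line[72:76] == marker:
        return line[:72] + " " * 4 + line[76:]
    return line


def clean_pdb(in_path, out_path):
    """Write a copy of in_path with the marker blanked on every atom line."""
    with open(in_path) as src, open(out_path, "w") as dst:
        for line in src:
            dst.write(blank_segid(line))
```

Calling `clean_pdb("maestro_export.pdb", "maestro_export_clean.pdb")` would then produce the file to load into PyMOL in step 3.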

I'll mark this closed and pick it up again if I get some fresh ideas on why Maestro is introducing that funk in the pdb file.