Closed michaelwitting closed 2 years ago
PS:
running python -m isicle.scripts.generate_adduct /home/michael/lipidFiles/output/adducts/geometry_parent/IJFVSSZAOYLHEE-SSEXGKCCSA-N.mol2 /home/michael/lipidFiles/output/adducts/pKa/IJFVSSZAOYLHEE-SSEXGKCCSA-N.pka [+Na] /home/michael/lipidFiles/output/adducts/geometry_+Na/IJFVSSZAOYLHEE-SSEXGKCCSA-N_+Na.mol2 /home/michael/lipidFiles/output/adducts/geometry_+Na/IJFVSSZAOYLHEE-SSEXGKCCSA-N_+Na.xyz /home/michael/lipidFiles/output/adducts/geometry_+Na/IJFVSSZAOYLHEE-SSEXGKCCSA-N_+Na.pdb /home/michael/lipidFiles/output/adducts/geometry_+Na/IJFVSSZAOYLHEE-SSEXGKCCSA-N_+Na.charge --forcefield gaff --steps 500 &> /home/michael/lipidFiles/output/adducts/geometry_+Na/logs/IJFVSSZAOYLHEE-SSEXGKCCSA-N_+Na.log
directly in the command line returns no error!
PPS: I'm using Ubuntu 18.04.
I tested the provided input on my machine and experienced the same issue. The command returns no error when running manually because all outputs are being piped to the log file (Snakemake is aware of the error despite outputs being piped to the log). Looking there first, we see:
File "/Users/colb804/anaconda3/envs/isicle/lib/python3.6/runpy.py", line 193, in _run_module_as_main
"__main__", mod_spec)
File "/Users/colb804/anaconda3/envs/isicle/lib/python3.6/runpy.py", line 85, in _run_code
exec(code, run_globals)
File "/Users/colb804/workspace/isicle/isicle/scripts/generate_adduct.py", line 54, in <module>
pka = read_pka(args.pkafile)
File "/Users/colb804/workspace/isicle/isicle/utils.py", line 42, in read_pka
idx = [int(x) - 1 for x in df['atoms'].values[0].split(',')]
AttributeError: 'numpy.float64' object has no attribute 'split'
This suggests that the pKa output generated by cxcalc (ChemAxon) is unexpected, and as such the (not-very-robust) parser fails. Looking at the pKa file, we see it is essentially empty (just a skeleton):
$ cat output/adducts/pKa/test.pka
id apKa1 apKa2 bpKa1 bpKa2 atoms
1
To check what might have gone wrong with the pKa calculation, we can take a look at the corresponding log file:
$ cat output/adducts/pKa/logs/test.log
[H][#8][P@](=O)([#8][C@]([H])([H])[C@]([H])([H])[N@@](C([H])([H])[H])(C([H])([H])[H])C([H])([H])[H])[#8][C@@]([H])([H])[C@]([H])([#8]-[#6](=O)[C@]([H])([H])[C@]([H])([H])[C@]([H])([H])[C@]([H])([H])[C@]([H])([H])[C@]([H])([H])[C@]([H])([H])[C@]([H])([H])[C@]([H])([H])[C@]([H])([H])C([H])([H])[H])[C@@]([H])([H])[#8]-[#6](=O)[C@@]([H])([H])[C@@]([H])([H])[C@@]([H])([H])[C@@]([H])([H])[C@@]([H])([H])[C@@]([H])([H])[C@@]([H])([H])[C@@]([H])([H])[C@@]([H])([H])[C@@]([H])([H])C([H])([H])[H]
1 "pka: Inconsistent molecular structure."
Somewhat concerning, this error output looks nothing like your input SMILES, even when considering the various preprocessing steps taken by isicle (SMILES canonicalization, desalting, neutralization, tautomerization). Looking at the outputs of each of these steps, the structure is reasonably preserved. But note that pKa is calculated from a .mol2 geometry -- this is where I think things broke down. If we use cxcalc to generate pKa on the tautomerized SMILES, we get the following:
$ cxcalc pka -i -40 -x 40 -d large output/adducts/tautomer/test.smi
id apKa1 apKa2 bpKa1 bpKa2 atoms
1 1.86 18.84 -6.74 -7.35 20,28,13,31
Running on the .mol2 output (from rule generateGeometry), we get the failed pKa calculation. The problem is, there is no guarantee the atom indices calculated from the tautomerized SMILES will match those of the 3D .mol2 geometry used in later steps, which is why we calculate pKa from the .mol2 file. You could try replacing the content of output/adducts/pKa/test.pka
with the above, but again, no guarantee the indices will match those of the .mol2 that carries on (manually verification required).
That aside, I greatly appreciate you taking the time to file this issue. We are now aware of this edge case, and will work to ameliorate the issue on multiple fronts:
Thank you very much. I checked with another compound (as PE lipid this time) and that worked. It looks like zwitterionic substances cause some problems.
Dear ISICLE-Team,
I installed ISICLE and tried to run it. However, it is only working for neutral molecules. I always get the following error message:
I attached you my input file as well.
Best regards,
Michael
test.txt