Closed eightmm closed 10 months ago
Hi! The explicit Hs in the REMARK SMILES line are unusual; my first guess is that the error is related to that. Could you post the input ligand PDBQT as text so I can take a look tomorrow?
@diogomart
REMARK SMILES [H]c1c(N([H])[H])c([H])c2nc3c([H])c(N([H])[H])c([H])c([H])c3c([H])c2c1[H]
REMARK SMILES IDX 26 1 2 2 3 3 7 4 12 5 14 6 18 7 20 8 23 9 10 10 9 11 22 12
REMARK SMILES IDX 25 13 11 14 4 15 5 16 6 17 15 18 16 19 17 20
REMARK H PARENT
REMARK Flexibility Score: inf
ROOT
ATOM 1 C UNL 1 8.982 23.182 49.516 1.00 0.00 0.012 A
ATOM 2 C UNL 1 9.670 23.649 48.475 1.00 0.00 0.026 A
ATOM 3 C UNL 1 10.665 22.851 47.794 1.00 0.00 0.034 A
ATOM 4 C UNL 1 10.902 21.573 48.264 1.00 0.00 0.054 A
ATOM 5 C UNL 1 9.939 17.864 51.143 1.00 0.00 0.054 A
ATOM 6 C UNL 1 9.231 17.253 52.163 1.00 0.00 0.034 A
ATOM 7 C UNL 1 8.222 18.030 52.852 1.00 0.00 0.026 A
ATOM 8 C UNL 1 7.968 19.299 52.528 1.00 0.00 0.012 A
ATOM 9 C UNL 1 8.475 21.272 51.058 1.00 0.00 0.020 A
ATOM 10 N UNL 1 10.385 19.776 49.768 1.00 0.00 -0.248 NA
ATOM 11 C UNL 1 10.196 21.070 49.336 1.00 0.00 0.073 A
ATOM 12 C UNL 1 8.690 19.965 51.464 1.00 0.00 0.001 A
ATOM 13 C UNL 1 9.190 21.846 50.020 1.00 0.00 0.001 A
ATOM 14 C UNL 1 9.691 19.173 50.793 1.00 0.00 0.073 A
ENDROOT
BRANCH 3 15
ATOM 15 N UNL 1 11.274 23.326 46.684 1.00 0.00 -0.399 N
ATOM 16 H UNL 1 11.077 24.283 46.358 1.00 0.00 0.156 HD
ATOM 17 H UNL 1 11.935 22.731 46.164 1.00 0.00 0.156 HD
ENDBRANCH 3 15
BRANCH 6 18
ATOM 18 N UNL 1 9.453 15.962 52.502 1.00 0.00 -0.399 N
ATOM 19 H UNL 1 10.173 15.416 52.007 1.00 0.00 0.156 HD
ATOM 20 H UNL 1 8.902 15.525 53.255 1.00 0.00 0.156 HD
ENDBRANCH 6 18
TORSDOF 2
This is the input PDBQT file.
It was generated from this SDF file with "mk_prepare_ligand.py".
Thanks!
Hi again, could you post the SDF file as a block of code? It needs to be formatted as code so the spaces are preserved. To format text as code, use triple backticks before and after the code block. https://docs.github.com/en/get-started/writing-on-github/getting-started-with-writing-and-formatting-on-github/basic-writing-and-formatting-syntax#quoting-code
1bcu_ligand
Created by X-TOOL on Fri Sep 26 17:34:14 2014
27 29 0 0 0 0 0 0 0 0999 V2000
8.9820 23.1820 49.5160 C 0 0 0 2 0 3
9.6700 23.6490 48.4750 C 0 0 0 2 0 3
10.6650 22.8510 47.7940 C 0 0 0 1 0 3
10.9020 21.5730 48.2640 C 0 0 0 2 0 3
9.9390 17.8640 51.1430 C 0 0 0 2 0 3
9.2310 17.2530 52.1630 C 0 0 0 1 0 3
8.2220 18.0300 52.8520 C 0 0 0 2 0 3
7.9680 19.2990 52.5280 C 0 0 0 2 0 3
8.4750 21.2720 51.0580 C 0 0 0 2 0 3
10.3850 19.7760 49.7680 N 0 0 0 1 0 2
10.1960 21.0700 49.3360 C 0 0 0 1 0 3
8.6900 19.9650 51.4640 C 0 0 0 1 0 3
9.1900 21.8460 50.0200 C 0 0 0 1 0 3
9.6910 19.1730 50.7930 C 0 0 0 1 0 3
11.2740 23.3260 46.6840 N 0 0 0 3 0 3
9.4530 15.9620 52.5020 N 0 0 0 3 0 3
8.2489 23.8205 49.9955 H 0 0 0 1 0 1
9.4780 24.6579 48.1280 H 0 0 0 1 0 1
11.6536 20.9575 47.7829 H 0 0 0 1 0 1
10.7001 17.3042 50.6115 H 0 0 0 1 0 1
7.6572 17.5677 53.6535 H 0 0 0 1 0 1
7.2050 19.8431 53.0728 H 0 0 0 1 0 1
7.7239 21.8637 51.5686 H 0 0 0 1 0 1
11.0771 24.2830 46.3581 H 0 0 0 1 0 1
11.9348 22.7308 46.1644 H 0 0 0 1 0 1
10.1726 15.4160 52.0071 H 0 0 0 1 0 1
8.9024 15.5253 53.2550 H 0 0 0 1 0 1
1 13 4 0 0 1
1 2 4 0 0 1
2 3 4 0 0 1
3 15 1 0 0 2
3 4 4 0 0 1
4 11 4 0 0 1
11 10 4 0 0 1
11 13 4 0 0 1
13 9 4 0 0 1
9 12 4 0 0 1
12 8 4 0 0 1
12 14 4 0 0 1
14 5 4 0 0 1
14 10 4 0 0 1
5 6 4 0 0 1
6 16 1 0 0 2
6 7 4 0 0 1
7 8 4 0 0 1
1 17 1 0 0 2
2 18 1 0 0 2
4 19 1 0 0 2
5 20 1 0 0 2
7 21 1 0 0 2
8 22 1 0 0 2
9 23 1 0 0 2
15 24 1 0 0 2
15 25 1 0 0 2
16 26 1 0 0 2
16 27 1 0 0 2
M END
> <MOLECULAR_FORMULA>
C13H11N3
> <MOLECULAR_WEIGHT>
209.2
> <NUM_HB_ATOMS>
3
> <NUM_ROTOR>
0
> <XLOGP2>
1.99
$$$$
Here!
It looks like the hydrogen count field of the MOL block is used (page 13 in the specification), and RDKit does not remove hydrogens. I don't know if this is the expected RDKit behavior, but meeko should be checking that those hydrogens were removed, so I think it needs fixing on our end.
So, it turns out that the input SDF has the HCount field set for the hydrogens, for example for the last atom:
8.9024 15.5253 53.2550 H 0 0 0 1 0 1
                               ^
                               |
                               hcount
According to the specification, hcount = 1 sets the number of implicit Hs to zero. But because the HCount field is specified (i.e., has a non-zero value), RDKit adds a query and does not remove the atom by default.
Based on the header, it looks like the SD file was written by X-TOOL. Do you know why the hcount fields were set?
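For anyone who wants to check their own SD files for this, here is a minimal sketch (not part of meeko or RDKit) that lists the hydrogens carrying a non-zero hcount field. It assumes a V2000 MOL block whose fields are separated by whitespace, as in the file posted above; a strict parser would read the fixed-width columns instead. The function name `hydrogens_with_hcount` and the two-atom example block are hypothetical.

```python
def hydrogens_with_hcount(mol_block: str) -> list[int]:
    """Return 1-based indices of H atoms whose hcount field is non-zero.

    Assumption: fields are whitespace-separated, so the atom line splits into
    x, y, z, symbol, mass difference, charge, stereo, hcount, ...
    """
    lines = mol_block.splitlines()
    # Locate the counts line by its V2000 tag, so a missing header line does not matter.
    counts_idx = next(i for i, line in enumerate(lines) if line.rstrip().endswith("V2000"))
    n_atoms = int(lines[counts_idx].split()[0])

    flagged = []
    atom_lines = lines[counts_idx + 1 : counts_idx + 1 + n_atoms]
    for atom_number, line in enumerate(atom_lines, start=1):
        fields = line.split()
        symbol = fields[3]
        hcount = int(fields[7])  # 0 means "not specified"; 1 means "no implicit Hs"
        if symbol == "H" and hcount != 0:
            flagged.append(atom_number)
    return flagged


# Hypothetical two-atom block, only to exercise the function.
example = """\
demo
  hypothetical

  2  1  0  0  0  0  0  0  0  0999 V2000
    0.0000    0.0000    0.0000 N   0  0  0  3  0  3
    1.0000    0.0000    0.0000 H   0  0  0  1  0  1
  1  2  1  0
M  END
"""

print(hydrogens_with_hcount(example))  # → [2]
```

Any index it reports is a hydrogen that RDKit would treat as a query atom and keep when removing Hs with default settings.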
Sorry, I don't know. I just tested it with the files in the PDBbind dataset.
Thanks! The easy fix on meeko's end is to tell RDKit to remove Hs even if they have hcount queries, but I was concerned that those queries were added on purpose. Glad to hear that they weren't :-)
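As a rough illustration of what "remove Hs even if they have queries" can look like with RDKit's public API, here is a sketch using `Chem.RemoveHsParameters` and its `removeWithQuery` option. This is an assumption about one way to do it, not necessarily what the meeko fix does; the helper name `remove_hs_including_queries` is hypothetical.

```python
from rdkit import Chem


def remove_hs_including_queries(mol):
    """Remove explicit hydrogens, including those that carry query features.

    By default RDKit keeps hydrogens that have a query attached (such as one
    created from a non-zero hcount field); removeWithQuery lifts that restriction.
    """
    params = Chem.RemoveHsParameters()
    params.removeWithQuery = True
    return Chem.RemoveHs(mol, params)
```

A molecule read with `removeHs=False` can be passed through this helper afterwards, and comparing the atom count before and after shows whether any hydrogens were still held back.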
Fixed in ba15b7dc41684405eeebd4b66d6e4a655a9178f2. Thanks for reporting this!
Hi! I tried to convert a DLG file (the AutoDock-GPU result file) to SDF using mk_export.py, and the following error was output.
Looking at the DLG and PDBQT files, I found that there are 4 hydrogen atoms.
Is this problem caused by RDKit not recognizing the hydrogens properly, or is there another reason?
Thanks!