IUPAC-InChI / InChI

Main InChI repository
https://iupac-inchi.github.io/InChI-Web-Demo/
MIT License

Chiral spiro compounds are labelled achiral, InChI generated is incorrect #40

Open supersciencegrl opened 3 months ago

supersciencegrl commented 3 months ago

I am working from the InChI Web Demo (https://iupac-inchi.github.io/InChI-Web-Demo/). I originally found this issue when creating InChI using rdkit in Python, and the same occurs when I generate an InChI from a structure in ChemDraw.

C2-symmetrical spiro compounds which contain a single stereogenic spiro atom seem to (sometimes or always?) be labelled as achiral, and the InChI generated does not contain point stereochemistry.

Here is an example. The first molecule is (R)-SDP, and the second is (S)-SDP. They have different structures, and the point stereochemistry is represented accurately by SMILES. However, when I try to generate an InChI in any of the three systems above, the stereochemistry is lost.

C12=CC=CC(P(C3=CC=CC=C3)C4=CC=CC=C4)=C1[C@@]5(CC2)CCC6=C5C(P(C7=CC=CC=C7)C8=CC=CC=C8)=CC=C6

C12=CC=CC(P(C3=CC=CC=C3)C4=CC=CC=C4)=C1[C@]5(CC2)CCC6=C5C(P(C7=CC=CC=C7)C8=CC=CC=C8)=CC=C6
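As a quick sanity check on the two SMILES strings above, the following minimal sketch (standard library only, no cheminformatics toolkit) shows that they differ solely in the chirality tag on the spiro carbon. The helper names `chirality_tags` and `strip_chirality_tags` are hypothetical, introduced here for illustration; a plain regular expression is a simplification and is not a real SMILES parser.

```python
import re

# The two SMILES strings from the report, in the order given there.
R_SDP = ("C12=CC=CC(P(C3=CC=CC=C3)C4=CC=CC=C4)=C1[C@@]5(CC2)CCC6=C5C"
         "(P(C7=CC=CC=C7)C8=CC=CC=C8)=CC=C6")
S_SDP = ("C12=CC=CC(P(C3=CC=CC=C3)C4=CC=CC=C4)=C1[C@]5(CC2)CCC6=C5C"
         "(P(C7=CC=CC=C7)C8=CC=CC=C8)=CC=C6")


def chirality_tags(smiles: str) -> list[str]:
    """Return every '@' or '@@' tetrahedral chirality tag found in the string."""
    return re.findall(r"@+", smiles)


def strip_chirality_tags(smiles: str) -> str:
    """Remove all '@' / '@@' tags, leaving only the connectivity text."""
    return re.sub(r"@+", "", smiles)


print(chirality_tags(R_SDP), chirality_tags(S_SDP))
# Same connectivity once the tags are removed:
print(strip_chirality_tags(R_SDP) == strip_chirality_tags(S_SDP))
```

Each string carries exactly one tag, so the whole difference between the two inputs sits on the single spiro atom.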

The log on the InChI Web Demo states: "InChI options: Warning (Not chiral)"

However, InChI is generated correctly for the topologically-similar compound, (R)-ShiP.

InChI=1S/C23H19O3P/c1-2-8-18(9-3-1)24-27-25-19-10-4-6-16-12-14-23(21(16)19)15-13-17-7-5-11-20(26-27)22(17)23/h1-11H,12-15H2/t23-/m0/s1
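Whether a generated InChI retained point stereochemistry can be checked by looking for the `/t` (tetrahedral stereo) layer, which is present in the (R)-ShiP string above. The sketch below splits a standard InChI into its layers using only the Python standard library. The function name `inchi_layers` is hypothetical, and the approach assumes each prefix letter occurs at most once, which may not hold for non-standard InChIs with repeated layers.

```python
def inchi_layers(inchi: str) -> dict[str, str]:
    """Split an InChI string into layers keyed by their prefix letter.

    The version and formula segments carry no prefix letter, so they are
    stored under the keys 'version' and 'formula'.
    """
    prefix = "InChI="
    if not inchi.startswith(prefix):
        raise ValueError(f"not an InChI string: {inchi!r}")
    parts = inchi[len(prefix):].split("/")
    layers = {"version": parts[0],
              "formula": parts[1] if len(parts) > 1 else ""}
    for part in parts[2:]:
        if part:  # skip empty segments
            layers[part[0]] = part[1:]
    return layers


SHIP = ("InChI=1S/C23H19O3P/c1-2-8-18(9-3-1)24-27-25-19-10-4-6-16-12-14-23"
        "(21(16)19)15-13-17-7-5-11-20(26-27)22(17)23/h1-11H,12-15H2/t23-/m0/s1")

layers = inchi_layers(SHIP)
print("t" in layers, layers.get("t"))  # True 23-
```

An InChI for one of the SDP enantiomers that lacks the `t`, `m` and `s` keys in this dictionary would show the loss of stereochemistry described in this issue.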

fbaensch-beilstein commented 3 months ago

Dear @supersciencegrl,

We will take a closer look at this soon. Could you please provide the molfiles for (R)-SDP and (S)-SDP that you used?

gblanke02 commented 3 months ago

Dear Nessa,

The molfiles are useful for us because they let us rule out any issues related to the calculation of coordinates when converting the SMILES representation to a molfile.

My first experiment with my version of the molfile, generated from the SMILES string, showed that it works as soon as one of the PPh2 groups is slightly modified so that the molecule becomes asymmetric. But obviously, we have a problem with the high symmetry of the two parts around the spiro center. By the way, your structure comes at just the right time, because we have to work on further enhancements of stereochemistry. Obviously we have to add larger spiro compounds to our considerations.

Best wishes

Gerd

supersciencegrl commented 3 months ago

Hi team,

Thanks for looking at it! I appreciate all your work. I'll try and come along (online) to the InChI discussion on Saturday.

Here's the 2D molfile I tried (this one was created in ChemDraw; I saw the same behaviour when just using a SMILES string). Both this and the enantiomer behave in exactly the same way using rdkit in Python.


untitled.mol
ChemDraw08122419562D

43 50 0 0 1 0 0 0 0 0999 V2000
-0.7846 -1.0799 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0
-1.4991 -1.4924 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0
-2.2136 -1.0799 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0
-2.2136 -0.2549 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0
-1.4991 0.1576 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0
-1.7813 0.9328 0.0000 P 0 0 0 0 0 0 0 0 0 0 0 0
-1.2510 1.5648 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0
-1.5331 2.3400 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0
-1.0028 2.9720 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0
-0.1904 2.8288 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0
0.0918 2.0535 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0
-0.4385 1.4215 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0
-2.5937 1.0761 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0
-3.1240 0.4441 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0
-3.9365 0.5873 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0
-4.2187 1.3626 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0
-3.6884 1.9946 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0
-2.8759 1.8513 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0
-0.7846 -0.2549 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0
0.0000 0.0000 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0
-0.4849 0.6674 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0
0.0000 1.3349 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0
0.7846 1.0799 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0
0.7846 0.2549 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0
1.4991 -0.1576 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0
1.7813 -0.9328 0.0000 P 0 0 0 0 0 0 0 0 0 0 0 0
2.5937 -1.0761 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0
2.8759 -1.8513 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0
3.6884 -1.9946 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0
4.2187 -1.3626 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0
3.9365 -0.5873 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0
3.1240 -0.4441 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0
1.2510 -1.5648 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0
0.4385 -1.4215 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0
-0.0918 -2.0535 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0
0.1904 -2.8288 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0
1.0028 -2.9720 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0
1.5331 -2.3400 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0
2.2136 0.2549 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0
2.2136 1.0799 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0
1.4991 1.4924 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0
0.4849 -0.6674 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0
0.0000 -1.3349 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0
1 2 2 0 0
2 3 1 0 0
3 4 2 0 0
4 5 1 0 0
5 6 1 0 0
6 7 1 0 0
7 8 2 0 0
8 9 1 0 0
9 10 2 0 0
10 11 1 0 0
11 12 2 0 0
7 12 1 0 0
6 13 1 0 0
13 14 2 0 0
14 15 1 0 0
15 16 2 0 0
16 17 1 0 0
17 18 2 0 0
13 18 1 0 0
5 19 2 0 0
1 19 1 0 0
20 19 1 1 0
20 21 1 0 0
21 22 1 0 0
22 23 1 0 0
23 24 2 0 0
20 24 1 0 0
24 25 1 0 0
25 26 1 0 0
26 27 1 0 0
27 28 2 0 0
28 29 1 0 0
29 30 2 0 0
30 31 1 0 0
31 32 2 0 0
27 32 1 0 0
26 33 1 0 0
33 34 2 0 0
34 35 1 0 0
35 36 2 0 0
36 37 1 0 0
37 38 2 0 0
33 38 1 0 0
25 39 2 0 0
39 40 1 0 0
40 41 2 0 0
23 41 1 0 0
20 42 1 6 0
42 43 1 0 0
1 43 1 0 0
M END

gblanke02 commented 3 months ago

Hi Nessa,

Thanks for the molfile

We like this compound. Neither Biovia/Draw nor Marvin seems to be able to recognize the stereochemistry of this compound. It is a “wonderful” test case. Does ChemDraw recognize it on your side?

Best wishes Gerd

supersciencegrl commented 3 months ago

Interesting! ChemDraw can both recognize that the above molfile corresponds to a chiral compound and assign the correct (R)-descriptor. However (off topic, but), there are similar compounds that ChemDraw recognizes as chiral but for some reason cannot assign a descriptor to. (R)-ShiP is one:

InChI=1S/C23H19O3P/c1-2-8-18(9-3-1)24-27-25-19-10-4-6-16-12-14-23(21(16)19)15-13-17-7-5-11-20(26-27)22(17)23/h1-11H,12-15H2/t23-/m1/s1

nbehrnd commented 3 months ago

@supersciencegrl Simply copy-pasting the .sdf/.mol file as plain text scrambles the format of the file. This adds an obstacle for processing down the road, e.g.

$ obabel untitled.mol -osdf
==============================
*** Open Babel Warning  in ReadMolecule
  WARNING: Problems reading a MDL file
1 2 2 0 0
Invalid bond specification, atom numbers or bond order are wrong;
each should be in a field of three characters.

0 molecules converted

because it anticipates a format of 7(I3) for the connectivity table, i.e. seven integer fields of three characters each. One solution is to enclose the pasted copy in a code block, with a leading and a trailing line of three backticks (grave accents). Alternatively (e.g., for long log files from MOPAC, Gaussian, etc.), either attach the file after adding a .txt file extension, or join multiple files into a .zip archive; GitHub accepts both via paste, drop, or attach (caveat: only from a session in the web browser, not in replies via email).

Example: water


 OpenBabel08132410443D

  3  2  0  0  0  0  0  0  0  0999 V2000
    0.9444    0.0690   -0.0831 O   0  0  0  0  0  0  0  0  0  0  0  0
    1.9123    0.0601   -0.0375 H   0  0  0  0  0  0  0  0  0  0  0  0
    0.6655   -0.1094    0.8276 H   0  0  0  0  0  0  0  0  0  0  0  0
  1  2  1  0  0  0  0
  1  3  1  0  0  0  0
M  END
$$$$

Edit: plus the fenced code block has the paperclip in the top right corner.
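To illustrate why the collapsed whitespace matters, here is a minimal Python sketch of fixed-width reading for a V2000 bond line, taking the first three 3-character fields (atom 1, atom 2, bond order). The function name `parse_bond_line` is hypothetical, and this is a simplification for illustration rather than Open Babel's actual parser.

```python
def parse_bond_line(line: str) -> tuple[int, int, int]:
    """Read atom1, atom2 and bond order from columns 1-3, 4-6 and 7-9."""
    fields = [line[start:start + 3] for start in (0, 3, 6)]
    try:
        atom1, atom2, order = (int(field) for field in fields)
    except ValueError as exc:
        # A 3-character slice such as '1 2' is not a valid integer.
        raise ValueError(f"not a fixed-width bond line: {line!r}") from exc
    return atom1, atom2, order


# Bond line from the water example, with its padding intact.
print(parse_bond_line("  1  2  1  0  0  0  0"))  # (1, 2, 1)

# Bond line as it arrived after plain-text pasting.
try:
    parse_bond_line("1 2 2 0 0")
except ValueError as err:
    print(err)
```

In the collapsed line the first three columns read `1 2`, which is not a single integer, so a column-based reader has to reject it.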

gblanke02 commented 3 months ago

Unfortunately, all the leading blanks in each line have been deleted, so the molfile is corrupted. Perhaps we could get this structure as an attached molfile?

Best wishes Gerd
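If only the whitespace was lost, atom and bond records can in principle be re-padded to the V2000 column layout. The following is a rough sketch under that assumption, using the field widths visible in the intact water example in this thread; the function names `pad_atom_record` and `pad_bond_record` are hypothetical. It does not handle the counts line, where fields may have merged (e.g. `0999`), and the output should be verified before use, so an attached original file remains the safer route.

```python
def pad_atom_record(record: str) -> str:
    """Rebuild a V2000 atom line from whitespace-separated tokens.

    Layout: x, y, z as 10.4 floats, a blank, a left-justified 3-character
    symbol, one 2-character integer, then 3-character integers.
    """
    tokens = record.split()
    x, y, z = (float(value) for value in tokens[:3])
    symbol = tokens[3]
    ints = [int(value) for value in tokens[4:]]
    line = f"{x:10.4f}{y:10.4f}{z:10.4f} {symbol:<3}{ints[0]:2d}"
    return line + "".join(f"{value:3d}" for value in ints[1:])


def pad_bond_record(record: str) -> str:
    """Rebuild a V2000 bond line: every field is a 3-character integer."""
    return "".join(f"{int(value):3d}" for value in record.split())


# Tokens of the water example, as they would look after losing their padding.
print(repr(pad_atom_record("0.9444 0.0690 -0.0831 O 0 0 0 0 0 0 0 0 0 0 0 0")))
print(repr(pad_bond_record("1 2 1 0 0 0 0")))
```

Applied to the stripped tokens of the water example, both helpers reproduce the original fixed-width lines.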

fbaensch-beilstein commented 3 months ago

Since GitHub does not accept mol files as attachments, just change the extension to .txt. That would be very helpful.

supersciencegrl commented 3 months ago

That makes sense! Molfile attached in .txt format: (R)-SDP.txt

JanCBrammer commented 2 months ago

(Partly) replicated with https://github.com/IUPAC-InChI/InChI/blob/290f5478c0867403dd0d79402892773efee66ce6/INCHI-1-TEST/tests/test_executable/test_github_40.py.