cdk / depict

SMILES Depiction Generator
GNU Lesser General Public License v2.1

Bug in depicting CXSMILES with Stereochemistry #66

Closed Kohulan closed 1 year ago

Kohulan commented 1 year ago

When a CXSMILES with coordinates is used to depict an image, the stereochemistry is ignored. However, if I generate an absolute SMILES back from the parsed SMILES, I can see that the stereochemistry is still preserved.

C[C@@H](C1=CC=CC=C1)NC(=O)[C@H](C2=CC=CC=C2)NC(=S)NC3=CC=CC=C3 |(-0.5,2.0,;-2.0,1.9,;-2.7,0.6,;-4.2,0.6,;-4.9,-0.8,;-4.1,-2.0,;-2.6,-2.0,;-1.9,-0.7,;-2.8,3.2,;-2.1,4.5,;-0.6,4.6,;-2.9,5.8,;-4.4,5.8,;-5.2,7.0,;-6.7,7.0,;-7.4,5.7,;-6.6,4.4,;-5.1,4.4,;-2.2,7.1,;-3.0,8.4,;-4.5,8.4,;-2.3,9.7,;-3.1,11.0,;-4.6,10.9,;-5.4,12.2,;-4.7,13.5,;-3.2,13.6,;-2.4,12.3,)|
image

nbehrnd commented 1 year ago

Suggestion: change the setting of the second pull-down menu from No Annotation to CIP Stereo Label. This annotates the two centres as S-configured:

with_labels

(Because an export to .svg is one option, the string below the structure formula can be removed later.)

Kohulan commented 1 year ago

Thank you, but there does not appear to be a problem with annotation, and CIP annotation does not seem to be relevant in this case. The CXSMILES depiction should be displayed like this:

image

johnmay commented 1 year ago

The issue is that if you give it coordinates, it also expects the non-planar bonds (up/down wedges) to be annotated. See here: https://docs.chemaxon.com/display/docs/chemaxon-extended-smiles-and-smarts-cxsmiles-and-cxsmarts.md#src-1806633-safe-id-q2hlbuf4b25fehrlbmrlzfnnsuxfu2fuzfnnqvjuuy1dwfnnsuxfu2fuzenyu01bulrtlvnpbmdszsjvcg9yrg93biiov2lnz2x5ksxvugfuzerpv05ib25kcw
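
To make the situation concrete, here is a minimal, string-level sketch that checks whether a CXSMILES carries a coordinate layer and stereo marks in the SMILES but no `wU:`/`wD:` wedge layer. It uses only the Python standard library and does not call CDK or RDKit. The function name `inspect_cxsmiles` and the returned keys are made up for illustration, and the parsing is deliberately naive (it assumes the first parenthesised block in the extension is the coordinate layer).

```python
import re

# CXSMILES from the first comment of this thread (coordinates, no wedge layer).
ISSUE_CXSMILES = (
    "C[C@@H](C1=CC=CC=C1)NC(=O)[C@H](C2=CC=CC=C2)NC(=S)NC3=CC=CC=C3 "
    "|(-0.5,2.0,;-2.0,1.9,;-2.7,0.6,;-4.2,0.6,;-4.9,-0.8,;-4.1,-2.0,;"
    "-2.6,-2.0,;-1.9,-0.7,;-2.8,3.2,;-2.1,4.5,;-0.6,4.6,;-2.9,5.8,;"
    "-4.4,5.8,;-5.2,7.0,;-6.7,7.0,;-7.4,5.7,;-6.6,4.4,;-5.1,4.4,;"
    "-2.2,7.1,;-3.0,8.4,;-4.5,8.4,;-2.3,9.7,;-3.1,11.0,;-4.6,10.9,;"
    "-5.4,12.2,;-4.7,13.5,;-3.2,13.6,;-2.4,12.3,)|"
)


def inspect_cxsmiles(cxsmiles):
    """Hypothetical helper: report which parts of a CXSMILES are present."""
    # The SMILES itself contains no spaces; the |...| extension follows a space.
    smiles, _, ext = cxsmiles.strip().partition(" ")
    ext = ext.strip()
    is_block = len(ext) >= 2 and ext[0] == "|" and ext[-1] == "|"
    layers = ext[1:-1] if is_block else ""

    coords = []
    # Assumption: the first (...) block is the coordinate layer, "x,y,z;" per atom.
    match = re.search(r"\(([^)]*)\)", layers)
    if match:
        for entry in match.group(1).split(";"):
            if not entry:
                continue
            x, y, z = (entry.split(",") + ["", ""])[:3]
            coords.append((float(x), float(y), float(z) if z else 0.0))
        # Drop the coordinate block before looking for other layers.
        layers = layers[:match.start()] + layers[match.end():]

    has_wedges = re.search(r"(?:^|,)w[UD]:", layers) is not None
    stereo = "@" in smiles
    return {
        "smiles": smiles,
        "n_coords": len(coords),
        "stereo_in_smiles": stereo,
        "has_wedge_layer": has_wedges,
        # The case described in this issue: coordinates given, wedges absent.
        "wedges_missing": bool(coords) and stereo and not has_wedges,
    }


info = inspect_cxsmiles(ISSUE_CXSMILES)
print(info["n_coords"], info["wedges_missing"])
```

A check like this only flags the input; it says nothing about which bonds should be wedged.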

johnmay commented 1 year ago

I'm not sure if we currently pass/set them from the CXSMILES but it's certainly possible. The other option is we assign the up/down bonds if they are missing but I'm less keen on that.

johnmay commented 1 year ago

In short, you would ideally need wU:, wD:, etc. in the CXSMILES layers; this doesn't currently work, though.
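
As an illustration of what such layers look like, the sketch below appends `wU:`/`wD:` entries to an existing CXSMILES extension using plain string handling. It assumes the `atom.bond` entry form described in the ChemAxon documentation linked above (atom index of the wedge's narrow end, then the bond index, both counted in SMILES order). The helper name `add_wedge_layer`, the toy molecule, its coordinates, and the wedge direction are all placeholders; nothing here checks that a wedge is chemically correct for the given coordinates.

```python
def add_wedge_layer(cxsmiles, wedges):
    """Hypothetical helper: append wU:/wD: entries to a CXSMILES extension.

    wedges is a list of (kind, atom_idx, bond_idx) with kind "U" or "D".
    """
    smiles, _, ext = cxsmiles.strip().partition(" ")
    ext = ext.strip()
    if not (len(ext) >= 2 and ext.startswith("|") and ext.endswith("|")):
        raise ValueError("expected input of the form 'SMILES |...|'")

    # Group entries by direction so each prefix appears once, e.g. wU:1.0,5.4
    grouped = {"U": [], "D": []}
    for kind, atom_idx, bond_idx in wedges:
        if kind not in grouped:
            raise ValueError(f"unknown wedge kind: {kind!r}")
        grouped[kind].append(f"{atom_idx}.{bond_idx}")

    fields = [
        f"w{kind}:" + ",".join(items)
        for kind, items in grouped.items()
        if items
    ]
    body = ext[1:-1]
    new_body = ",".join(([body] if body else []) + fields)
    return f"{smiles} |{new_body}|"


# Toy input with made-up coordinates; the "up" direction is a placeholder only.
toy = "C[C@H](F)Cl |(0,0,;1,0,;1,1,;2,0,)|"
print(add_wedge_layer(toy, [("U", 1, 0)]))
```

Whether a given depiction tool honours these entries is a separate question; per the comment above, cdkdepict did not at the time of this thread.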

nbehrnd commented 1 year ago

The wedges ... I see. This does not resolve the problem, but here are a couple of observations (Python 3.11.2, RDKit 202209.3-1 as provided by Debian 12/bookworm):

Below is a screenshot of cdkdepict (ferrocene, one of ChemAxon's examples for bridged structures; mandelic acid; and your structure expressed by OpenBabel's canonical SMILES; the latter two CXSMILES were generated with the Jupyter notebook attached in the .zip):

testing

test_cxsmiles.ipynb.zip

Kohulan commented 1 year ago

@johnmay Thanks a lot.

"I'm not sure if we currently pass/set them from the CXSMILES but it's certainly possible." I agree, this could be the best option. Since we use the CDK depiction quite a lot, it would be great if we fixed this internally.

@nbehrnd Thank you for the detailed information. I was able to parse the SMILES string I provided with RDKit without any issues, and the molecule is displayed at the exact coordinates provided.

image

johnmay commented 1 year ago

Closing as won't fix; perhaps in the future, but not now. Please feel free to send a patch; I've pointed to what needs to change. All the best.