cdk / depict

SMILES Depiction Generator
GNU Lesser General Public License v2.1

missing stereo in CXSMILES depiction #48

Closed egonw closed 2 years ago

egonw commented 2 years ago

Given two (CX)SMILES:

[*]C(=O)[C@@H](N)CCCCN[*] |Sg:n:1,2,3,4,5,6,7,8,9::ht|
C(=O)[C@@H](N)CCCCN

and CIP stereo labelling turned on, only the plain SMILES depiction shows the CIP label; the CXSMILES depiction does not:

[image: depiction of the two structures]

johnmay commented 2 years ago

Polymers and CIP :-).

Unfortunately there are no rules (that I know of) for how CIP applies to polymers: do we weight the DUMMY atom as infinitely heavy or infinitely light? For now it is safer to give no answer. In this case you get the same answer either way, but that is not always the case.

If I remember how the code works, it gives you an answer whenever it can split ties before it "reaches" a DUMMY atom. A slight modification shows this works as we expect:

*CC(=O)[C@@H](N)CCCCN* |Sg:n:1,2,3,4,5,6,7,8,9::ht|

The issue is that the tie isn't split until you reach the C(=O)(*), at which point it sees the dummy atom and gives up. In this case I think it is possible to prove it's unambiguous, but extra logic is needed.

*C(=O)[C@@H](N)C(=O)*
[Au]C(=O)[C@@H](N)C(=O)[Pb]
[Pb]C(=O)[C@@H](N)C(=O)[Au]
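The two heavy-atom variants above can be generated mechanically, which makes it easy to check whether a label depends on how the dummy atoms are weighted. The following is a minimal sketch using plain string handling only; it is not part of CDK or centres, and the function name `dummy_probes` and its parameters are hypothetical. It assumes the input has exactly two dummy atoms, written as `*` or `[*]`.

```python
import re

# Matches a SMILES dummy atom written either as "[*]" or as a bare "*".
_DUMMY = re.compile(r"\[\*\]|\*")


def dummy_probes(smiles, first="Au", second="Pb"):
    """Return two SMILES in which the two dummy atoms are replaced by
    real elements, once in each order.

    Hypothetical helper for illustration: any CXSMILES extension after
    the first space is dropped, and exactly two dummy atoms are assumed.
    """
    core = smiles.split()[0]  # strip a trailing "|...|" CXSMILES block
    hits = list(_DUMMY.finditer(core))
    if len(hits) != 2:
        raise ValueError("expected exactly two dummy atoms")
    a, b = hits

    def fill(x, y):
        # Replace each dummy token in place, so the neighbour order around
        # every stereocentre (and hence @/@@) stays as written.
        return (core[:a.start()] + f"[{x}]"
                + core[a.end():b.start()] + f"[{y}]"
                + core[b.end():])

    return fill(first, second), fill(second, first)


probes = dummy_probes("*C(=O)[C@@H](N)C(=O)*")
```

If a CIP labeller gives the same descriptor for both probes, the weighting of the dummy atoms does not matter for that centre; if the descriptors differ, the label is genuinely ambiguous without a polymer-specific rule.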

Perhaps the sanest thing to do here is actually to cyclise the structure:

C1(=O)[C@@H](N)CCCCN1
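As a rough illustration of that cyclisation, the head and tail dummy atoms can be swapped for a ring closure between the first and last atoms of the repeat unit. This is only a sketch on the SMILES string, not how CDK or centres does it; `cyclise` is a hypothetical name. It assumes a head-to-tail unit written as dummy, unit, dummy, with the last atom at the top level of the string, and it refuses units whose end atoms are themselves stereocentres, because replacing the leading dummy there would change the neighbour order.

```python
import re

_DUMMY = r"(?:\[\*\]|\*)"
# Bracket atom, two-letter halogen, organic-subset atom, or aromatic atom.
_ATOM = r"(?:\[[^\]]+\]|Cl|Br|[BCNOPSFI]|[bcnops])"
_HEAD_TAIL = re.compile(rf"{_DUMMY}({_ATOM})(.*?)({_ATOM}){_DUMMY}")


def cyclise(smiles):
    """Replace the two terminal dummy atoms by one ring closure.

    Hypothetical helper for illustration; raises ValueError when the
    input does not fit the simple dummy-unit-dummy pattern.
    """
    core = smiles.split()[0]  # drop any CXSMILES "|...|" extension
    m = _HEAD_TAIL.fullmatch(core)
    if m is None:
        raise ValueError("expected <dummy><unit><dummy> with a top-level last atom")
    first, middle, last = m.groups()
    if "*" in middle:
        raise ValueError("unexpected dummy atom inside the repeat unit")
    if "@" in first or "@" in last:
        raise ValueError("end atom is a stereocentre; neighbour order would change")
    # Use a ring-closure digit that does not occur anywhere in the string.
    digit = next((d for d in "123456789" if d not in core), None)
    if digit is None:
        raise ValueError("no free single-digit ring closure")
    return f"{first}{digit}{middle}{last}{digit}"


ring = cyclise("[*]C(=O)[C@@H](N)CCCCN[*] |Sg:n:1,2,3,4,5,6,7,8,9::ht|")
```

The ring-closure digit takes the position the dummy atom occupied at each end, so the written order of neighbours around the interior stereocentre is untouched and the `@@` mark keeps its meaning.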

*C[C@@H](N)CCCCN*.C[C@@H](N)CCCCN.C1[C@@H](N)CCCCN1

S is correct here as we could also put in duplicates on each end and take the middle value:

egonw commented 2 years ago

Right, yes, I guess that makes sense... I was thinking about the DUMMY being the repeat unit again, but that's not the case for all monomers, of course.

johnmay commented 2 years ago

The centres/cip code is not polymer-aware, but yeah, a reasonable approximation here would be to triplicate the repeat unit and then name it.
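A minimal sketch of that triplication, again as plain string handling rather than anything in the centres code: the repeat unit between the two dummy atoms is written out three times, the end dummies are kept, and the stereocentre in the middle copy is the one whose label would be read. The name `replicate_unit` is hypothetical, and the sketch assumes a simple head-to-tail unit with one dummy at each end of the string.

```python
import re

# Leading dummy, repeat unit, trailing dummy ("*" or "[*]" at each end).
_ENDS = re.compile(r"^(\[\*\]|\*)(.+?)(\[\*\]|\*)$")


def replicate_unit(smiles, copies=3):
    """Write the repeat unit `copies` times between the original end caps.

    Hypothetical helper for illustration. Any CXSMILES extension is
    dropped, because its atom indices no longer apply to the longer string.
    """
    core = smiles.split()[0]
    m = _ENDS.match(core)
    if m is None:
        raise ValueError("expected <dummy><unit><dummy>")
    head, unit, tail = m.groups()
    if "*" in unit:
        raise ValueError("unexpected dummy atom inside the repeat unit")
    # Each inner copy sees the previous copy's last atom where the head
    # dummy used to be, so the @/@@ marks keep their meaning.
    return head + unit * copies + tail


trimer = replicate_unit("*C[C@@H](N)CCCCN*")
```

With three copies, the middle stereocentre has a full repeat unit on either side, so a labeller that stops at dummy atoms has more real atoms to split ties on before it reaches one. Whether that is always enough is not established here.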

egonw commented 2 years ago

I consider it a future wishlist item.

johnmay commented 2 years ago

Could you open an issue tagged as a feature request on the centres repo, please?

egonw commented 2 years ago

centres repo?

johnmay commented 2 years ago

https://github.com/SiMolecule/centres/issues