cdk / depict

SMILES Depiction Generator
GNU Lesser General Public License v2.1

thoughts about input of .sdf (version 3000) / extended stereochemistry #53

Closed nbehrnd closed 1 year ago

nbehrnd commented 1 year ago

This spins off from a discussion here and a contribution by @adafede about atom numbering. At present, the program accepts input as SMILES or as copy-pasted .sdf. If present and requested, it can assign and label stereochemistry, too.

While trying out a few variations of tartaric acid sketched in DataWarrior (repository on GitHub), I noticed that CDKdepict does not process DataWarrior's structures exported in the more recent V3000 dialect of .sdf. Extending the older format, V3000 can store extended stereochemistry, such as the "&" (and) and "or" labels for a stereogenic centre:

entry_05

Do you think the additional file format and its extended functionality would be worth the effort of implementing in future versions of CDKDepict?

2022-10-26_stereochemistry_variations.zip

johnmay commented 1 year ago

Depict is primarily for SMILES input and this is possible with the CXSMILES extensions. Try these out.

OC(=O)[C@H](O)[C@H](O)C(=O)O |&1:3|
OC(=O)[C@H](O)[C@H](O)C(=O)O |o1:3|
OC(=O)[C@H](O)[C@H](O)C(=O)O |&1:3,5|
OC(=O)[C@H](O)[C@H](O)C(=O)O |o1:3,5|
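
To make the notation above easier to read: the part between the `|` bars is the CXSMILES extension, where `&n:` marks an "and" group, `on:` an "or" group, and `a:` absolute stereo, each followed by zero-based atom indices (atoms 3 and 5 are the two stereocentres here). Below is a minimal Python sketch that pulls these groups out of such a string. It is only an illustration, not how CDK or depict parses CXSMILES; the function name `parse_enhanced_stereo` is hypothetical, and it assumes the extension holds only the stereo fields and ignores every other CXSMILES field.

```python
import re

# One enhanced-stereo field: "a:", "o<n>:" or "&<n>:" followed by
# comma-separated, zero-based atom indices. A field must start at the
# beginning of the extension or directly after a comma.
_STEREO_FIELD = re.compile(r"(?:^|(?<=,))(a|o\d+|&\d+):(\d+(?:,\d+)*)")
_LABELS = {"a": "abs", "o": "or", "&": "and"}


def parse_enhanced_stereo(cxsmiles):
    """Return (smiles, groups); each group is (kind, group_number, atom_indices)."""
    smiles, _, rest = cxsmiles.strip().partition(" ")
    rest = rest.strip()
    groups = []
    # The extension block is delimited by a leading and a trailing '|'.
    if len(rest) >= 2 and rest.startswith("|") and rest.endswith("|"):
        for match in _STEREO_FIELD.finditer(rest[1:-1]):
            tag, atoms = match.group(1), match.group(2)
            # "a" carries no group number; "o1" / "&1" do.
            number = int(tag[1:]) if len(tag) > 1 else None
            indices = [int(i) for i in atoms.split(",")]
            groups.append((_LABELS[tag[0]], number, indices))
    return smiles, groups


if __name__ == "__main__":
    for line in (
        "OC(=O)[C@H](O)[C@H](O)C(=O)O |&1:3|",
        "OC(=O)[C@H](O)[C@H](O)C(=O)O |o1:3,5|",
    ):
        print(parse_enhanced_stereo(line))
```
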

I really don't like allowing MOLfile input, and I think the key issue here is that we only expect V2000, so in practice it's a simple fix.
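
For context on the V2000/V3000 distinction: in a molfile, the fourth line (the counts line) ends with a version tag, `V2000` or `V3000`. The Python sketch below shows one way to tell the two apart in pasted text. It is an assumption-laden illustration only, not the actual fix made in depict (which is not shown in this thread); `molfile_version` is a hypothetical helper name.

```python
def molfile_version(molblock):
    """Return 'V2000', 'V3000', or None if no version tag is found.

    Assumes a standard molfile layout: three header lines followed by
    the counts line, which carries the version tag.
    """
    lines = molblock.splitlines()
    if len(lines) < 4:
        return None  # too short to contain a counts line
    counts_line = lines[3]
    for version in ("V2000", "V3000"):
        if version in counts_line:
            return version
    return None


if __name__ == "__main__":
    # Header (name, program, comment) followed by a V3000 counts line.
    example = "\n".join([
        "tartaric acid",
        "  sketch",
        "",
        "  0  0  0     0  0            999 V3000",
    ])
    print(molfile_version(example))
```
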

johnmay commented 1 year ago

Fixed. You will need to shift-reload the webpage, as the JS needed updating.

Note that in SMILES/depict "ABS" is the default and is not displayed (same as chiral flag=1); if you set chiral_flag=0 you will see the difference.

Anyway, in your examples it's really only the last two where you see it.

Screenshot 2022-10-26 at 13 05 04