dan2097 / opsin

Open Parser for Systematic IUPAC Nomenclature. Chemical name to structure conversion
https://opsin.ch.cam.ac.uk
MIT License

Unable to parse "(1s,4s)-4-(chlorooxy)cyclohexyl hypofluorite" and similar #231

Open rytheranderson opened 1 year ago

rytheranderson commented 1 year ago

ChemDraw names certain meso cyclic compounds with lowercase r/s descriptors rather than cis/trans, as in the name in the issue title. The attached figure shows one such name as generated by ChemDraw.

OPSIN errors with:

Could not find atom that: <stereoChemistry locant="1" type="RorS" value="S" stereoGroup="Abs">1s</stereoChemistry> appeared to be referring to

when attempting to parse this name or similar cases. This may be outside the scope of what OPSIN supports, but I thought I would mention it, since ChemDraw is frequently used to generate IUPAC names. Happy to provide additional examples if needed.
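
For anyone triaging similar reports, the error text above embeds an XML fragment describing the stereo descriptor OPSIN could not assign. The following is a minimal sketch, using only the Python standard library, that pulls that fragment out of the message and lists its attributes. The helper name `extract_stereo_element` is hypothetical and is not part of OPSIN; the message string is copied verbatim from the report.

```python
import xml.etree.ElementTree as ET

# Error text copied verbatim from the report above.
message = (
    "Could not find atom that: "
    '<stereoChemistry locant="1" type="RorS" value="S" stereoGroup="Abs">'
    "1s</stereoChemistry>"
    " appeared to be referring to"
)


def extract_stereo_element(text):
    """Return the <stereoChemistry> element embedded in an OPSIN-style message.

    Hypothetical helper for illustration only; it simply slices out the
    XML fragment and parses it.
    """
    open_tag = "<stereoChemistry"
    close_tag = "</stereoChemistry>"
    start = text.index(open_tag)
    end = text.index(close_tag) + len(close_tag)
    return ET.fromstring(text[start:end])


element = extract_stereo_element(message)

# Attributes recorded for the descriptor (locant, type, value, stereoGroup).
for key, value in element.attrib.items():
    print(f"{key} = {value}")

# The token as it appeared in the chemical name.
print("token =", element.text)
```

In this message the token from the name is the lowercase `1s`, while the `value` attribute is reported as the uppercase `S` with `type="RorS"`.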

Thanks in advance, -Ryther

dan2097 commented 1 year ago

This is indeed a known limitation of OPSIN's stereochemistry support, and was reported as far back as 2015 (https://github.com/dan2097/opsin/issues/23). Because implementing this is complex, I've been thinking for a while about using John's library for detecting/labelling stereocentres (https://github.com/SiMolecule/centres), but haven't got around to doing so.