dan2097 / opsin

Open Parser for Systematic IUPAC Nomenclature. Chemical name to structure conversion
https://opsin.ch.cam.ac.uk
MIT License

Carbohydrate locant improvements: 5a-Carba-β-D-glucopyranose #221

Open rogersayle opened 1 year ago

rogersayle commented 1 year ago

Hi Daniel, is it possible (trivial) to treat locant 5a as a synonym for locant O5 in carbohydrate/saccharide nomenclature? IUPAC's 2-Carb-32.2 [https://iupac.qmul.ac.uk/2carb/34.html] recommends the name "5a-Carba-β-D-glucopyranose", which currently results in "Could not find the atom with locant 5a". For the (incorrect) variant "5-Carba-β-D-glucopyranose", OPSIN reports "The replacement term carba was used on an atom that already is a C". A possible workaround is "O5-Carba-β-D-glucopyranose", which produces the correct structure, but unfortunately the new 5a locant is then not available for substitution, e.g. in (5aR)-5a-chloro-5a-carba-β-D-glucopyranose. The same applies to 4'a-carbathymidine. Thanks in advance.
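Until this is supported, the O5 workaround described above could be applied automatically before a name is passed to OPSIN. The following is only a minimal sketch of such a client-side rewrite: the helper name `carba_locant_to_oxygen` and the regular expression are assumptions made for illustration and are not part of OPSIN.

```python
import re

# Hypothetical pre-processing helper; NOT part of OPSIN.
# Matches a locant such as "5a" or "4'a" only when it directly precedes "carba".
_CARBA_LOCANT = re.compile(r"(?<![\w'])(\d+)('*)a-(?=carba)", re.IGNORECASE)


def carba_locant_to_oxygen(name: str) -> str:
    """Rewrite '<n>a-carba' as 'O<n>-carba', keeping any primes on the number."""
    return _CARBA_LOCANT.sub(r"O\g<1>\g<2>-", name)


print(carba_locant_to_oxygen("5a-Carba-β-D-glucopyranose"))
# prints: O5-Carba-β-D-glucopyranose
```

This only rewrites the locant attached to the carba term. Other uses of 5a in the same name, such as 5a-chloro or the (5aR) stereodescriptor, are left untouched, so substituted names would still need real 5a support in the parser.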

dan2097 commented 1 year ago

That looks straightforward to support.

I note that the current Blue Book appears not to prefer this style of name. P-104.2.2: "Cyclitols, with the exception of inositols, are named systematically from cyclohexane as the parent using the CIP method and its Sequence Rules for describing stereoisomers. This method is preferred to the method of positional numbers described in P-104.2.3."

Regarding 4'a-carbathymidine, this is slightly more tricky due to an unrelated issue. OPSIN (erroneously) assumes that the primed version of 4a would be 4a', while it should be 4'a (allowing either would probably be safe).
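One way to picture "allowing either" is to normalise both spellings to a single canonical form before looking up the atom. The sketch below is illustrative only: `normalize_primed_locant` is a hypothetical helper, not OPSIN code, and it assumes a locant consists of digits, an optional lowercase letter, and primes in one of the two positions.

```python
import re

# Hypothetical helper; NOT OPSIN's implementation.
# Accepts digits, optional primes, an optional lowercase letter, optional primes.
_LOCANT = re.compile(r"^(\d+)('*)([a-z]?)('*)$")


def normalize_primed_locant(locant: str) -> str:
    """Return the locant with primes placed directly after the number (4a' -> 4'a)."""
    match = _LOCANT.match(locant)
    if match is None:
        raise ValueError(f"Unrecognised locant: {locant!r}")
    digits, primes_before, letter, primes_after = match.groups()
    if primes_before and primes_after:
        # Primes in both positions are treated as ambiguous in this sketch.
        raise ValueError(f"Ambiguous prime placement: {locant!r}")
    return f"{digits}{primes_before or primes_after}{letter}"


print(normalize_primed_locant("4a'"))  # prints: 4'a
print(normalize_primed_locant("4'a"))  # prints: 4'a
```

Rejecting primes in both positions is a deliberate choice in this sketch, so that a locant like 4'a' is flagged rather than silently reinterpreted.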

dan2097 commented 1 year ago

Just to give a brief update, this isn't as straightforward to support as I anticipated due to how thio sugars are handled: https://iupac.qmul.ac.uk/2carb/14n15.html#15

Would you expect 5a-Thio-β-D-glucopyranose to be an error or a synonym of 5-Thio-β-D-glucopyranose?