dan2097 / opsin

Open Parser for Systematic IUPAC Nomenclature. Chemical name to structure conversion
https://opsin.ch.cam.ac.uk
MIT License

Galactose #48

Closed: dan2097 closed this issue 7 years ago

dan2097 commented 7 years ago

Original report by Andrius Merkys (Bitbucket: merkys, GitHub: merkys).


Galactose should be cyclic; however, OPSIN interprets it as a linear (acyclic) molecule:

$ echo galactose | java -jar src/opsin-2.3.0-jar-with-dependencies.jar 
Run the jar using the -h flag for help. Enter a chemical name to begin:
O=C[C@H](O)[C@@H](O)[C@@H](O)[C@H](O)CO

dan2097 commented 7 years ago

Original comment by Daniel Lowe (Bitbucket: dan2097, GitHub: dan2097).


This is intentional. Many sugars exist as a mixture of different forms:

acyclic, 5-membered ring (furanose; alpha/beta anomers), or 6-membered ring (pyranose; alpha/beta anomers)

[cf. https://en.wikipedia.org/wiki/Galactose]

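The acyclic form is always possible, whereas a given cyclic form is not [e.g. a tetrose cannot form a pyranose], and it avoids introducing an anomeric center with undefined stereochemistry. For consistency, this is therefore the form OPSIN always produces when the name does not specify a ring form.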
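The "always possible" argument can be illustrated with a minimal sketch. This is not OPSIN code: `SugarRingForms` and `possibleForms` are hypothetical names, and the rule assumes a simple aldose, where a furanose ring needs a hydroxyl on C4 and a pyranose ring needs one on C5.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, NOT part of OPSIN: lists the forms a simple aldose
// with the given number of carbons can adopt (assumption: aldoses only).
class SugarRingForms {

    static List<String> possibleForms(int carbonCount) {
        if (carbonCount < 3) {
            throw new IllegalArgumentException("an aldose has at least 3 carbons");
        }
        List<String> forms = new ArrayList<>();
        // The open-chain form exists for every aldose.
        forms.add("acyclic");
        // A 5-membered ring closes C1 onto the oxygen of C4.
        if (carbonCount >= 4) {
            forms.add("furanose");
        }
        // A 6-membered ring closes C1 onto the oxygen of C5.
        if (carbonCount >= 5) {
            forms.add("pyranose");
        }
        return forms;
    }

    public static void main(String[] args) {
        System.out.println("tetrose: " + possibleForms(4));
        System.out.println("hexose:  " + possibleForms(6));
    }
}
```

Only "acyclic" appears for every carbon count, which is why it is the one form that can be produced consistently for any unqualified sugar name.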

The cyclic forms can be obtained with names like galactopyranose or galactofuranose (the anomeric stereochemistry can also be specified, e.g. α-D-Galactopyranose).

dan2097 commented 7 years ago

Original comment by Andrius Merkys (Bitbucket: merkys, GitHub: merkys).


Thank you for the information. To me such behaviour seemed rather unexpected.