dan2097 / opsin

Open Parser for Systematic IUPAC Nomenclature. Chemical name to structure conversion
https://opsin.ch.cam.ac.uk
MIT License
153 stars 32 forks

Some constructs chemists tend to get wrong #166

Open IngvarLa opened 2 years ago

IngvarLa commented 2 years ago

Not sure if this is within scope for Opsin to handle. Anyway, these are examples from patents where the author used a construct that forms a valid name, but the resulting structure is not what was intended (verified in several cases by comparing with the associated images). The di- and tri-peroxides may not be stable enough to be identified, let alone isolated.

carbaldehyde chlorooxime -> N-hydroxy carbimidoyl chloride (C=NOCl -> ClC=NOH), US20050075375A1_0308. Also seen hyphenated, e.g., benzaldehyde chloro-oxime -> N-hydroxy benzimidoyl chloride, US07229987B2_2595 (associated image has an H as II). I have not seen this construct for other halides.

halo alkanoate -> alkoyl halide (O=COX -> O=CX), e.g., chloro formate -> formyl chloride, EP3190114A1_0145. (Alkoyl hypohalide tends to be used when O=COX is actually meant.)

trialkoxy orthoformate -> trialkyl orthoformate or trialkoxy methane ((ROO)3C -> (RO)3C); also trialkoxy orthoacetate. WO2020212041A1_0151, EP1343782B1_0162

alkylene oxyketal -> alkylene ketal (RC1(R)OO[R1]OO1 -> RC1(R)O[R1]O1), where R1 is usually ethylene or propylene. EP0010320A1_0023

None of these are particularly common, but there seems to be little risk of incorrectly changing an intended name, except maybe for the halo alkanoate.
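To make the proposed rewrites concrete, here is a rough sketch of a name pre-processing step in Python. It is not how Opsin works internally; the function `normalize_name`, the rule tables, and the `include_risky` flag are all hypothetical names made up for illustration, and the patterns only cover the literal examples listed above. The halo alkanoate rule is kept opt-in because of the risk mentioned above.

```python
import re


def _aldehyde_chlorooxime(match):
    # "benzaldehyde chloro-oxime" -> "N-hydroxy benzimidoyl chloride"
    return f"N-hydroxy {match.group(1)}imidoyl chloride"


def _halo_formate(match):
    # "chloro formate" -> "formyl chloride" ("chloro" -> "chloride")
    return f"formyl {match.group(1)[:-1]}ide"


# Hypothetical rule tables: (pattern, replacement) pairs taken from the
# examples in this issue. They are illustrative, not exhaustive.
SAFE_RULES = [
    (re.compile(r"(\S+)aldehyde chloro-?oxime\b", re.I), _aldehyde_chlorooxime),
    (re.compile(r"\btri(\w+?)oxy (ortho(?:formate|acetate))\b", re.I), r"tri\1yl \2"),
    (re.compile(r"\b(\w+ylene) oxyketal\b", re.I), r"\1 ketal"),
]

# Kept separate: this rewrite could change a name that was actually intended.
RISKY_RULES = [
    (re.compile(r"\b(chloro|bromo|fluoro|iodo) formate\b", re.I), _halo_formate),
]


def normalize_name(name, include_risky=False):
    """Apply the rewrite rules in order and return the adjusted name."""
    rules = SAFE_RULES + (RISKY_RULES if include_risky else [])
    for pattern, replacement in rules:
        name = pattern.sub(replacement, name)
    return name


if __name__ == "__main__":
    for example in (
        "benzaldehyde chloro-oxime",
        "trimethoxy orthoformate",
        "ethylene oxyketal",
        "chloro formate",
    ):
        print(example, "->", normalize_name(example, include_risky=True))
```

Names that match none of the patterns pass through unchanged, so a step like this could in principle sit in front of any name-to-structure parser.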

dan2097 commented 2 years ago

The first case is definitely in scope, as chlor(o)oxime appears to be a commonly used term for compounds containing RC(Cl)=NOH.

The other 3 are trickier: they are definitely mistakes in the name, and the correctly constructed names are significantly more common. I'll get the first case fixed soon; I'm not yet sure about the other cases. Incidentally, similar issues cropped up when interpreting carbohydrate line formulas: sometimes an oxygen would be explicitly part of the formula and sometimes it wouldn't, so special casing was required to ensure that an explicitly specified oxygen didn't produce a peroxy.
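The kind of special casing described for line formulas can be illustrated with a toy example. This is only a sketch under loud assumptions: fragments are plain SMILES-like strings, the function name `link_via_oxygen` is invented here, and it does not reflect Opsin's actual code. The linking oxygen is added only when neither fragment already supplies it.

```python
def link_via_oxygen(left: str, right: str) -> str:
    """Join two fragments through a single oxygen (toy illustration).

    If either fragment already writes the linking oxygen explicitly,
    the fragments are joined directly; otherwise an "O" is inserted.
    """
    if left.endswith("O") or right.startswith("O"):
        # Oxygen was written explicitly: adding another would give a peroxide.
        return left + right
    return left + "O" + right


if __name__ == "__main__":
    print(link_via_oxygen("CC", "CC"))   # oxygen implied
    print(link_via_oxygen("CCO", "CC"))  # oxygen explicit on the left
    print(link_via_oxygen("CC", "OCC"))  # oxygen explicit on the right
```

All three calls give the same ether, whereas unconditionally inserting an oxygen would turn the second and third inputs into peroxides. The case where both fragments write the oxygen is deliberately left alone in this sketch.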