Closed by rapodaca 2 years ago
Points of incompatibility. These are features explicitly defined in one or the other paper, but which are different in Dialect:

- `Ha`
- `,` (JCICS)
- `--`; `++`; `+++`; etc.
- `:`
- `*((*))*`; `*(((*)))*`; etc.
- `*>>*`
- `@` and `@@` are disallowed
- `@@@` and `@@@@` (note recursion in grammar)
- `@1`
- `@AL1`, `@AL2`, `@SP1`, etc.
- `[HH]` is valid, as are `[HH2]` and other chemically nonsensical constructs. Contradicts JCICS.
- `.` is NOT a "bond of formal order zero" (p 88, bottom). It is not a bond at all. This point is contradicted elsewhere, so must be clarified.

Points not explained by either paper, but in Dialect:

Contradictory points resolved in Dialect:
- `H` is NOT in the "aromatic subset." See `Hn1cccc1`, JCICS p 35 bottom left, which contradicts p 32 top right.

Table of differences:

Feature | SMILES | Dialect |
---|---|---|
element symbol Ha | accepts | rejects |
element symbols Db; Sg; Bh; Hs; Mt; Rg; Cn; Nh; Fl; Mc; Lv; Ts; and Og | rejects | accepts |
future element symbols approved by IUPAC | rejects | accepts |
comma symbol (`,`) | may accept | rejects |
multiple branching (e.g., `*((*))*`) | accepts | rejects |
reactions using greater-than symbol (`>`) | accepts | rejects |
extended stereodescriptors (e.g., `@@@`, `@@@@`, `@AL1`, `@1`, and `@SP1`) | accepts | rejects |
use of stereodescriptors on odd cumulene centers | accepts | rejects |
virtual hydrogen count on hydrogen | probably rejects | accepts |
detachments are bonds of "formal order zero" | probably accepts | rejects |
upper and lower bounds on atomic properties | rejects | accepts |
nitrogen default valence includes 5 | partially accepts | accepts |
unbracketed hydrogen atom | partially accepts | rejects |
acyclic atom selection | partially accepts | accepts |
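
To make a few of the "rejects" rows concrete, here is a rough Python sketch of a lexical pre-check. It is not a parser and not part of Dialect; the function name `dialect_rejections`, the rule labels, and the regular expressions are all hypothetical, and the pattern for extended stereodescriptors assumes they take the form `@@@`, `@` plus two capital letters plus a digit, or `@` plus a digit.

```python
import re
from typing import List

# Hypothetical rule table covering some "rejects" rows of the table above.
# These are surface-level heuristics only; a real implementation would parse.
REJECTED_BY_DIALECT = [
    ("element symbol Ha", re.compile(r"\[\d*Ha")),
    ("comma symbol", re.compile(r",")),
    ("multiple branching", re.compile(r"\(\(")),
    ("reaction symbol", re.compile(r">")),
    # Assumed shapes: @@@ (and longer), @AL1 / @SP1 style, or @1 style.
    ("extended stereodescriptor", re.compile(r"@@@|@[A-Z]{2}\d|@\d")),
]


def dialect_rejections(smiles: str) -> List[str]:
    """Return labels of features in `smiles` that the table marks as rejected."""
    found = [label for label, pattern in REJECTED_BY_DIALECT
             if pattern.search(smiles)]
    # Drop bracket atoms; any "H" left over is an unbracketed hydrogen atom.
    outside_brackets = re.sub(r"\[[^\]]*\]", "", smiles)
    if "H" in outside_brackets:
        found.append("unbracketed hydrogen atom")
    return found


if __name__ == "__main__":
    for example in ["*((*))*", "*>>*", "Hn1cccc1", "N[C@@H](C)C(=O)O"]:
        print(example, dialect_rejections(example))
```

An empty list only means none of these surface patterns were seen; it does not show that the string is valid Dialect.
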
The manuscript makes several references to "dialect" in the linguistic sense, and this is of course the code name for the language. But the goal is not to make yet another dialect. The goal is to fully define, for the first time, a language that functions as a subset of SMILES-as-practiced. No extensions. No pet nice-to-haves. Just a subset, to the extent that is possible without internal inconsistencies.
Parts to improve: