c3d / xl

A minimalist, general-purpose programming language based on meta-programming and parse tree rewrites
GNU General Public License v3.0

Better classification of Unicode characters #21

Open c3d opened 4 years ago

c3d commented 4 years ago

The current implementation reads its input as UTF-8, but makes only crude attempts at handling Unicode.

This was good enough for Tao3D to deal with multilingual text, including languages such as Hebrew or Arabic. However, the implementation is a bit naive when it comes to distinguishing Unicode letters from non-letter characters.

For example, 𝝿_2 and étalon are valid XL names, and this is intentional. However, ⇒A2 is currently also accepted as a valid XL name, which can easily be considered a bug.
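
To make the distinction concrete, here is a minimal sketch in Python of what a stricter classification could look like, based on Unicode general categories. This is not the XL scanner, and the rule it encodes is an assumption for illustration: a name starts with a letter (any `L*` category) and continues with letters, decimal digits, combining marks, or connector punctuation such as `_`. The function names `is_name_start`, `is_name_continue` and `is_valid_name` are hypothetical.

```python
import unicodedata


def is_name_start(ch):
    """Assumed rule: a name may only start with a Unicode letter (Lu, Ll, Lt, Lm, Lo)."""
    return unicodedata.category(ch).startswith("L")


def is_name_continue(ch):
    """Assumed rule: letters, decimal digits, combining marks and connectors like '_'."""
    cat = unicodedata.category(ch)
    return cat.startswith("L") or cat in ("Nd", "Mn", "Mc", "Pc")


def is_valid_name(text):
    """Check a candidate name against the assumed start/continue rules."""
    if not text:
        return False
    return is_name_start(text[0]) and all(is_name_continue(c) for c in text[1:])


if __name__ == "__main__":
    # The three examples from this issue, written as escapes to avoid encoding surprises.
    for candidate in ("\U0001D77F_2", "\u00e9talon", "\u21d2A2"):
        print(repr(candidate), is_valid_name(candidate))
```

With a rule of this kind, 𝝿_2 and étalon are accepted because their first character is a letter, while ⇒A2 is rejected because ⇒ (U+21D2) is a math symbol (category `Sm`), not a letter. Whether XL should follow exactly these categories, or something closer to the Unicode identifier recommendations, is left open here.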