CordyJ / OpenTxl

TXL programming language compiler/interpreter

Latin-1 / UTF-8 clash #4

Closed CordyJ closed 1 year ago

CordyJ commented 1 year ago

TXL cannot handle the ASCII Latin-1 and UTF-8 encodings at the same time, since some ASCII Latin-1 characters conflict with UTF-8 prefix bytes. To fix this for grammars handling Unicode languages, we need to remove the characters â, Â and Ã from the predefined Latin-1 alphabetic characters.

Programmers can add them back by hand for ASCII Latin-1-only transformations, like so:

#pragma -idchars "âÂÃ"
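
To illustrate why these three characters in particular clash, here is a minimal Python sketch (not TXL code, and not part of the TXL implementation). It shows that the single-byte Latin-1 encodings of â, Â and Ã are the same byte values that begin many multi-byte UTF-8 sequences. The sample characters `£`, `é` and `€` and the helper `utf8_lead_byte` are chosen only for illustration.

```python
# Characters named in the issue, mapped to their single-byte Latin-1 encodings.
CLASHING = "âÂÃ"
latin1_bytes = {ch: ch.encode("latin-1")[0] for ch in CLASHING}


def utf8_lead_byte(ch: str) -> int:
    """Return the first (prefix) byte of a character's UTF-8 encoding."""
    return ch.encode("utf-8")[0]


# Illustrative samples only: each one's UTF-8 prefix byte equals the
# Latin-1 byte of one of the clashing characters.
for sample in ["£", "é", "€"]:
    lead = utf8_lead_byte(sample)
    clash = [ch for ch, b in latin1_bytes.items() if b == lead]
    print(f"{sample!r}: UTF-8 lead byte 0x{lead:02X} is Latin-1 {clash}")

# A byte-oriented scanner reading UTF-8 input as Latin-1 sees 'é' as two
# characters, the first of which is the Latin-1 letter 'Ã'.
misread = "é".encode("utf-8").decode("latin-1")
```

If those bytes count as identifier characters, the first byte of a UTF-8 sequence can be absorbed into a neighbouring identifier, which appears to be the clash described above.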