cqcallaw / newt

The newt programming language
GNU General Public License v3.0

Unicode support #45

Open cqcallaw opened 8 years ago

cqcallaw commented 8 years ago

Possibly via http://site.icu-project.org/

cqcallaw commented 7 years ago

The syntax and grammar of the language are so Latin-based that using Unicode for operators and identifiers seems counter-productive. Will need to handle Unicode string literals and streams, however.

cqcallaw commented 7 years ago

For string constants, lexing might be an issue: http://stackoverflow.com/a/935158/577298

cqcallaw commented 7 years ago

Iteration reference: http://stackoverflow.com/questions/4579215/cross-platform-iteration-of-unicode-string-counting-graphemes-using-icu#4579312

cqcallaw commented 7 years ago

Another interesting project for Unicode support: https://github.com/JuliaLang/utf8proc

cqcallaw commented 7 years ago

See also: https://www.w3.org/International/wiki/Case_folding
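For illustration only, a minimal Python sketch of why case folding differs from simple lowercasing when comparing strings caselessly. It is not newt code; the helper name `caseless_equal` is hypothetical, and `str.casefold` is Python's built-in folding, standing in for whatever newt would eventually use.

```python
def caseless_equal(a: str, b: str) -> bool:
    """Compare two strings using Unicode case folding (hypothetical helper)."""
    return a.casefold() == b.casefold()


# Lowercasing keeps the German sharp s, so these two spellings stay different.
lowered_match = "Straße".lower() == "STRASSE".lower()

# Case folding maps the sharp s to "ss", so the two spellings compare equal.
folded_match = caseless_equal("Straße", "STRASSE")
```

A fuller comparison would probably also normalize the strings (for example to NFC) before folding; that step is left out here to keep the sketch short.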

cqcallaw commented 7 years ago

UTF-8 code units may be represented by the byte type, but a single code point can take up to four of them, and representing graphemes as a language primitive would require another primitive type.
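To make the byte / code point / grapheme distinction concrete, here is a rough Python sketch (not newt code). All function names are hypothetical. The grapheme grouping is only an approximation that attaches combining marks to the preceding character; proper segmentation would follow the Unicode text segmentation rules, for instance via ICU's break iterators.

```python
import unicodedata


def utf8_code_units(text: str) -> list[int]:
    """Return the UTF-8 code units (bytes) that encode `text`."""
    return list(text.encode("utf-8"))


def code_points(text: str) -> list[int]:
    """Return the Unicode code points in `text`."""
    return [ord(ch) for ch in text]


def approximate_graphemes(text: str) -> list[str]:
    """Group each character with the combining marks that follow it.

    This is a simplification, not full grapheme cluster segmentation.
    """
    clusters: list[str] = []
    for ch in text:
        if clusters and unicodedata.combining(ch) != 0:
            clusters[-1] += ch  # attach the combining mark to the previous cluster
        else:
            clusters.append(ch)  # start a new cluster
    return clusters


# 'e' followed by COMBINING ACUTE ACCENT (U+0301), then 'a'
sample = "e\u0301a"
units = utf8_code_units(sample)         # 4 bytes
points = code_points(sample)            # 3 code points
clusters = approximate_graphemes(sample)  # 2 user-perceived characters
```

The same string has three different lengths depending on the unit counted, which is why a byte type alone would not cover graphemes.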