rui314 / 8cc

A Small C Compiler
MIT License

UTF8 Characters not supported #47

Closed sebastien closed 9 years ago

sebastien commented 9 years ago

The character is UTF-8, which clang and gcc both accept with -std=c11.

$ echo "typedef long Itid;" > utf8.c ; 8cc -c utf8.c
[ERROR] parse.c:2651: utf8.c:1:14: stray character in program: '�'

rui314 commented 9 years ago

We need to support UTF-8 characters in string literals and character literals. But supporting UTF-8 in identifiers is probably out of scope for a small compiler. Supporting multi-byte characters is hard, so I'll leave it alone.

rui314 commented 9 years ago

It's supported in d5fc4b63229ceca3e136654c16bd6676f705b57b.
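
For readers wondering what accepting UTF-8 in identifiers involves, here is a minimal sketch in C. It is not the code from the commit above, and the helper names `utf8_seq_len` and `scan_identifier` are hypothetical. It only checks the byte structure of each sequence (lead byte followed by the right number of continuation bytes). It does not check the code point ranges that C11 allows in identifiers, nor does it reject overlong or surrogate encodings.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper (not from 8cc): number of bytes in the UTF-8
 * sequence introduced by lead byte c, or 0 if c cannot start one. */
static size_t utf8_seq_len(unsigned char c) {
    if (c < 0x80) return 1;                 /* 0xxxxxxx: plain ASCII */
    if (c >= 0xC2 && c <= 0xDF) return 2;   /* 110xxxxx */
    if (c >= 0xE0 && c <= 0xEF) return 3;   /* 1110xxxx */
    if (c >= 0xF0 && c <= 0xF4) return 4;   /* 11110xxx */
    return 0;                               /* continuation or invalid lead byte */
}

/* Hypothetical helper (not from 8cc): length in bytes of the identifier
 * at the start of the NUL-terminated string s, or 0 if there is none.
 * ASCII letters, digits (not in first position), and '_' are accepted,
 * as is any structurally well-formed multi-byte UTF-8 sequence. */
static size_t scan_identifier(const char *s) {
    size_t i = 0;
    for (;;) {
        unsigned char c = (unsigned char)s[i];
        size_t n = utf8_seq_len(c);
        if (n == 1) {
            bool alpha = (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z') || c == '_';
            bool digit = (c >= '0' && c <= '9');
            if (!(alpha || (digit && i > 0)))
                break;                      /* also stops at the terminating NUL */
            i++;
        } else if (n >= 2) {
            /* Every byte after the lead must look like 10xxxxxx. */
            for (size_t k = 1; k < n; k++) {
                if (((unsigned char)s[i + k] & 0xC0) != 0x80)
                    return i;               /* truncated or malformed sequence */
            }
            i += n;
        } else {
            break;                          /* stray continuation or invalid byte */
        }
    }
    return i;
}
```

A lexer built this way would treat the bytes of a multi-byte character as part of the identifier instead of reporting each one as a stray character, which is where the error in the original report comes from.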