srackham closed this issue 2 months ago
You're right, that's a bug. It should handle UTF-8 in string literals, but it does not yet.
I was planning on improving the support for Unicode by changing the string type (currently an alias for Array<byte>), but this is something that could maybe be supported by just allowing the UTF-8 representation through.
Thanks.
A workaround is to convert UTF-8 strings to hex byte values with, for example:
$ echo -n "Hello World ©" | od -A n -t x1 | tr -d '\n' | sed 's/ /\\x/g'
\x48\x65\x6c\x6c\x6f\x20\x57\x6f\x72\x6c\x64\x20\xc2\xa9
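For anyone who needs this more than once, here is a slightly more defensive variant of the same pipeline wrapped in a shell function. This is only a sketch: the function name `utf8_to_hex_escapes` is made up for illustration and is not part of Virgil or its tooling. It uses `printf '%s'` instead of `echo -n` (which is not portable across shells) and passes `-v` to `od` so that long runs of identical bytes are not collapsed into a `*` line.

```shell
#!/bin/sh
# Hypothetical helper (not part of Virgil): print every byte of the
# first argument as a \xNN escape, suitable for pasting into a string literal.
utf8_to_hex_escapes() {
    # printf '%s' writes the argument without a trailing newline.
    # od -A n -v -t x1 dumps every byte as two hex digits, with no offsets
    # and without collapsing repeated lines.
    # tr removes the spaces and newlines od inserts between bytes.
    # sed prefixes each pair of hex digits with \x.
    printf '%s' "$1" | od -A n -v -t x1 | tr -d ' \n' | sed 's/../\\x&/g'
}

utf8_to_hex_escapes "Hello World ©"
```

The output depends on the bytes the shell passes in, so this assumes the script file and terminal are UTF-8 encoded.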
Does Virgil support UTF-8 string literals?
The documentation suggests it does: https://github.com/titzer/virgil/blob/3038dead280099b736f312e2b091b053cb0cfbf7/doc/lib-issues.txt#L116
Here I've inserted the copyright character in a string literal:
Hex byte values work though: