Closed by glebm 7 years ago
I'm fine with \uXXXXXX.
Is there a reason why you wouldn't want to allow Unicode escapes in literals? They're basically already in the grammar with Hex; I just never got to implementing Unicode support before development stopped.
No reason, they should be allowed, that settles it. I didn't realize they currently allow Hex.
While implementing this, I've realized that there is no way to distinguish between "\uAAAb" read as the escape \uAAA followed by the literal letter b, and "\uAAAb" read as the single escape \uAAAb.
While this can be worked around by escaping the b as well, this is trouble for generated grammar files (everything would need to be escaped, or the generator would need to look behind to decide whether to escape).
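To make the ambiguity concrete, here is a minimal Python sketch of a greedy decoder for the unbraced form. The name `decode_unbraced` and the greedy "1 to 6 hex digits" rule are assumptions made for illustration; this is not the project's implementation.

```python
import re

# Hypothetical greedy matcher: \u followed by 1 to 6 hex digits.
UNBRACED = re.compile(r"\\u([0-9A-Fa-f]{1,6})")


def decode_unbraced(text: str) -> str:
    r"""Replace each \uX..X escape with the code point it names."""
    return UNBRACED.sub(lambda m: chr(int(m.group(1), 16)), text)


# The grammar author may have meant U+0AAA followed by the letter "b",
# but the greedy matcher consumes the "b" as a fourth hex digit.
intended = "\u0AAA" + "b"
actual = decode_unbraced(r"\uAAAb")
```

Here `actual` is the single code point U+AAAB rather than the two-character string in `intended`, and nothing in the input lets the decoder tell the two readings apart.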
For this reason, I decided on \u{XXXXXX}.
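As a rough illustration of why the braced form avoids the problem, the sketch below decodes \u{...} escapes with 1 to 6 hex digits and shows how a generator could emit them without looking at neighbouring characters. The names `decode_braced` and `encode_braced`, the range check against U+10FFFF, and the choice to escape only non-ASCII characters are all assumptions, not the project's actual code; other escapes, such as literal backslashes, are ignored here.

```python
import re

# Assumed syntax: \u{X..X} with 1 to 6 hex digits between the braces.
BRACED = re.compile(r"\\u\{([0-9A-Fa-f]{1,6})\}")

MAX_CODE_POINT = 0x10FFFF  # highest valid Unicode code point


def decode_braced(text: str) -> str:
    r"""Replace each \u{X..X} escape with the code point it names."""

    def replace(match: re.Match) -> str:
        value = int(match.group(1), 16)
        if value > MAX_CODE_POINT:
            raise ValueError(f"code point out of range: {match.group(0)}")
        return chr(value)

    return BRACED.sub(replace, text)


def encode_braced(text: str) -> str:
    """Escape every non-ASCII character; no context is needed."""
    return "".join(
        ch if ord(ch) < 128 else f"\\u{{{ord(ch):X}}}" for ch in text
    )
```

With the closing brace marking the end of the escape, `\u{AAA}b` and `\u{AAAb}` decode to different strings, so a generator can escape each character independently.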
Good, makes sense.
The syntax I like most is:
Other alternatives are:
And:
In all cases, 1 to 6 hex digits are accepted.
Need to decide whether to support this in literals or only character classes.
Unicode escapes should be allowed in literals and character classes.