Closed by eemeli 1 year ago
Back in the Fluent days, our hope was that most Unicode characters would be written verbatim, as the Unicode glyphs themselves. If a translation wants to use 😀, it should just use that glyph rather than escape it as \U01f600.
The main motivation for adding Unicode escape sequences was to make non-printable or invisible characters stand out in the translation's source. The most notable example was the non-breaking space.
I'm providing this little bit of historical context in order to advocate against option (3).
The solution in #11 combines options 1 and 2, and adds escapes for spaces, tabs, and relevant syntax characters. It also ensures that escape sequences defined in MF2 do not need double-escaping.
Some characters are not easily visually representable or input by a user. To represent them in a message resource, it should be possible to escape them using their Unicode code points.
Possible solutions:
1. \ escape codes with hexadecimal values, such as: \xab, \uabcd, and \Uabcdef.
2. \n, \r, and \t.
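To make the first two options concrete, here is a rough sketch of how a resource parser might decode such escapes. It is only an illustration, not the behaviour specified in #11: the `unescape` name, the fixed digit counts (2 for \x, 4 for \u, 6 for \U, inferred from the examples above), the \\ escape for a literal backslash, and leaving unknown escapes untouched are all assumptions.

```python
import re

# Assumed escape shapes, inferred from the examples in the list above:
#   \xHH, \uHHHH, \UHHHHHH (hexadecimal code points), plus \n, \r, \t.
# The \\ escape for a literal backslash is an extra assumption.
_ESCAPE = re.compile(
    r"\\(?:x([0-9A-Fa-f]{2})|u([0-9A-Fa-f]{4})|U([0-9A-Fa-f]{6})|([nrt\\]))"
)
_SIMPLE = {"n": "\n", "r": "\r", "t": "\t", "\\": "\\"}


def unescape(text: str) -> str:
    """Replace recognised escape sequences in text with their characters.

    Sequences that do not match one of the assumed shapes are left as-is.
    """

    def replace(match: re.Match) -> str:
        simple = match.group(4)
        if simple is not None:
            return _SIMPLE[simple]
        hex_digits = match.group(1) or match.group(2) or match.group(3)
        code_point = int(hex_digits, 16)
        # Reject values that are not Unicode scalar values.
        if code_point > 0x10FFFF or 0xD800 <= code_point <= 0xDFFF:
            raise ValueError(f"invalid code point in escape: {match.group(0)}")
        return chr(code_point)

    return _ESCAPE.sub(replace, text)
```

With this sketch, unescape(r"a\xa0b") yields "a", a non-breaking space, and "b", which is the invisible-character case that motivated escapes in the first place. Note that the placeholder \Uabcdef from the list is above U+10FFFF, so a validating decoder like this one would reject it.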