Strings are quoted as if by Go's fmt %q verb, which renders non-printable Unicode code points as \uXXXX or \UXXXXXXXX. However, the Starlark scanner does not currently recognize this syntax, and the spec says nothing about the form of string literals.
Welcome to Starlark (go.starlark.net)
>>> chr(0x00A0) # NO-BREAK SPACE (non printable)
"\u00a0"
>>> "\u00a0"
"\\u00a0"
>>> chr(0x400) # CYRILLIC CAPITAL LETTER IE WITH GRAVE
"Ѐ"
>>> '\u0400'
"\\u0400"
>>> chr(0x0001f63f) # CRYING CAT FACE
"😿"
>>> '\U0001f63f'
"\\U0001f63f"
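The quoting half of this behavior can be reproduced in plain Go, independent of Starlark. The following is a minimal sketch; the helper name quote is made up for illustration, and the LANGUAGE TAG code point (U+E0001) is an extra case chosen here to show the 8-digit form, not something taken from the session above.

```go
package main

import "fmt"

// quote is a hypothetical helper that formats s with the %q verb,
// the same quoting the transcript above appears to rely on.
func quote(s string) string {
	return fmt.Sprintf("%q", s)
}

func main() {
	fmt.Println(quote("\u00a0"))     // NO-BREAK SPACE, not printable: "\u00a0"
	fmt.Println(quote("\u0400"))     // printable, emitted as-is: "Ѐ"
	fmt.Println(quote("\U0001f63f")) // printable, emitted as-is: "😿"
	fmt.Println(quote("\U000e0001")) // LANGUAGE TAG, not printable: "\U000e0001"
}
```

Only code points that Go considers non-printable are escaped, so the output of repr cannot be read back by a scanner that lacks \u and \U escapes.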
Contrast with Python3:
Python 3.6.5 (default, Mar 31 2018, 05:34:57)
>>> chr(0x00A0) # NO-BREAK SPACE (non printable)
'\xa0'
>>> '\xa0'
'\xa0'
>>> chr(0x400) # CYRILLIC CAPITAL LETTER IE WITH GRAVE
'Ѐ'
>>> '\u0400'
'Ѐ'
>>> '\U0001f63f'
'😿'
>>> chr(0x0001f63f) # CRYING CAT FACE
'😿'
The Starlark spec and implementations should allow \uXXXX and \UXXXXXXXX escapes within strings, with exactly 4 or 8 hex digits.
Python2 & 3 also accept \xXX escapes, with two hex digits. Should Starlark?
(FWIW: C++ and Go do too; Java does not, and furthermore its \uXXXX notation denotes a UTF-16 code unit, not a Unicode code point.)
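To make the proposed rule concrete, here is a rough sketch in Go of decoding \uXXXX and \UXXXXXXXX with exactly 4 or 8 hex digits. It is not the go.starlark.net scanner: decodeUnicodeEscapes is a hypothetical standalone helper, it leaves every other backslash sequence untouched, and rejecting surrogates and out-of-range values is an assumption rather than something the proposal above settles.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
	"unicode/utf8"
)

// decodeUnicodeEscapes is a hypothetical helper, not part of go.starlark.net.
// It expands \uXXXX and \UXXXXXXXX and copies everything else verbatim.
func decodeUnicodeEscapes(s string) (string, error) {
	var b strings.Builder
	for i := 0; i < len(s); {
		if s[i] != '\\' || i+1 >= len(s) {
			b.WriteByte(s[i])
			i++
			continue
		}
		var n int // number of hex digits required
		switch s[i+1] {
		case 'u':
			n = 4
		case 'U':
			n = 8
		default:
			// Some other escape (including \\): copy both bytes unchanged.
			b.WriteString(s[i : i+2])
			i += 2
			continue
		}
		if len(s)-(i+2) < n {
			return "", fmt.Errorf("truncated escape %q: want %d hex digits", s[i:], n)
		}
		esc := s[i : i+2+n]
		v, err := strconv.ParseUint(esc[2:], 16, 32)
		if err != nil {
			return "", fmt.Errorf("invalid escape %q: want %d hex digits", esc, n)
		}
		// Assumption: surrogates and values above U+10FFFF are errors.
		if !utf8.ValidRune(rune(v)) {
			return "", fmt.Errorf("escape %q is not a valid code point", esc)
		}
		b.WriteRune(rune(v))
		i += 2 + n
	}
	return b.String(), nil
}

func main() {
	for _, in := range []string{`\u00a0`, `\u0400`, `\U0001f63f`, `\\u0400`, `\u04`} {
		out, err := decodeUnicodeEscapes(in)
		fmt.Printf("%s -> %q, err=%v\n", in, out, err)
	}
}
```

Requiring a fixed digit count keeps the rule unambiguous: in "\u0400a" the trailing "a" is always a separate character. A \xXX case could be added to the switch with n = 2, but whether it should denote a byte or a code point is exactly the open question above.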