Closed by tahonermann 3 years ago
This was addressed by the adoption of P2029R4 for C++23. The wording in the current draft for [lex.phases]p5 now states:
Each basic-c-char, basic-s-char, and r-char in a character-literal or a string-literal, as well as each escape-sequence and universal-character-name in a character-literal or a non-raw string literal, is encoded in the literal's associated character encoding as specified in [lex.ccon] and [lex.string].
And a core issue never did get opened for this.
[lex.phases]p5 states:
This wording is incorrect for u8, u, and U literals (and a little hand-wavy for wide literals), since it states that they are converted to the execution character set. They should instead be converted to UTF-8, UTF-16, and UTF-32, respectively (and, for wide literals, to the wide execution character set).
@steve-downey requested that a core issue be filed on the core mailing list (http://lists.isocpp.org/core/2019/03/5770.php). The new issue hasn't been opened yet, but it is expected to be opened before the 2019 Cologne meeting.
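To illustrate the encodings described above, here is a minimal sketch that checks the code units produced for u8, u, and U string literals at compile time. The helper name code_units is hypothetical and exists only for this example. The characters are written as universal-character-names so that the result does not depend on the source file encoding, and the sketch assumes a C++11 or later compiler.

```cpp
#include <cstddef>

// Hypothetical helper for this example: number of code units in a string
// literal, excluding the terminating null.
template <typename T, std::size_t N>
constexpr std::size_t code_units(const T (&)[N]) {
    return N - 1;
}

// U+00E9 (LATIN SMALL LETTER E WITH ACUTE) is inside the BMP.
static_assert(code_units(u8"\u00E9") == 2, "UTF-8 uses two code units");
static_assert(code_units(u"\u00E9") == 1, "UTF-16 uses one code unit");
static_assert(code_units(U"\u00E9") == 1, "UTF-32 uses one code unit");

// U+1F600 is outside the BMP.
static_assert(code_units(u8"\U0001F600") == 4, "UTF-8 uses four code units");
static_assert(code_units(u"\U0001F600") == 2, "UTF-16 uses a surrogate pair");
static_assert(code_units(U"\U0001F600") == 1, "UTF-32 uses one code unit");

// The individual code unit values match the respective UTF encoding forms.
// The cast covers both char (before C++20) and char8_t (C++20 and later).
static_assert(static_cast<unsigned char>(u8"\u00E9"[0]) == 0xC3, "UTF-8 lead byte");
static_assert(static_cast<unsigned char>(u8"\u00E9"[1]) == 0xA9, "UTF-8 trail byte");
static_assert(u"\U0001F600"[0] == 0xD83D, "UTF-16 high surrogate");
static_assert(u"\U0001F600"[1] == 0xDE00, "UTF-16 low surrogate");
static_assert(U"\U0001F600"[0] == 0x1F600, "UTF-32 scalar value");
```

None of these checks depend on the ordinary execution character set, which is why wording that routes these literals through it is misleading.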