Open keinflue opened 1 week ago
@llvm/issue-subscribers-clang-frontend
Author: None (keinflue)
maybe dup: https://github.com/llvm/llvm-project/issues/97741
CC @cor3ntin
I think this is distinct from #97741 -- this is incorrect acceptance of invalid character and string literal tokens whose end cannot be determined due to invalid escape sequences, whereas #97741 is about failing to reject well-formed but invalid UCNs in a string literal that can be tokenized.
After P2621R2, which is a defect report, the following program is ill-formed in C++ (UB beforehand):
This is ill-formed already in translation phase 3 when lexing into preprocessing tokens, because `\N` not followed by `{` can't be a named-universal-character and `\N` also can't begin any escape-sequence. Therefore `'\N'` can't be a character-literal and `'` will form a single-character preprocessing token by itself. [lex.pptoken]/2 makes this ill-formed (UB before P2621R2).

The C++ status page claims that P2621R2 is already supported, but Clang compiles this without diagnostic (https://godbolt.org/z/WxvcfPj8a).
The same happens with other invalid escape sequences and string literals as well.
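To make the lexing argument concrete, here is a rough sketch of the check described above: given the text right after a backslash inside a literal, can it begin any escape-sequence or universal-character-name at all? This is not Clang's lexer, and the names `sketch::can_begin_escape`, `braced`, and `hex_run` are made up for illustration. It only looks at the shape of the sequence (it does not validate that a `\N{...}` name or a UCN value actually exists, which is the #97741 side of things), and it approximates "member of the basic character set" with printable ASCII for conditional-escape-sequences.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <string_view>

// Hypothetical helpers for illustration only; not part of Clang or the standard.
namespace sketch {

inline bool is_hex(char c) { return std::isxdigit(static_cast<unsigned char>(c)) != 0; }
inline bool is_oct(char c) { return c >= '0' && c <= '7'; }

// True if s starts with '{', then one or more chars accepted by pred, then '}'.
template <class Pred>
bool braced(std::string_view s, Pred pred) {
    if (s.empty() || s.front() != '{') return false;
    std::size_t i = 1;
    while (i < s.size() && pred(s[i])) ++i;
    return i > 1 && i < s.size() && s[i] == '}';
}

// True if s has at least n characters and the first n are all hex digits.
inline bool hex_run(std::string_view s, std::size_t n) {
    if (s.size() < n) return false;
    for (std::size_t i = 0; i < n; ++i)
        if (!is_hex(s[i])) return false;
    return true;
}

// `rest` is the text immediately following a backslash inside a literal.
// Returns whether it can begin an escape-sequence or universal-character-name.
inline bool can_begin_escape(std::string_view rest) {
    if (rest.empty()) return false;
    const char c = rest.front();
    const std::string_view tail = rest.substr(1);

    // simple-escape-sequence
    if (std::string_view("'\"?\\abfnrtv").find(c) != std::string_view::npos) return true;
    // octal-escape-sequence without braces
    if (is_oct(c)) return true;

    switch (c) {
    case 'x': return (!tail.empty() && is_hex(tail.front())) || braced(tail, is_hex);
    case 'o': return braced(tail, is_oct);
    case 'u': return hex_run(tail, 4) || braced(tail, is_hex);
    case 'U': return hex_run(tail, 8);
    // named-universal-character: requires '{' n-char-sequence '}'
    case 'N': return braced(tail, [](char ch) { return ch != '}' && ch != '\n'; });
    // conditional-escape-sequence (simplification: any other printable ASCII)
    default:  return std::isprint(static_cast<unsigned char>(c)) != 0;
    }
}

// A few cases matching the discussion above.
inline void check_examples() {
    assert(!can_begin_escape("N'"));                        // the '\N' case
    assert(can_begin_escape("N{LATIN SMALL LETTER A}'"));   // well-shaped \N{...}
    assert(can_begin_escape("n'"));                         // ordinary '\n'
}

}  // namespace sketch
```

With a check like this, the `\N` in `'\N'` is rejected at the point of forming the token, so no character-literal can be produced and the lone `'` falls under [lex.pptoken]/2. Under the same assumptions, `\x`, `\o`, `\u`, and `\U` with nothing valid after them fail in the same way.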