Open pskocik opened 4 weeks ago
@llvm/issue-subscribers-clang-frontend
Author: None (pskocik)
Looks like the reason it works with `_` is because `1_` is lexed as an integer literal w/ a (UDL) suffix of `_`: https://godbolt.org/z/echxb68W6.
`1foo` is a valid pp-number in both C and C++ (regardless of UDL); basically anything that starts with a digit is a pp-number (https://eel.is/c++draft/lex.ppnumber#nt:pp-number).
This lets the preprocessor not care too much about parsing numbers (and some of these pp-numbers can indeed end up being valid UDLs).
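A minimal sketch of what this means for token pasting, assuming only standard C; the names `PASTE`, `x1foo`, and `paste_demo` are made up for illustration. Because `1foo` is lexed as one pp-number, `##` joins it to the left-hand token as a unit:

```c
#include <assert.h>

/* Hypothetical helper: paste two preprocessing tokens into one. */
#define PASTE(a, b) a##b

/* An ordinary identifier that happens to end in "1foo". */
int x1foo = 7;

/* `1foo` is a single pp-number, so PASTE(x, 1foo) yields the one
   identifier `x1foo`. Were it lexed as `1` followed by `foo`, the
   result would be `x1 foo`, which would not compile here. */
int paste_demo(void) {
    int value = PASTE(x, 1foo);
    assert(value == x1foo);
    return value;
}
```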
Note that `1$` is not valid outside of the preprocessor, but `X1$` ought to be (in modes where `$` is valid in identifiers).
It's also true that we have a bunch of UDL-related bugs: https://godbolt.org/z/WsxncWeYM
That being said, I do believe encouraging the use of `$` in identifiers is not advisable given the negative impact that has on the evolvability of C++, so I don't know if we want to expend a lot of energy fixing these corner cases with the use of `$`.
> Looks like the reason it works with `_` is because `1_` is lexed as an integer literal w/ a (UDL) suffix of `_`: https://godbolt.org/z/echxb68W6.
Looks like it works with any `[0-9A-Z_a-z]*` suffix (even an empty one; same on other C compilers), but fails when a token is formed that starts with a digit and has any `$` in it, even if later concatenation makes it not start with a digit (and such a final form of the token would otherwise be accepted if input directly). https://godbolt.org/z/v715E3n8n
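A small sketch of the accepted cases, assuming standard C99 or later (empty macro arguments); the helper names `CAT`/`CAT_` and the variable names are hypothetical and not taken from the linked example:

```c
#include <assert.h>

/* Hypothetical helpers: CAT expands its arguments before pasting,
   CAT_ does the actual paste. */
#define CAT_(a, b) a##b
#define CAT(a, b) CAT_(a, b)

/* Each inner CAT(1, ...) forms a digit-first token (`1abc9_`, `1`),
   which the outer CAT then turns into an ordinary identifier. */
int CAT(v, CAT(1, abc9_)) = 10; /* declares v1abc9_ */
int CAT(v, CAT(1, )) = 20;      /* empty suffix: declares v1 */

int suffix_sum(void) {
    assert(v1abc9_ != v1); /* two distinct variables were declared */
    return v1abc9_ + v1;
}
```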
I was testing with (and am interested in) the C frontend, but it looks like it behaves the same with `-xc++`.
This is probably an extension, but gcc, tinycc, and clang all support transient weird tokens in preprocessor token concatenation; e.g., a concatenation that transiently creates the weird token `1_` works on all three.
On clang, unlike the other two, this does not work with `$` in place of `_`, or anything containing `$` in the suffix.
Could be worth fixing. It's kind of a weird inconsistency, even within clang itself.
https://godbolt.org/z/YGzd1h1Gc
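A minimal sketch of the kind of concatenation described above, not necessarily the exact snippet behind the link; the macro names `CAT`/`CAT_` and the `TRY_DOLLAR` switch are hypothetical. The `$` variant is left disabled by default because, per the report, clang rejects it:

```c
#include <assert.h>

/* Hypothetical two-level paste helpers (names are illustrative). */
#define CAT_(a, b) a##b
#define CAT(a, b) CAT_(a, b)

/* The inner paste forms the transient token `1_`; the outer paste
   then produces the ordinary identifier `x1_`. */
int CAT(x, CAT(1, _)) = 42;

#ifdef TRY_DOLLAR /* hypothetical opt-in switch, off by default */
/* Same shape with `$`. Per the report above, gcc and tinycc accept
   this while clang rejects it; it also relies on `$` being allowed
   in identifiers, which is an extension. */
int CAT(x, CAT(1, $)) = 43;
#endif

int read_x1_(void) {
    assert(x1_ == 42);
    return x1_;
}
```

Building with `-DTRY_DOLLAR` on each compiler is one way to reproduce the inconsistency.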