ned14 / pcpp

A C99 preprocessor written in pure Python

Token pasting operator ## always generates ID token #19

Closed. Sei-Lisa closed this issue 5 years ago.

Sei-Lisa commented 5 years ago

When using the token pasting operator (##), the resulting token always has type ID. This is a problem in expressions. Test case:

#define PASTE(x, y) x ## y
#if PASTE(1, 2) == 12
  works
#else
  fails
#endif

Both gcc and mcpp output 'works'; pcpp outputs 'fails'. The cause seems to be that the pasted token is always given type CPP_ID, so the pasted token 12 is interpreted as an identifier and replaced with 0 when the #if expression is evaluated.

I guess a solution would be to convert the tokens to strings before concatenating them, then lex the resulting string again. If the result is not exactly one token, an error should be emitted; for example, PASTE(+,-) is not valid. However, I don't think pcpp can catch that case, because it doesn't seem to recognize tokens like ++ or *=, which are valid, so such a check could reject valid code.
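
The re-lexing idea above could be sketched roughly as follows. This is only a standalone illustration, not pcpp's actual code: the token type names (NUMBER, ID, PUNCT), the simplified regular expressions, and the helper functions lex and paste are all assumptions made for the example.

```python
import re

# Hypothetical, simplified token classes for this sketch (not pcpp's own).
TOKEN_SPEC = [
    # Preprocessing number: optional '.', a digit, then digits/letters/'.'
    # or an exponent sign such as 'e+'.
    ("NUMBER", r"\.?\d(?:[eEpP][+-]|[\w.])*"),
    ("ID", r"[A-Za-z_]\w*"),
    # Multi-character punctuators come first so that '++' or '*=' lex as one token.
    ("PUNCT", r"<<=|>>=|\.\.\.|->|\+\+|--|<<|>>|<=|>=|==|!=|&&|\|\|"
              r"|[-+*/%&|^]=|##|[-+*/%&|^~!=<>?:;,.#()\[\]{}]"),
]
MASTER = re.compile("|".join(f"(?P<{name}>{pat})" for name, pat in TOKEN_SPEC))


def lex(text):
    """Split text (no whitespace expected) into (type, spelling) pairs."""
    tokens = []
    pos = 0
    while pos < len(text):
        match = MASTER.match(text, pos)
        if match is None:
            raise ValueError(f"cannot lex {text[pos:]!r}")
        tokens.append((match.lastgroup, match.group()))
        pos = match.end()
    return tokens


def paste(left, right):
    """Concatenate two token spellings and re-lex the result.

    The result must be exactly one token; otherwise the paste is invalid.
    """
    tokens = lex(left + right)
    if len(tokens) != 1:
        raise ValueError(f"pasting {left!r} and {right!r} does not give a valid token")
    return tokens[0]
```

With this approach, paste("1", "2") yields a single NUMBER token rather than an identifier, so the #if comparison would see 12, while paste("+", "-") is rejected because the concatenation lexes as two tokens. Rejecting invalid pastes like this only works if the lexer knows the multi-character punctuators, which is the concern raised above.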

ned14 commented 5 years ago

Fixed. Thanks for the bug report.