skvadrik / re2c

Lexer generator for C, C++, Go and Rust.
https://re2c.org

Handle single quotes correctly in the lexer. #482

Closed skvadrik closed 3 weeks ago

skvadrik commented 1 month ago

As mentioned in the comments to https://github.com/skvadrik/re2c/issues/450, different languages use standalone single quotes for different purposes:

All these cases should be handled by the lexer, so that it doesn't mistake a standalone single quote for the beginning of a string literal.

skvadrik commented 3 weeks ago

Instead of handling each possible case in every language, re2c takes a different approach. It only needs to make sure that a closing brace is not part of a single-quoted string or char literal, so when it encounters a single quote in a semantic action, it looks for a string (if the language allows single-quoted strings) or a char literal. If it fails to find either, the single quote is ignored. Previously re2c also checked whether standalone single quotes are allowed by the language, but there's no need for this check, as re2c does not attempt to parse and validate user code.
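
To illustrate the idea, here is a rough Python sketch of that strategy. It is not re2c's actual code: the function name `find_closing_brace`, the `single_quoted_strings` flag and the three regular expressions are all hypothetical and heavily simplified (for example, comments are not handled). It scans a semantic action for the matching closing brace and, on a single quote, tries to match a string or char literal; if neither matches, the quote is skipped as an ordinary character.

```python
import re

# Hypothetical, simplified literal patterns (assumptions for illustration only).
# Char literal: one escape sequence such as \n, \' or \x41, or one plain character.
CHAR_LITERAL = re.compile(r"'(?:\\.[^'\n]*|[^'\\\n])'")
# Single-quoted string: any run of escaped or plain characters on one line.
SINGLE_QUOTED_STRING = re.compile(r"'(?:\\.|[^'\\\n])*'")
DOUBLE_QUOTED_STRING = re.compile(r'"(?:\\.|[^"\\\n])*"')


def find_closing_brace(code: str, open_pos: int,
                       single_quoted_strings: bool = False) -> int:
    """Return the index of the brace closing the one at open_pos, or -1."""
    if code[open_pos] != "{":
        raise ValueError("open_pos must point at an opening brace")
    depth = 0
    i = open_pos
    while i < len(code):
        c = code[i]
        if c == '"':
            m = DOUBLE_QUOTED_STRING.match(code, i)
            if m:
                i = m.end()  # skip the whole string literal
                continue
        elif c == "'":
            m = None
            if single_quoted_strings:
                m = SINGLE_QUOTED_STRING.match(code, i)
            if m is None:
                m = CHAR_LITERAL.match(code, i)
            if m:
                i = m.end()  # skip the whole string or char literal
                continue
            # No literal found: a standalone quote (e.g. a Rust lifetime),
            # so it is ignored and scanning continues with the next character.
        elif c == "{":
            depth += 1
        elif c == "}":
            depth -= 1
            if depth == 0:
                return i
        i += 1
    return -1


# A brace inside a char literal does not close the action.
action = "{ let c = '}'; }"
print(find_closing_brace(action, 0) == len(action) - 1)

# Standalone quotes (Rust lifetimes here) are ignored.
action = "{ fn f<'a>(s: &'a str) -> &'a str { s } }"
print(find_closing_brace(action, 0) == len(action) - 1)
```

Note that the sketch never asks whether the language permits standalone single quotes. It only decides whether a quote starts a literal that could hide a closing brace, which is the one thing the brace matching depends on.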