skvadrik / re2c

Lexer generator for C, C++, Go and Rust.
https://re2c.org

are quotes mandatory around literals? #458

Closed lucas-mior closed 9 months ago

lucas-mior commented 9 months ago

Most regex parsers let you write a regex like this (possibly without the surrounding quotes, depending on the language):

    re = "literal[A-Z]+"

You don't have to tell the regex parser that literal is a string literal.
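For illustration only, here is a minimal sketch of that conventional behaviour using Python's standard `re` module. Python is picked just as one example of a "usual" regex engine, and the helper name `is_match` is hypothetical; none of this is re2c syntax.

```python
import re

# Conventional regex syntax: the text "literal" needs no extra quoting
# inside the pattern; it is matched character by character.
PATTERN = re.compile(r"literal[A-Z]+")


def is_match(text: str) -> bool:
    """Return True if the whole of `text` matches literal[A-Z]+."""
    return PATTERN.fullmatch(text) is not None


print(is_match("literalABC"))  # True
print(is_match("literal"))     # False: [A-Z]+ needs at least one uppercase letter
print(is_match("LITERALABC"))  # False: the literal part is case-sensitive
```

The same pattern string would be accepted unchanged by many other regex libraries, which is the expectation the question starts from.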

From what I have seen in the docs, re2c expects the same regex to be written like this:

    re = "literal" [A-Z]+

This is confusing, because people writing regexes have to keep the difference in mind.

Is there a simpler way? Or a tool similar to re2c that might do what I expect? Thanks in advance.

skvadrik commented 9 months ago

Hi, re2c has an option --flex-syntax that allows unquoted string literals.

A similar tool that you may want to consider is Flex.

lucas-mior commented 9 months ago

It seems that --flex-syntax treats the whole regex as a string literal. Am I missing something?

skvadrik commented 9 months ago

No, --flex-syntax allows character classes and the usual regex operators (alternative, star, plus, etc.). If you have an example grammar that is not working you can post it here and I'll try to help.

pmetzger commented 9 months ago

As a general note, it is traditional that lexical analyzer generators use non-standard regex notation because of the inclusion of macros, the needs of lexers, etc.