goffrie / plex

a parser and lexer generator as a Rust procedural macro
Apache License 2.0

Naming regex patterns? #41

Open osa1 opened 4 years ago

osa1 commented 4 years ago

Suppose I have this lexical syntax for a token:

Identifier ::= <Initial> <Subsequent>*
Initial ::= a..z
Subsequent ::= Initial | ...

Here the Initial part is used in both Identifier and Subsequent. As far as I understand, the only way to express this in plex's lexer! today is to duplicate the Initial part, something like:

lexer! {
    ...
    r"[a-z]([a-z]|[<subsequent>])*" => ...
}

In the simplified example above, repeating [a-z] is not too bad, but in my actual use case the syntax is much more complex and the repetition is a real problem.

Ideally I should be able to give this regex [a-z] a name and use it in other regex patterns. Maybe something like:

lexer! {
    ...
    let initial = "[a-z]"
    let subsequent = "..."
    r"$initial($initial|[$subsequent])*" => ...
}

Is this currently possible with plex? If not, I think it would be a useful addition.

goffrie commented 4 years ago

This isn't possible today. It's a good idea though! Unfortunately I don't have much time these days to work on plex :(
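
For reference, the substitution the proposal describes could be sketched as plain string expansion done before the pattern is compiled. This is only an illustration of the idea, not plex code: `expand` is a hypothetical helper, the `$name` syntax is taken from the proposal above, and the `0-9_` fragment used for `subsequent` in the usage note is a made-up placeholder, since the issue leaves it unspecified.

```rust
use std::collections::HashMap;

/// Hypothetical helper (not part of plex): replaces every `$name` in
/// `pattern` with the regex fragment registered under `name` in `defs`.
/// Returns an error if a referenced name has no definition.
fn expand(pattern: &str, defs: &HashMap<&str, &str>) -> Result<String, String> {
    let mut out = String::new();
    let mut rest = pattern;
    while let Some(pos) = rest.find('$') {
        // Copy everything before the `$` unchanged.
        out.push_str(&rest[..pos]);
        let after = &rest[pos + 1..];
        // A name is a run of ASCII letters, digits, or underscores.
        let len = after
            .find(|c: char| !(c.is_ascii_alphanumeric() || c == '_'))
            .unwrap_or(after.len());
        if len == 0 {
            // A `$` that is not followed by a name is kept as-is.
            out.push('$');
        } else {
            let name = &after[..len];
            let fragment = defs
                .get(name)
                .ok_or_else(|| format!("undefined pattern name: {}", name))?;
            out.push_str(fragment);
        }
        rest = &after[len..];
    }
    // Copy whatever follows the last `$`.
    out.push_str(rest);
    Ok(out)
}
```

With `initial` mapped to `[a-z]` and `subsequent` mapped to the placeholder `0-9_`, expanding `$initial($initial|[$subsequent])*` gives `[a-z]([a-z]|[0-9_])*`, which is the duplicated form from the first example. A real implementation inside lexer! would have to do this at macro expansion time and decide how names interact with a literal `$` in a pattern.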