mna / pigeon

Command pigeon generates parsers in Go from a PEG grammar.
BSD 3-Clause "New" or "Revised" License

Use regexp in grammar rules? #141

Closed by xrstf 7 months ago

xrstf commented 7 months ago

My grammar has rules like

Identifier <- [a-zA-Z_+/*_%?-][a-zA-Z0-9_+/*_%?!-]* {
  ....
}

I would like to make this regular expression available as a Go variable. From what I can see, Pigeon dissects the pattern and generates custom matching code. I tried to define a regexp in the initializer:

{
package parser

import "regexp"

var identifierPattern = regexp.MustCompile(`[a-zA-Z_+/*_%?-][a-zA-Z0-9_+/*_%?!-]*`)
}

Identifier <- identifierPattern {
   ...
}

but this does not yield the expected result.

Is there a way to maintain the regexp just once in my codebase, instead of leaving a comment like

// If you update this rule, you must also update the variable XYZ over there!!
Identifier <- ....

breml commented 7 months ago

No, there is no way to reuse such a rule as a regex in Go code, since PEG and regex are two distinct things. The only idea that comes to mind is to define multiple entry points for your grammar and, instead of a regex, use the PEG parser in the places where you currently use the regex.

See the -alternate-entrypoints=RULE[,RULE...] flag.
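
As a rough sketch of that suggestion, the check that currently uses a regexp could call the parser through a small wrapper instead. Everything named below is an assumption for illustration: parseFunc, isIdentifier, inClass and stubParse are hypothetical helpers, and stubParse is a hand-written placeholder (not pigeon output) that mimics the Identifier rule so the example runs on its own. In a real project the wrapper would presumably call the generated parser, along the lines of Parse("", []byte(s), Entrypoint("IdentifierOnly")), where IdentifierOnly is a hypothetical extra rule such as `Identifier !.` listed in -alternate-entrypoints; the `!.` is there because a PEG rule may otherwise succeed on just a prefix of the input.

```go
package main

import (
	"errors"
	"fmt"
)

// parseFunc stands in for a call into the pigeon-generated parser.
// Assumption: in a real project it would wrap something like
//
//	_, err := Parse("", []byte(s), Entrypoint("IdentifierOnly"))
//
// where IdentifierOnly is a hypothetical rule defined as `Identifier !.`.
type parseFunc func(input string) error

// isIdentifier reports whether the parser accepts s as an identifier.
// Callers use this instead of a separately maintained regexp.
func isIdentifier(parse parseFunc, s string) bool {
	return parse(s) == nil
}

// inClass reports whether c is allowed by the rule's character classes.
// first selects the class for the leading character, which excludes
// digits and '!'.
func inClass(c byte, first bool) bool {
	switch {
	case c >= 'a' && c <= 'z', c >= 'A' && c <= 'Z':
		return true
	case c >= '0' && c <= '9', c == '!':
		return !first
	}
	switch c {
	case '_', '+', '/', '*', '%', '?', '-':
		return true
	}
	return false
}

// stubParse is a placeholder that mimics the Identifier rule from the
// question. It only exists so this sketch compiles without generated code.
func stubParse(input string) error {
	if input == "" {
		return errors.New("no match: empty input")
	}
	for i := 0; i < len(input); i++ {
		if !inClass(input[i], i == 0) {
			return fmt.Errorf("no match at offset %d", i)
		}
	}
	return nil
}

func main() {
	for _, s := range []string{"foo-bar?", "set!", "9lives", ""} {
		fmt.Printf("%q -> %v\n", s, isIdentifier(stubParse, s))
	}
}
```

With this shape the grammar remains the only definition of what an identifier is; the cost is that every check goes through the parser rather than a compiled regexp.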

xrstf commented 7 months ago

Thanks, then I'll just leave copious amounts of comments to prevent me from f*ing up in the future 😁