Open williamsharkey opened 3 years ago
Follow-up for anyone who has this issue in the future: it may not be the prettiest solution, but redefining the comment literal *worked for me*.
My fork just adds one small function:
// Set a custom comment character. Default is #
func CommentCharacterSet(s string) {
	rComment.Ope = Seq(Lit(s), Zom(Seq(Npd(&rEndOfLine), Dot())), &rEndOfLine)
}
Before parsing, call peg.CommentCharacterSet(":note:") if you want your comments to start with :note:, for example.
Doing so allowed me to define precedence for the #OR# operator just fine.
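For anyone wondering why changing the comment prefix helps, here is a minimal, standalone sketch of the idea. It is not go-peg code: stripComments is a hypothetical helper, and the option line "L #OR#" is only an illustrative stand-in for a precedence entry, not go-peg's actual option syntax. It also strips from the first occurrence of the prefix anywhere on the line, which is cruder than the real Comment rule (that one only applies where spacing is allowed).

```go
package main

import (
	"fmt"
	"strings"
)

// stripComments drops everything from the first occurrence of prefix to the
// end of each line, roughly mirroring
//   Comment <- prefix (!EndOfLine .)* EndOfLine
// Hypothetical helper for illustration only; not part of go-peg.
func stripComments(src, prefix string) string {
	if prefix == "" {
		return src // an empty prefix would match everywhere, so do nothing
	}
	lines := strings.Split(src, "\n")
	for i, line := range lines {
		if idx := strings.Index(line, prefix); idx >= 0 {
			lines[i] = line[:idx]
		}
	}
	return strings.Join(lines, "\n")
}

func main() {
	// With '#' as the comment prefix, the operator itself is swallowed.
	fmt.Printf("%q\n", stripComments("L #OR#", "#")) // "L "

	// With ':note:' as the prefix, '#OR#' survives and comments still work.
	fmt.Printf("%q\n", stripComments("L #OR# :note: weaker", ":note:")) // "L #OR# "
}
```

With the default prefix the operator never reaches the options parser, which matches the behaviour described above; with a prefix that cannot collide with an operator, it does.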
@williamsharkey, thanks for the feedback. The '#' comment format actually comes from Bryan Ford's original paper. Here is the excerpt of the PEG grammar on the 2nd page of the paper:
Spacing <- (Space / Comment)*
Comment <- ’#’ (!EndOfLine .)* EndOfLine
Space <- ’ ’ / ’\t’ / EndOfLine
EndOfLine <- ’\r\n’ / ’\n’ / ’\r’
EndOfFile <- !.
Of course it's OK to modify the PEG grammar in your own projects, but I wonder whether it's the right way to solve this issue, because it breaks compatibility with the original PEG grammar...
Instead of touching the comment format, I tried allowing literal operators in the infix expression parsing format in cpp-peglib (not go-peg, sorry...), and here is the commit. Users can now specify operators like #plus# and #multiply#.
Here is the actual cpp-peglib example:
START <- _ EXPRESSION
EXPRESSION <- ATOM (OPERATOR ATOM)* {
  precedence
    L '#plus#' -     # weaker
    L '#multiply#' / # stronger
}
ATOM <- NUMBER / T('(') EXPRESSION T(')')
OPERATOR <- T('#plus#' / '#multiply#' / [-/])
NUMBER <- T('-'? [0-9]+)
~_ <- [ \t]*
T(S) <- < S > _
As you can see, we can still use both '#' comments and operators containing '#'. I don't have enough time to implement something similar in go-peg, but I hope this is helpful for you.
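For readers who want to see what the precedence block above is asking for, here is a rough, standalone Go sketch of precedence climbing over the same four operators. It is neither cpp-peglib's nor go-peg's implementation: the ops table, the parser type and eval are hypothetical names, and tokens are assumed to be separated by whitespace, which the grammar above does not require.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// opInfo describes a left-associative binary operator.
type opInfo struct {
	prec  int // higher binds tighter
	apply func(a, b int) int
}

// ops is a hypothetical table mirroring the two precedence levels above.
var ops = map[string]opInfo{
	"#plus#":     {1, func(a, b int) int { return a + b }},
	"-":          {1, func(a, b int) int { return a - b }},
	"#multiply#": {2, func(a, b int) int { return a * b }},
	"/":          {2, func(a, b int) int { return a / b }},
}

// parser walks a list of whitespace-separated tokens.
type parser struct {
	toks []string
	pos  int
}

// atom parses a single integer operand.
func (p *parser) atom() (int, error) {
	if p.pos >= len(p.toks) {
		return 0, fmt.Errorf("unexpected end of input")
	}
	n, err := strconv.Atoi(p.toks[p.pos])
	if err != nil {
		return 0, fmt.Errorf("expected number, got %q", p.toks[p.pos])
	}
	p.pos++
	return n, nil
}

// expr parses operators whose precedence is at least minPrec.
func (p *parser) expr(minPrec int) (int, error) {
	lhs, err := p.atom()
	if err != nil {
		return 0, err
	}
	for p.pos < len(p.toks) {
		tok := p.toks[p.pos]
		op, ok := ops[tok]
		if !ok {
			return 0, fmt.Errorf("unknown operator %q", tok)
		}
		if op.prec < minPrec {
			break // leave weaker operators to the caller
		}
		p.pos++
		// prec+1 makes every operator left-associative.
		rhs, err := p.expr(op.prec + 1)
		if err != nil {
			return 0, err
		}
		if tok == "/" && rhs == 0 {
			return 0, fmt.Errorf("division by zero")
		}
		lhs = op.apply(lhs, rhs)
	}
	return lhs, nil
}

// eval evaluates a whitespace-separated infix expression.
func eval(src string) (int, error) {
	p := &parser{toks: strings.Fields(src)}
	return p.expr(1)
}

func main() {
	v, err := eval("1 #plus# 2 #multiply# 3")
	fmt.Println(v, err) // 7 <nil>
}
```

Because operators are looked up as whole tokens, a '#' inside an operator name never needs escaping here; only the grammar's own comment syntax has to stay out of the way.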
I need to parse "a#OR#b". In the grammar, I can match #OR# just fine, by using the literal '#OR#'.
Because #OR# is a binary operator amongst other binary operators, I need to specify precedence.
In the options section, I can't figure out how to escape the hash. I'm assuming the escape sequences \# (and perhaps tilde, ~#?) don't work here.
I would try to add an escape method to the options parser myself and send a pull request, but I don't understand the definitions in parser.go well enough. I think it may involve modifying these lines:
https://github.com/yhirose/go-peg/blob/a1af152bac31a6323c2d4b8870ceff8520c93fb4/parser.go#L122-L125
(Side note: do people actually write grammars in this functional style, or is this automated output from this tool? It seems difficult for humans to read, at least for me.)
Because this repo is deprecated in favor of the C++ one, I don't expect assistance. Instead I am going to fork this repo and add a method for changing the comment character ('#') to a user-defined comment character, i.e., a function that redefines rComment.Ope.
https://github.com/yhirose/go-peg/blob/a1af152bac31a6323c2d4b8870ceff8520c93fb4/parser.go#L102
I know that redefining the comment character isn't the best way to fix this, but I'm pretty confident it will work for what I need. If anyone more experienced feels like showing me how to escape sequences in the precedence option section, that would be great too.