Open victorhooi opened 9 years ago
The only problem that I can see is my peg doesn't support
named:$(captures). My peg only supports
The easiest way to implement import would be to add an export feature to peg.js.
On Mon, Jul 20, 2015 at 4:27 PM, Victor Hooi notifications@github.com wrote:
I'm wondering how feasible it would be to port a peg.js grammar file to this (peg)?
I'm new to PEGs, but just skimming through the docs, it seems like it should be easy just to translate each line 1:1 - is there anything we need to be wary of?
Curious if it would ever be possible to have a feature to automatically import a grammar file from peg.js? =) (i.e. if this is something that can easily be automated).
— Reply to this email directly or view it on GitHub https://github.com/pointlander/peg/issues/40.
Aha, so for the capture groups, I can just use positional capture groups, rather than named capture groups, right?
The README doesn't say too much on how to extract the values - but I gather it's shoved in a variable called "buffer", which I can then read?
Do you think named capture groups be addd to peg at some point?
For lack of named capture groups, it's default to process nested grammar structure, e.g. protobuf
or thrift
Do you mind providing an example of how to emulate named capture groups? Coming from peg.js, this has been a sticking point for me. I really want to use this library.
Here's an example:
Assignment = key:Identifier _ "=" _ value:Value {
return {
type: "assignment",
key: key,
value: value,
}
}
How would I parse key
and value
without capture groups?
You can use AST in which case everything is captured. Example code: https://github.com/pointlander/peg/tree/master/grammars/calculator_ast
Thanks for your response! I did look at the calculator example and tried to figure out how to apply it. The part that I can't wrap my head around is how to handle capture groups that are not leaf nodes.
In the example above, how to pull out Identifier
and Value
which are themselves a subtree with child nodes.
Split it up into multiple rules?
(untested)
type Example Peg {
id, value string
}
assignment = identifier "=" value { p.AddAssignment(p.id, p.value) }
identifier = < .+ > { p.id = text }
value = < .+ > { p.value = text }
Ahh that's interesting, thanks for sharing. I think the missing piece for me is an understanding of how to build a tree using a stack, specifically that p.AddAssignment
and how to build down (or up?) from there. Any pointers or resources on how to learn more would be greatly appreciated.
Looking at https://github.com/pointlander/peg/blob/master/peg.peg it appears that nodes are getting pushed on, but it's not clear to me how to make sense of this.
peg.peg looks like it makes heavy use of <...>
and text
to build the Peg tree.
In my example above, identifier
is parsed ("matched"?) first, and thus p.id
is assigned, and then value
is parsed, and thus p.value
is assigned, and then the assignment
rule is matched, and so p.AddAssignment(...)
runs.
(Also, since AddAssignment
's two parameters are also fields of p
, you could just drop them and have AddAssignment
reference p.id
and p.value
directly.)
Does that make sense? Printf
is your friend, and the generated Go state machine is also useful in figuring out what gets called where and how. For example this line in peg.peg matches up directly to this line in peg.peg.go.
HTH.
So ... I just realized that the file I linked you to is the file you were already looking at. Sorry about that. But maybe with the rest of my comment, you can make sense of it.
The suggestion to use fmt.Printf
to see the execution order is incredibly helpful!
Also I realized after staring at the syntax more that you can add multiple captures within a single line. For example:
https://github.com/pointlander/peg/blob/master/peg.peg#L22-L26
I thought it was only for OR expressions (e.g. Text / Attribute
). This appears to cover the need for named capture groups that peg.js supports since you can always break an expression up.
Thanks for helping me understand this better!
Quick followup: This project finally clicked for me 🙂
I'm working on a route parser similar to similar to https://github.com/pillarjs/path-to-regexp in Express.js. The parser's state looks like this:
type parser Peg {
route *Route
segments []Segment
key Key
rkey *RegexpKey
bkey *BasicKey
optional *OptionalKey
slash *Slash
regexp *Regexp
identifier *Identifier
text *Text
}
With the help of go-naturaldate, I realized that you need to treat those fields (segments, keys, etc.) as temporary scratch state.
Then you rely on the execution order of the algorithm to fill them in before they're needed. The fmt.Printf
advice helped me figure that out.
For anyone else that struggled to make the leap from peg.js like me, here's a full example of the route parser. This example includes a peg.js grammar that I used to first prototype the parser.
Thanks for your patience and assistance. This is a very cool project!
I'm wondering how feasible it would be to port a peg.js grammar file to this (peg)? Example:
https://raw.githubusercontent.com/mongodb-js/log/88daf5f4259652f95f2d4a7a01bd3f14751e4812/mongodb-log.pegjs
I'm new to PEGs, but just skimming through the docs, it seems like it should be easy just to translate each line 1:1 - is there anything we need to be wary of?
Curious if it would ever be possible to have a feature to automatically import a grammar file from peg.js? =) (i.e. if this is something that can easily be automated).