pointlander / peg

Peg, Parsing Expression Grammar, is an implementation of a Packrat parser generator.
BSD 3-Clause "New" or "Revised" License

Execute user code in Parse() rather than Execute() #53

Open wvxvw opened 8 years ago

wvxvw commented 8 years ago

I was hoping to use this parser generator to parse YAML, but to do this, I need to depend on indenting while matching rules. Executing user code in Execute() thus is too late for me.

Is it possible, and if not, how difficult would it be to add functionality needed to do something that'd enable early code execution?

wvxvw commented 8 years ago

Or, will &{ ... } actually do it? If so, it'd probably help to update the documentation on the landing page.

pointlander commented 8 years ago

Some PEGs do support early code execution (http://piumarta.com/software/peg/), but I don't think it is a good idea in general, because PEGs backtrack. Example: given a <- { code } 'a' / 'b', early code execution would run 'code' even if the next character isn't 'a'. You might want to use https://github.com/go-yaml/yaml instead. You could still use my implementation of peg by handling significant whitespace after parsing: call AST() and then transform the AST as a function of the whitespace.
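The backtracking concern above can be illustrated with a small hand-written sketch. It is not code generated by peg; `matchEager` and `matchDeferred` are hypothetical names that model the rule a <- { code } 'a' / 'b' with the action run eagerly versus only after its alternative has matched.

```go
package main

import "fmt"

// matchEager models  a <- { code } 'a' / 'b'  with the action executed as
// soon as the parser reaches it. Hypothetical illustration, not peg output.
func matchEager(input string, code func()) bool {
	// First alternative: { code } 'a'
	code() // runs before we know whether 'a' actually follows
	if len(input) > 0 && input[0] == 'a' {
		return true
	}
	// Backtrack and try the second alternative: 'b'
	return len(input) > 0 && input[0] == 'b'
}

// matchDeferred models the same rule, but runs the action only once the
// alternative that contains it has matched.
func matchDeferred(input string, code func()) bool {
	if len(input) > 0 && input[0] == 'a' {
		code()
		return true
	}
	return len(input) > 0 && input[0] == 'b'
}

func main() {
	eager, deferred := 0, 0
	matchEager("b", func() { eager++ })
	matchDeferred("b", func() { deferred++ })
	// On input "b" the eager variant has already run the action once,
	// even though the alternative containing it did not match.
	fmt.Println(eager, deferred) // prints "1 0"
}
```

If the action has side effects, such as updating an indentation counter, the eager variant leaves them in place after the parser backtracks.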

pointlander commented 8 years ago

&{ ... } is a predicate; https://github.com/pointlander/peg/blob/1d0268dfff9bca9748dc9105a214ace2f5c594a8/peg.go#L1407 is the code generated for a predicate. Example: a <- &{ false } is a rule that will always fail. If you want to put code inline with peg, you could use !{ ... }.
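As a rough model of what a predicate does, here is a hand-written sketch. It is not the code peg generates (see the link above for that), and the `parser` type and its `predicate`, `literal`, and `ruleA` methods are hypothetical. It only models the behavior stated above: the expression runs while matching, and a rule guarded by a false predicate fails.

```go
package main

import (
	"fmt"
	"strings"
)

// parser is a hypothetical stand-in for a generated parser.
type parser struct {
	input string
	pos   int
}

// predicate models &{ expr }: user code runs during matching, consumes no
// input, and the surrounding sequence fails when the result is false.
func (p *parser) predicate(expr func() bool) bool {
	return expr()
}

// literal matches a fixed string at the current position.
func (p *parser) literal(s string) bool {
	if strings.HasPrefix(p.input[p.pos:], s) {
		p.pos += len(s)
		return true
	}
	return false
}

// ruleA models  a <- &{ ok } 'a'  and restores the position on failure.
func (p *parser) ruleA(ok func() bool) bool {
	start := p.pos
	if p.predicate(ok) && p.literal("a") {
		return true
	}
	p.pos = start
	return false
}

func main() {
	p := &parser{input: "a"}
	fmt.Println(p.ruleA(func() bool { return false })) // prints "false"
	fmt.Println(p.ruleA(func() bool { return true }))  // prints "true"
}
```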

wvxvw commented 8 years ago

Re go-yaml: it's a very problematic library. To put it simply, it doesn't work. It even fails to parse examples from the spec. It doesn't handle custom types. It has the wrong interface for parsing custom literals, etc. This means that not only would it be difficult to patch, but to make it work properly I'd need to make a change that is incompatible with the current interface.

Re &{ ... } and !{ ... }: oh, thanks, yes. That's what I was looking for. I'm well aware of the consequences of backtracking in this case. Conceptually, if you wanted to take this even further, there are DCGs in Prolog, which solve this problem by automatically grounding and creating fresh versions of terms in case the predicate succeeds or fails. But I don't need that kind of machinery in my case. All I need is to count whitespace characters, so I'll never have to backtrack (or at least it's possible to write a parser that never backtracks over indentation).
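The whitespace counting described above can be sketched as a single forward pass that never revisits a line. This is only an illustration under simplifying assumptions (spaces only, one entry per line); `indentOf` and `depths` are hypothetical helpers, not part of peg or of any YAML library.

```go
package main

import (
	"fmt"
	"strings"
)

// indentOf returns the number of leading spaces on a line.
func indentOf(line string) int {
	return len(line) - len(strings.TrimLeft(line, " "))
}

// depths assigns a nesting depth to each line by keeping a stack of the
// indentation widths seen so far. Each line is examined exactly once.
func depths(lines []string) []int {
	stack := []int{0}
	out := make([]int, 0, len(lines))
	for _, line := range lines {
		n := indentOf(line)
		// Dedent: pop every enclosing block that is indented deeper.
		for len(stack) > 1 && n < stack[len(stack)-1] {
			stack = stack[:len(stack)-1]
		}
		// Indent: a wider line opens a new block.
		if n > stack[len(stack)-1] {
			stack = append(stack, n)
		}
		out = append(out, len(stack)-1)
	}
	return out
}

func main() {
	doc := []string{"a:", "  b:", "    c: 1", "  d: 2", "e: 3"}
	fmt.Println(depths(doc)) // prints "[0 1 2 1 0]"
}
```

In a grammar, a comparison like the one inside `depths` is the kind of check that would sit in a predicate, so that a rule matches only when the current line's indentation fits the enclosing block.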

awalterschulze commented 8 years ago

@wvxvw I would really like to see your completed parser, since in the future I would like to create a parser that satisfies this interface: http://katydid.github.io/parser/addingparsers.html This way I can have a YAML validator language.

wvxvw commented 8 years ago

@awalterschulze That will take time :) YAML is a surprisingly difficult language to parse. Whatever its creators had in mind, it was definitely not simplicity or friendliness to parsing. But I intend to do this anyway.

awalterschulze commented 8 years ago

OK, I just skimmed the YAML spec. I think I will prefer your PEG spec :) I am patient.