mna / pigeon

Command pigeon generates parsers in Go from a PEG grammar.
BSD 3-Clause "New" or "Revised" License

Add a syntax to support embedding custom parser code during parsing (a proposal with demo code) #146

Open fy0 opened 2 months ago

fy0 commented 2 months ago

Sometimes I want to use a hand-written parser for some expressions, for example:

e <- '{' myExpr '}'

myExpr <- {
   // custom code that checks the brackets are balanced, accepting text like "{ {} }"
}
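As a rough illustration of what such hand-written code might do, here is a minimal standalone Go sketch of a balanced-brace scanner. `matchBalanced` is a hypothetical helper, not part of pigeon, and it works on a plain string rather than on pigeon's parser state.

```go
package main

import "fmt"

// matchBalanced is a hypothetical helper, not part of pigeon. It returns the
// number of bytes at the start of src that form text with balanced '{' and
// '}', stopping before the first unmatched '}'. ok is false if a '{' is
// still open when the input ends.
func matchBalanced(src string) (n int, ok bool) {
	depth := 0
	for i := 0; i < len(src); i++ {
		switch src[i] {
		case '{':
			depth++
		case '}':
			if depth == 0 {
				// This '}' belongs to the enclosing rule, so stop here.
				return i, true
			}
			depth--
		}
	}
	return len(src), depth == 0
}

func main() {
	// The text that follows the opening '{' of rule e for the input "{ {} }".
	n, ok := matchBalanced(" {} }")
	fmt.Println(n, ok) // prints: 4 true
}
```

The scanner stops at the final '}' without consuming it, so the enclosing rule can still match its own closing brace.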

For that, I made a fork. Thanks to pigeon's clear design, it was much easier than I expected.

I use the syntax "*{}" to mark a custom parser code block:

{
package main
}

e <- "begin" x "end" !.

myExpr <- '1' / '2' / '3'

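x <- *{
    // Remember where this rule started.
    start := p.pt

    // Consume exactly three runes by hand.
    arr := []rune{}
    for i := 0; i < 3; i++ {
        cur := p.pt.rn
        arr = append(arr, cur)
        p.read()
    }

    // Hand control back to the generated rule "myExpr".
    r, ok := p.parseExpr(p.rules["myExpr"].expr)

    // Record what was expected at the start position (used in error messages).
    p.failAt(true, start.position, "3 char + myExpr")
    return string(arr) + string(r.([]uint8)), ok, nil
}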

Test code:

package main

import "fmt"

func main() {
    fmt.Println(Parse("", []byte("beginxxx1end")))
}

> go run .
[[98 101 103 105 110] xxx1 [101 110 100] <nil>] <nil>

However, to control the parser manually, the "parser" object has to be exposed to the code block. I noticed that the parser object is deliberately hidden from all code blocks, so I'm worried that I have broken a design guideline.

Have a look and have a nice day.

breml commented 2 months ago

Hi @fy0 Thanks for your pull request. I would like to better understand in what kind of situations you want to control the parser manually. Can you elaborate and provide some real world examples?

fy0 commented 2 months ago

> Hi @fy0 Thanks for your pull request. I would like to better understand in what kind of situations you want to control the parser manually. Can you elaborate and provide some real world examples?

For example, I want to embed another parser inside my PEG parser.

{
package main
}

e <- "<%js" js_code "%>" !.

js_code <- *{
    // call a JS parser here: it consumes text of unknown length and returns an object
}

To do this with current pigeon, I would need to describe the whole JS syntax in PEG, or just match everything up to '%>' and hope that '%>' never appears inside the JS code.
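To make the problem with the '%>' shortcut concrete, here is a small standalone Go sketch. `quoteAwareEnd` is a hypothetical stand-in for a real JS parser: it only understands single- and double-quoted string literals, whereas a real parser would also have to handle comments, template literals, and regular expressions.

```go
package main

import (
	"fmt"
	"strings"
)

// naiveEnd returns the index of the first "%>" in src, or -1 if none.
func naiveEnd(src string) int {
	return strings.Index(src, "%>")
}

// quoteAwareEnd is a hypothetical stand-in for a real JS parser. It returns
// the index of the first "%>" that lies outside single- or double-quoted
// string literals, or -1 if there is none.
func quoteAwareEnd(src string) int {
	var quote byte // 0 while outside a string literal
	for i := 0; i < len(src); i++ {
		c := src[i]
		switch {
		case quote != 0:
			if c == '\\' {
				i++ // skip the escaped byte
			} else if c == quote {
				quote = 0 // the literal ends here
			}
		case c == '"' || c == '\'':
			quote = c
		case c == '%' && i+1 < len(src) && src[i+1] == '>':
			return i
		}
	}
	return -1
}

func main() {
	src := `print("100%>50"); %>`
	fmt.Println(naiveEnd(src), quoteAwareEnd(src)) // prints: 10 18
}
```

The naive search stops inside the string literal, which is exactly the case the grammar cannot rule out without knowing the embedded language.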

It is also useful for implementing metaprogramming language features. As another example, I want users to be able to register their own operators. pigeon's #{ syntax nearly works for this, but not well.
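A minimal sketch of the operator case, assuming a runtime registry that a custom code block could consult. The names `operators`, `registerOperator`, and `matchOperator` are hypothetical and not part of pigeon; the point is only that the set of operators is not known when the grammar is compiled.

```go
package main

import (
	"fmt"
	"strings"
)

// operators is a hypothetical runtime registry: operator text -> precedence.
var operators = map[string]int{"+": 1, "*": 2}

// registerOperator lets user code add an operator after the parser has been
// generated.
func registerOperator(op string, prec int) {
	operators[op] = prec
}

// matchOperator returns the longest registered operator that src starts
// with. ok is false if no registered operator matches.
func matchOperator(src string) (op string, ok bool) {
	for cand := range operators {
		if len(cand) > len(op) && strings.HasPrefix(src, cand) {
			op = cand
		}
	}
	return op, op != ""
}

func main() {
	registerOperator("**", 3)

	op, ok := matchOperator("** 2")
	fmt.Println(op, ok) // prints: ** true
}
```

Choosing the longest match keeps "**" from being read as two "*" operators once it has been registered.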