mna / pigeon

Command pigeon generates parsers in Go from a PEG grammar.

Matched text not accessible from state blocks #105

Open vincentbernat opened 1 year ago

vincentbernat commented 1 year ago

Hey!

When inside state blocks, c.text does not seem to be what's expected (often empty, sometimes something parsed previously). Would it be easy to "fix"?

Thanks.

breml commented 1 year ago

Hi @vincentbernat

Thanks for reaching out. Can you please provide more details, ideally a minimal example together with what you expect to see and what you actually see? Then we might be able to work from there.

vincentbernat commented 1 year ago

Here is a minimal reproducer:

{
package issue105
}

X ← Y { return c.state["here"].(string) + c.state["hello"].(string), nil }
Y ← "hello" #{ c.state["hello"] = string(c.text); c.state["here"] = "1:"; return nil }

package issue105

import (
    "testing"
)

func TestTextFromStateBlock(t *testing.T) {
    out, err := Parse("", []byte("hello"))
    if err != nil {
        t.Fatal(err)
    }
    if out != "1:hello" {
        t.Fatalf("expected %q, got %q", "1:hello", out)
    }
}

The test fails with:

 17:33 ❱ go test ./test/issue_105
--- FAIL: TestOptimizeGrammar (0.00s)
    issue_105_test.go:13: expected "1:hello", got "1:"
FAIL
FAIL    github.com/mna/pigeon/test/issue_105    0.001s
FAIL

I expect to be able to get hello as c.text in the state block.

breml commented 1 year ago

@vincentbernat Thanks for the minimal reproducer.

It has been a while since I last touched this code, so I needed to dig into it in a bit more detail.

As the state change blocks are currently designed, it is not possible to provide valid content for c.text. The reason is that state change blocks (in contrast to action code blocks) are not bound to a PEG expression. In that sense, state change code blocks work the same way as predicate code blocks. For predicate code blocks, the documentation says:

It is empty in a predicate code block.

The same applies to state change code blocks, but it looks like this limitation is not yet reflected in the documentation.

vincentbernat commented 1 year ago

If this is not trivial, I can live without this feature (it just makes my code a bit more repetitive). I don't have time myself to try to fix this.

Thanks!