Closed willfaught closed 8 years ago
Whoops, I missed it, my bad.
Hi Will,
No worries, and welcome to Gocc :)
Including an answer to this question from Marius (https://github.com/goccmack/gocc/issues/13#issuecomment-181827331):
gocc3 included two major changes:
- It supported an EBNF for input and
- It supported David Pager's "Practical General Method" for generating LR(1) parsers as an option to the Knuth LR(1) algorithm.
I am of the opinion that EBNF encourages grammars which are outside LR(1). Most EBNF grammars I wrote had lots of LR(1) conflicts. I would advise against adding EBNF to gocc again.
When I have the time again I would like to add Pager's PGM again because it generates smaller tables for full LR(1) grammars -- similar in size to LALR for the same grammar. I also think that the original paper is beautiful and that it is a pity that LALR became popular instead of this technique. The PGM parse table generator was working in gocc3 but not extensively tested.
I recalled gocc3 because of errors in the translation of EBNF to BNF for parser generation, which I didn't consider worth fixing because I peferred to revert to BNF input.
Cheers /u
@mewmew Thanks!
In the current version, I see lexer support for EBNF grouping (...)
, options [...]
, and repetition {...}
, but no exception/subtraction/difference a - b
(everything in a except that in b). Is there a way to do that with gocc? Apparently there's no way to do it just in terms of regular BNF like you can with those others listed above...?
I'm experimenting with the Haskell syntax, which uses exceptions.
I don't think there is a way to do difference. If I am correct, it is basically equivalent to a & !b, but gocc does not have AND.
Putting it in terms of & makes total sense, since CFLs aren't closed under intersection. Thanks.
Yes CFLs are not closed under intersection, but regular languages are. I might be wrong, but I think the lexer part of the BNF is using a regular language?
@awalterschulze I think being able to do something like _a: _b _c _b
makes it not regular, right? Productions in regular grammars are left/right linear, as in A -> xB
or A -> Bx
.
Looks like I was confused but partly right initially. There is sort-of-EBNF for lexers, but not for parsers. It would be convenient for parser grammars to at least have repetitions ({...}
), options ([...]
), and groupings ((...)
), which safely close over CFLs, to keep things simple and clean. An at-least-one (+
) repetition operator would be awesome too. Maybe <...>
... I understand the concern about lookahead tendencies, though, so I guess that's a tradeoff.
Yes trying to match brackets makes it "irregular". So I am wonder if the ! in the lexer makes it irregular, since the ! is not a pure implementation, but rather an implementation that is user friendly.
I think @goccmack tried to add an EBNF syntax and that would have included {} and [] etc, but he said that this makes it hard to write LR(1) grammars and the user tends to cause more conflicts. Also in my opinion this makes it harder to write SDT rules.
Any plans for EBNF support?