List entrypoints reverse parsed lists (Haskell/C)

BNFC / bnfc

BNF Converter

http://bnfc.digitalgrammars.com/

588 stars 165 forks source link

List entrypoints reverse parsed lists (Haskell/C) #163

Closed gapag closed 5 years ago

gapag commented 8 years ago

BNFC release 2.8, and up to the latest commit. The Java backend is sensitive to the order of the macros/productions. For instance the following gives errors, both using CUP and ANTLR4:

separator Exp ",";
List . Something ::=  [Exp];
A . Exp ::= "a";
B . Exp ::= "b";

Instead

List . Something ::=  [Exp];
separator Exp ",";
A . Exp ::= "a";
B . Exp ::= "b";

seems to work fine. I did not try all the other backends, but Haskell seems to work without problem with both formulations.

gdetrez commented 8 years ago

I think the problem is not so much the order of the rules but the entry points (wich, if unspecified, will be the first category). Many backends don't seem to support having a list as an entry point.

andreasabel commented 6 years ago

Maybe entry points could be handled in the general layer rather in the backends.

andreasabel commented 6 years ago

In the generated cup file I find:

start with [Exp];

ListExp ::= ...

It seems that the start non-terminals are not translated.

andreasabel commented 5 years ago

The botched entrypoint in the cup file is easily fixed. (Use identCat instead of show.) For the OCaml backend, just the generated Test program was faulty.

The Haskell and C backends print a parsed top-level list backwards. (Not an issue with Java and CPP.) The reason is that e.g. Haskell generates left-recursive Happy rules for lists, generating snoc-lists. These are reversed when plugged into other AST nodes, but not at the top-level. A more principled solution may be to have rules for snoc-lists and then rules for the corresponding cons-lists that just apply the reversal function.

andreasabel commented 5 years ago

Reversed list printing in Haskell with this test case:

terminator Exp "";
Lit. Exp ::= Integer ;

I could not reproduce the problem in the simpler setting:

terminator Integer "";

It seems that then the right-to-left recursion transformation does not kick in.

UPDATE: fixed for Haskell by removing the right-to-left recursion transformation, but still open for C.

andreasabel commented 5 years ago

That left recursion for LR parsing is strictly more performant than right recursion is not a categorical truth. It is true that left recursion uses O(1) stack and right recursion O(n). However, the AST/parsetree stored in the single cell in case of left recursion is of size O(n) where each stack entry in case of right recursion has size O(1). So it is O(n) no matter which one is uses.

Happy uses a heap-allocated parser stack and BNFC-generated parsers produce ASTs, thus, left recursion does not save anything in comparison to right recursion.

Bison has hard stack limits (10000 if not set otherwise), thus, left-recursion may be safer. For instance, the bnfc -c generated bison parser for input separator [Integer] ""; reports

error: 9999,0: memory exhausted at 9999

We might thus remove the optimization for the Haskell backend, yet keep it for C.

See also section "A few words on performance" of blog http://gallium.inria.fr/blog/lr-lists/

andreasabel commented 5 years ago

C++ gets this list reversed:

separator Integer "";

andreasabel commented 5 years ago

ANTLR seems to have a bug blocking this issue for the Java/ANTLR backend: https://github.com/antlr/antlr4/issues/2689 The .g4 parser specification produced by BNFC looks right.

andreasabel commented 4 years ago

Since #272 is fixed this issue should be resolved completely.