BNFC / bnfc

BNF Converter
http://bnfc.digitalgrammars.com/

Coercions parentheses #156

Open gapag opened 8 years ago

gapag commented 8 years ago

The current syntax of coercions is coercions <category> <levels>, e.g. coercions Exp 2.

This is translated into a cascade of <levels>+1 dummy-labeled productions, e.g.

_. Exp0 ::= Exp1 ;
_. Exp1 ::= Exp2 ;
_. Exp2 ::= "(" Exp0 ")" ;

The implicit introduction of the parentheses symbols might not be what the user wants. I propose an additional construct

coercions <category> <levels> <leftp> <rightp>

so that coercions Exp 2 "[" "]" becomes

_. Exp0 ::= Exp1 ;
_. Exp1 ::= Exp2 ;
_. Exp2 ::= "[" Exp0 "]" ;

gdetrez commented 8 years ago

This seems like a good idea. I wonder if we could go one step further in generalization and instead have a variant of coercions that skips the recursive production altogether. Something like

coercions Exp 2 norec;

which would then become only

_. Exp0 ::= Exp1 ;
_. Exp1 ::= Exp2 ;

and let the user define the parenthesized production (or not...). What do you think?
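
To compare the three variants discussed so far side by side, here is a minimal Python sketch of the expansion. The function name expand_coercions and its brackets parameter are hypothetical and this is not BNFC's implementation; it only reproduces the rule text quoted in this thread, with brackets=None standing in for the suggested norec keyword.

```python
from typing import List, Optional, Tuple


def expand_coercions(
    category: str,
    levels: int,
    brackets: Optional[Tuple[str, str]] = ("(", ")"),
) -> List[str]:
    """Return the rule text that a coercions declaration stands for.

    Hypothetical helper for illustration only, not part of BNFC.
    brackets=("(", ")") models the current macro, any other pair models
    the proposed <leftp> <rightp> form, and None models `norec`.
    """
    # One dummy-labeled rule per step up the precedence chain.
    rules = [f"_. {category}{i} ::= {category}{i + 1} ;" for i in range(levels)]
    if brackets is not None:
        left, right = brackets
        # Closing rule: a bracketed level-0 expression counts as the top level.
        rules.append(f'_. {category}{levels} ::= "{left}" {category}0 "{right}" ;')
    return rules


for variant in (("(", ")"), ("[", "]"), None):
    print(f"-- brackets={variant}")
    print("\n".join(expand_coercions("Exp", 2, variant)))
```

With brackets=None only the two chain rules are produced, so the user is free to add one, several, or no bracketed productions by hand.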

gapag commented 8 years ago

I think your solution is better than mine. If I am not mistaken, it would even let you define a grammar in which more than one pair of brackets promotes lower priorities to higher ones, e.g. I could write

_. Exp2 ::= "[" Exp0 "]" ;
_. Exp2 ::= "«" Exp0 "»" ;

I guess one could also do weird combinations between priority levels, if anyone ever finds a use for such a thing. (EDIT: this is already possible, but the pretty printer will choose round parentheses to print the concrete syntax anyway.)

The only thing that leaves me unsatisfied is that having to add the parenthesized production by hand feels like a jump from a higher level of abstraction (enabled by the current coercions macro) to a lower one. But this is probably because I got used to the current meaning of the macro, whose syntax in any case gives no hint that parentheses will be introduced. That has been indirectly pointed out at least twice in the past on the mailing list.

gapag commented 8 years ago

Just expanding on the edit in the comment above, and thinking aloud. With this grammar, where I manually expanded the coercions and forcibly set "[" and "]" as "promoting brackets",

Literal        . Exp  ::= "base" "+" Exp2 ;
LongLiteral    . Exp1 ::= "one" Integer ;
IntegerLiteral . Exp2 ::= "two" ;
_. Exp  ::= Exp1 ;
_. Exp1 ::= Exp2 ;
_. Exp2 ::= "[" Exp "]" ;

I got the following output for the input string base + [ one 3242]:

 $ java redefineInteger.Test
base + [ one 3242]

Parse Succesful!

[Abstract Syntax]

(Literal (LongLiteral 3242)) 

[Linearized Tree]

base + (one 3242)

Same output in Haskell.
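
One plausible reading of this result, sketched in Python: the printer decides where brackets are needed from precedence levels alone and always emits round parentheses, without consulting the coercion rule written in the grammar. The classes and the tokens/render helpers below are hypothetical and only model that behaviour for the grammar above; they are not BNFC's generated code.

```python
from dataclasses import dataclass
from typing import List, Union


@dataclass(frozen=True)
class Literal:          # Literal . Exp ::= "base" "+" Exp2
    arg: "Exp"


@dataclass(frozen=True)
class LongLiteral:      # LongLiteral . Exp1 ::= "one" Integer
    value: int


@dataclass(frozen=True)
class IntegerLiteral:   # IntegerLiteral . Exp2 ::= "two"
    pass


Exp = Union[Literal, LongLiteral, IntegerLiteral]


def tokens(e: Exp, required: int = 0) -> List[str]:
    """Linearize e in a context that demands at least level `required`."""
    if isinstance(e, Literal):
        own, body = 0, ["base", "+"] + tokens(e.arg, 2)
    elif isinstance(e, LongLiteral):
        own, body = 1, ["one", str(e.value)]
    else:
        own, body = 2, ["two"]
    # Assumption being modelled: the brackets are fixed to "(" and ")",
    # whatever the grammar's own bracket production says.
    return ["("] + body + [")"] if own < required else body


def render(toks: List[str]) -> str:
    """Join tokens with spaces, but keep parentheses tight."""
    return " ".join(toks).replace("( ", "(").replace(" )", ")")


print(render(tokens(Literal(LongLiteral(3242)))))  # prints: base + (one 3242)
```

Under this model the square brackets from the input cannot survive a round trip, because they are not stored in the tree and the printer never looks them up.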

gdetrez commented 8 years ago

Hum, that's a good observation. In the first case, where you have two different kinds of brackets used for grouping expressions, it's going to be impossible to distinguish the two during pretty printing. But it won't be possible to distinguish them in the AST either, so they have to be semantically equivalent.
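
The AST point can be checked with a small hand-written parser. The Python sketch below uses the manually expanded grammar from the earlier comment plus a second bracket pair; all class and function names are hypothetical, input is tokenized by whitespace only, and none of this is what BNFC generates. Because both bracket rules carry the dummy label, neither leaves a node in the tree.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Literal:
    arg: object


@dataclass(frozen=True)
class LongLiteral:
    value: int


@dataclass(frozen=True)
class IntegerLiteral:
    pass


# Two interchangeable pairs of "promoting brackets".
BRACKETS = {"[": "]", "«": "»"}


def parse_exp(toks: List[str]) -> Tuple[object, List[str]]:
    # Exp ::= "base" "+" Exp2 | Exp1
    if toks[:2] == ["base", "+"]:
        arg, rest = parse_exp2(toks[2:])
        return Literal(arg), rest
    return parse_exp1(toks)


def parse_exp1(toks: List[str]) -> Tuple[object, List[str]]:
    # Exp1 ::= "one" Integer | Exp2
    if len(toks) >= 2 and toks[0] == "one" and toks[1].isdigit():
        return LongLiteral(int(toks[1])), toks[2:]
    return parse_exp2(toks)


def parse_exp2(toks: List[str]) -> Tuple[object, List[str]]:
    # Exp2 ::= "two" | "[" Exp "]" | "«" Exp "»"
    if toks[:1] == ["two"]:
        return IntegerLiteral(), toks[1:]
    if toks and toks[0] in BRACKETS:
        inner, rest = parse_exp(toks[1:])
        if rest[:1] != [BRACKETS[toks[0]]]:
            raise SyntaxError("unbalanced brackets")
        # Dummy-labeled rule: return the inner tree, no wrapper node.
        return inner, rest[1:]
    raise SyntaxError(f"unexpected input: {toks}")


def parse(text: str) -> object:
    tree, rest = parse_exp(text.split())
    if rest:
        raise SyntaxError(f"trailing input: {rest}")
    return tree


square = parse("base + [ one 3242 ]")
guillemet = parse("base + « one 3242 »")
print(square == guillemet)  # prints: True
```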

In the last example (the grammar with "[" and "]" as promoting brackets), we should definitely try to do the right thing!

gapag commented 8 years ago

Oh, I was just now looking at the thread too :) I think I can try to do something on this in the next month.