MiniZinc / libminizinc

The MiniZinc compiler
http://www.minizinc.org

4.1.14 Full grammar is not accurate #861

Open matsc-at-sics-se opened 2 weeks ago

matsc-at-sics-se commented 2 weeks ago

Section 4.1.14 of the Handbook, Full grammar, is not accurate. It may have been in sync with the actual grammar once, but it certainly is not any more. Several nonterminals, e.g., annotation-item and base-ti-expr-tail, do not match what the toolchain accepts. That causes problems if you need to write your own parser.

I found a source file parser.yxx that appears to define the grammar that the toolchain actually uses. It would be great if the Handbook could be brought in sync with that, ideally automatically.

a1880 commented 2 weeks ago

I've used Bison 3.8.1 in the following script to extract a MiniZinc grammar:

set LIB=C:\CPPSRC\libminizinc-master\lib\
bison.exe --report=states --report-file=Parser_report.txt --output=Parser.cc %LIB%Parser.yxx

The resulting report file: Parser_report.txt
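
For anyone not on Windows, the same Bison invocation can be sketched in Python. This is a minimal sketch, not part of the libminizinc build: the function names and the example checkout path are hypothetical, and the grammar file name parser.yxx is assumed from the issue text (the script above spells it Parser.yxx, which only works on a case-insensitive file system). The three Bison options are the ones used in the script above.

```python
import subprocess
from pathlib import Path


def bison_report_command(lib_dir, bison="bison"):
    """Build the Bison command line used above, without running it.

    lib_dir is the path to the lib/ directory of a libminizinc checkout
    (hypothetical location; adjust to your own tree).
    """
    grammar = Path(lib_dir) / "parser.yxx"  # assumed file name, see lead-in
    return [
        bison,
        "--report=states",                  # include the LALR state descriptions
        "--report-file=Parser_report.txt",  # where the human-readable report goes
        "--output=Parser.cc",               # generated parser source
        str(grammar),
    ]


def run_bison_report(lib_dir, bison="bison"):
    """Run Bison; raises CalledProcessError if Bison exits with an error."""
    subprocess.run(bison_report_command(lib_dir, bison), check=True)
```

Calling run_bison_report("libminizinc-master/lib") should leave Parser_report.txt in the current directory, provided Bison is on the PATH.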

guidotack commented 2 weeks ago

The grammar in the handbook is hand-written in a simplified format, in order to have a cleaner document that's easier for humans to read. The Bison grammar in parser.yxx needs to use some ugly workarounds for certain aspects of the MiniZinc syntax in order to make it LALR(1). It would therefore be difficult to generate the grammar in the documentation automatically from the parser.

I have fixed the annotation-item non-terminal (which was missing the optional = and right-hand side). The base-ti-expr-tail appears to be equivalent to the parser from what I can see (it's structured differently, but I think it accepts the same language). Perhaps we'll have to do a manual audit at some point to make sure the parser and the grammar are indeed equivalent.

matsc-at-sics-se commented 2 weeks ago

> The base-ti-expr-tail appears to be equivalent to the parser from what I can see (it's structured differently, but I think it accepts the same language).

The declaration of bar below would not be accepted by the manual's version of base-ti-expr-tail:

set of int: foo = {1, 2, 3};
var set of foo union {4}: bar;

guidotack commented 2 weeks ago

Ah, you're right. And now I also remember why the full grammar is so complicated. We have to distinguish, syntactically, between expressions in types and all other kinds of expressions, and I think the only way is to duplicate the entire expression part of the grammar.
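
For readers writing their own parser, the idea of two expression sublanguages can be sketched with a small precedence-climbing parser in Python. This is a toy illustration under stated assumptions, not MiniZinc's grammar: the operator tables, their precedences, and the rule that type positions allow only set and arithmetic operators are all hypothetical. The point it shows is that a hand-written parser can pass the allowed operator table as a parameter, whereas a grammar written as BNF or for an LALR(1) generator would typically need a second copy of the expression rules.

```python
import re

# Hypothetical operator tables (higher number binds tighter); NOT MiniZinc's real precedences.
EXPR_OPS = {"\\/": 1, "/\\": 2, "<": 3, "=": 3, "union": 4, "..": 5, "+": 6, "*": 7}
# Assumption for illustration: a type position allows only set/arithmetic operators.
TYPE_OPS = {op: p for op, p in EXPR_OPS.items() if op in {"union", "..", "+", "*"}}

TOKEN = re.compile(r"\s*(\d+|[A-Za-z_]\w*|\.\.|\\/|/\\|[{}(),:;<=+*])")


def tokenize(text):
    tokens, pos, text = [], 0, text.rstrip()
    while pos < len(text):
        m = TOKEN.match(text, pos)
        if m is None:
            raise SyntaxError(f"unexpected character at offset {pos}")
        tokens.append(m.group(1))
        pos = m.end()
    return tokens


def parse_atom(tokens, pos):
    if pos >= len(tokens):
        raise SyntaxError("unexpected end of input")
    tok = tokens[pos]
    if tok == "{":
        # Inside a set literal the full expression language applies again.
        items, pos = [], pos + 1
        while pos < len(tokens) and tokens[pos] != "}":
            item, pos = parse_expr(tokens, pos, EXPR_OPS)
            items.append(item)
            if pos < len(tokens) and tokens[pos] == ",":
                pos += 1
            elif pos >= len(tokens) or tokens[pos] != "}":
                raise SyntaxError("expected ',' or '}' in set literal")
        if pos >= len(tokens):
            raise SyntaxError("unterminated set literal")
        return ("set", items), pos + 1
    if tok.isdigit() or (tok.isidentifier() and tok not in EXPR_OPS):
        return tok, pos + 1
    raise SyntaxError(f"unexpected token {tok!r}")


def parse_expr(tokens, pos, ops, min_prec=1):
    """Precedence climbing; `ops` decides which binary operators are legal here."""
    left, pos = parse_atom(tokens, pos)
    while pos < len(tokens) and tokens[pos] in ops and ops[tokens[pos]] >= min_prec:
        op = tokens[pos]
        right, pos = parse_expr(tokens, pos + 1, ops, ops[op] + 1)
        left = (op, left, right)
    return left, pos


def parse(text, ops):
    """Parse all of `text` using operator table `ops`, or raise SyntaxError."""
    tokens = tokenize(text)
    tree, pos = parse_expr(tokens, 0, ops)
    if pos != len(tokens):
        raise SyntaxError(f"token {tokens[pos]!r} is not allowed here")
    return tree
```

With these tables, parse("foo union {4}", TYPE_OPS) succeeds, while parse("a < b", TYPE_OPS) raises SyntaxError even though parse("a < b", EXPR_OPS) is accepted. Which operators MiniZinc really permits in a type position has to be read off parser.yxx.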