Let-bindings in smtlib and equality

Gbury commented 6 years ago

While writing doc to resolve #1 , I happened upon an inconsistency between the smtlib and tptp parser.

In tptp, the syntax allows quantified equalities or equivalences in tff (see https://github.com/Gbury/dolmen/blob/master/src/languages/tptp/parseTptp.mly#L314) , and even arbitrary quantified formula in thf (see https://github.com/Gbury/dolmen/blob/master/src/languages/tptp/parseTptp.mly#L168). So typically, variable bindings in a let will have the form: App(Builtin(Eq), [Symbol x; bound_term]), or App(Builtin(Eq), [Colon(Symbol x, some_type); boud_term])
Smtlib is more restrictive and directly gives a list of pair of variables and terms. Currently this is parsed as Colon(Symbol x, bound_term).

It would be more uniform to use equalities for smtlib bindings, however this brings another problem: equality is not strictly part of the smtlib language (as of smtlib v2.5 at least), i.e. it --as well as usual logical connectives --

are just function symbols of the basic theory Core, implicitly included in all SMT-LIB theories

to quote the smtlib v2.5 manual. What this means is that an equality (= a b) will actually be parsed as: App(Symbol "=", [a; b]) rather than use the builtin equality symbol.

So, parsing bound variables as equalities (rather than colons) in smtlib would: 1) add a requirement to its term interface for an equality building function, 2) introduce in parsed AST from smtlib files, two distinct representation of equalities. Any thought ? cc @c-cube

Gbury commented 6 years ago

Another solution would be to parse the bound term equalities the same way as other arbitrary equalities, but I don't know whether it really is friendlier.

c-cube commented 6 years ago

you don't have to care about whether stuff is builtin or not, just parse equalities into your representation of equalities, not symbol applications.
let shouldn't take a list of terms anyway, but a list of (variable,term) or some similar notion of binding.

Gbury commented 6 years ago

I'd much prefer avoid introducing corner cases (such as specifically parsing the "=" string differently in smtlib) when they are not in the specification of a language syntax. This makes the parser much more clean, in sync with the official syntax definition, avoid introducing conflicts, and thus makes the parsers a lot easier to upgrade when new versions are released. This is why equality is not a builtin in the smtlib parser. Parsers should be an almost exact reflection of a language's official BNF syntax (which is why the .bnf file is included in the tptp language directory).

What happens with tptp's lets (which syntaxically accept any quantified formulas) in your example using a list of (variable,term) ? Also how do you specify the type of the variables (which is desirable in some cases, and should be faitfhully translated in the AST) ? You'd at least need something like a (variable, term option, term) list, plus the location of the variable so you get a (variable, location, term option, term) list) and at that point, I feel that it's not that much more complicated to simply have an arbitrary term, which already can effectively represent bindings, and differentiate bindings on terms and propositions (as in tptp), or allow more expressive lets (again, as in the tptp syntax).

Gbury commented 6 years ago

Admittedly most of my second point comes from my personal opinion to have a single AST represent all input languages, but that's why parsers are functors: so that everyone feel free to use her/his own AST.

In any case, I agree I need to change the smtlib parser interface, for instance so that the let building functions takes a list of tuples variable * term (but then the location of the variable is lost).

c-cube commented 6 years ago

Then there's no way in practice to have a single AST for multiple syntaxes, without reimplementing a significant part of the parser (handling of let bindings, say) after the fact because you have to guess the shape of the produced terms depending on the input syntax. I know you're locked with a single AST now, but that reinforces my opinion that there should be one AST per input format, precisely locked to its semantics…

Gbury commented 4 years ago

I'd say this is mostly resolved by the new typechecker. If anyone has a problem with this, don't hesitate to re-open the issue.

Gbury / dolmen

Let-bindings in smtlib and equality #2