Open bobot opened 1 month ago
That's actually a known limitation of the SMT-LIB specification of well-foundedness for recursive datatypes, which I'd argue dolmen currently implements faithfully.
Here is the relevant part of the specification:
The datatype δ must be well founded in the following inductive sense: it must have a constructor of rank τ1 · · · τmδ such that τ1 · · · τm does not contain any of the datatypes from {δ1, . . . , δn} or, if it does contain some, they are well founded
This specification is relatively unclear about what happens in polymorphic situations (which is not that much of a surprise considering the recent and partial appearance of polymorphism in SMT-LIB2), where the notion of "does something contain a type" is not precise enough; currently it is interpreted by dolmen as any occurrence of the type, even as parameter of some other polymorphic type (e.g. list
). Clearly, this is a point that need to be clarified (all the more for SMT-LIB v3).
One problem here is that to check well-foundedness of such a declaration, one would have to expand the definition of list (because if one replaces list by another type, e.g. type 'a pair = 'a * 'a
, then the declaration is not well-founded anymore), and that kind of expansion might be in general costly to perform (and annoying to implement but that's another story). That being said, even with a manual expansion of the list type inside the datatype declaration (which I tested), the declaration is rejected by dolmen
, so there definitely can be some improvements in dolmen here, at least so that the typechecker is able to check that this is well-founded (for at least one reasonable definition of well-founded). Separately, the criterion used by the SMT-LIB should be clarified, and i'll create an issue on the SMT-LIb repo to raise the question.
Besides, we got the same issue while working on tests from Décysif project. I opened an issue on Why3 repository: https://gitlab.inria.fr/why3/why3/-/issues/860
With .psmt2
, dolmen is already going beyond .smt2
. So I propose to accept the same algebraic definitions as Why3 in this format. It is a conservative extension. At least Colibri2 would be able to work with such types, I believe Alt-Ergo too. Otherwise Why3 will have to use encodings for those types.
Do you have a description of the exact criterion that is used by why3 so that I can try and implement it ?
@bobot I posted the issue on the SMT-LIB repo. I think that one thing that could help move things a lot would be to have a description of a well-founded criterion that could replace/improve the current one.
I agree. I think the checked done by why3 are in this two places:
As discussed previously, I believe a (the?) good criterion is to extend the well-founded criterion to also include non-mutually-recursive parametric datatypes, because the well-foundedness of a parametric datatype can in general depend on the well-foundedness of its arguments. I think this is more or less what Why3 does — note the absence of a check for the mutually recursive definition here.
The following "counter-example" must be rejected:
(declare-datatypes ((Either 2)) ((par (A B) (
(inl (prl A))
(inr (prr B))
))))
(declare-datatypes ((Bogus 0)) ((
(mkBogus (bogus (Either Bogus Bogus)))
)))
Sorry, I got confused — I was thinking of this part of the SMT-LIB criterion:
The datatype δ must be well founded in the following inductive sense: it must have a constructor of rank τ1 · · · τmδ such that τ1 · · · τm does not contain any of the datatypes from {δ1, . . . , δn} or, if it does contain some, they are well founded
But this is not actually the one that causes the issue here — rather it is this one, which I believe is also the one that deals with positivity: "τj is a sort term that contains no occurrences of δ1, . . . , δn below its top symbol.". The following definition is forbidden by that criterion (interestingly it also makes Z3 segfault):
(declare-datatypes ((list 1) (tree 0)) ((par (alpha) (
(nil)
(cons (head alpha) (tail (list alpha)))
))
(
(treeqtmk (elem Int) (children (list tree)) (rank Int) )
)))
"τj is a sort term that contains no occurrences of δ1, . . . , δn below its top symbol."
Oh, I missed that part. For what it's worth, this specific criterion is not enforced by dolmen at the moment, and if polymorphism is to be fully supported by the SMT-LIB, this is - in my opinion - an extremely limiting criterion that should be removed.
@bclement-ocp , @bobot do not hesitate to also comment on the issue that I opened on the SMT-LIB repo : https://github.com/SMT-LIB/SMT-LIB-2/issues/32
Oh, I missed that part.
Then I'm confused. The above example (with mutually recursive list
and tree
) is rejected by Dolmen, and if it's not by that criterion, I don't understand why it is rejected.
Then I'm confused. The above example (with mutually recursive
list
andtree
) is rejected by Dolmen, and if it's not by that criterion, I don't understand why it is rejected.
Because of the criterion I quoted at the beginning of the thread:
The datatype δ must be well founded in the following inductive sense: it must have a constructor of rank τ1 · · · τmδ such that τ1 · · · τm does not contain any of the datatypes from {δ1, . . . , δn} or, if it does contain some, they are well founded
Since tree
only has a single constructor, the sort/type/rank of that constructor must respect the criterion above, i.e. that if it does contain some recursive datatypes, then they are well-formed. In the constructor of tree
, there are two occurrences of datatypes: one occurrence of list
, and one occurrence of tree
. The occurrence of list
is okay since it is well-formed (the nil
constructor respects the criterion), however, the occurrence of tree
does not respect that criterion, since we have not yet determined that tree
is well-formed (we are checking its only constructor precisely to check whether it is well-formed, and once list
is determined to be well-formed, the only thing left to do is check that tree
is well-formed).
As I mentioned in the issue on the SMT-LIb repo, this may be due to how one interprets the notion of "contain" in the criterion. Currently, dolmen interprets "contain" as any occurrence (even under type constructors).
Ah, yes. list
is not the issue, tree
is — got it. Thanks for the explanation.
From a file generated from an mlw by why3, the fact that list is well-founded is not taken into account: