Closed atomb closed 1 year ago
Is there some reason it should be expected that these axioms are incompatible with MBQI?
I read the explanation by @RustanLeino. It sounds like the root cause is an inconsistency bug in the type encoding implemented in Boogie. The inconsistency remained hidden because of the incompleteness of quantifier reasoning in typical SMT solvers. Turning MBQI on reduces incompleteness in some examples and revealed the inconsistency.
And is that why it's disabled by default?
No. Z3 has many users. It's world does not revolve around Boogie and Dafny, as no doubt @NikolajBjorner will remind you. I suspect that the reason MBQI is turned off by default is that it is expensive.
The comment from @RustanLeino here suggests how to fix the issue.
Great! I am glad a fix has been discovered. I encourage the Dafny community to put up a PR so we can land the fix pronto. I also encourage the inventors of Boogie type encoding to think about whether other similar bugs are lurking in there. An old-fashioned method is a proof of soundness.
Oh, I meant that Boogie disables it by default. Z3 doesn't.
I don't remember any significant discussion about whether it should be on or off by default. I have found MBQI to be expensive without yielding much in completeness for my Boogie applications. So I have not resisted turning it off by default in Boogie.
In general though, I have tried to move Boogie away from heavy customization of command-line options for SMT solvers, instead letting Boogie users drive such customization.
The comment from @RustanLeino here suggests how to fix the issue.
Great! I am glad a fix has been discovered. I encourage the Dafny community to put up a PR so we can land the fix pronto. I also encourage the inventors of Boogie type encoding to think about whether other similar bugs are lurking in there. An old-fashioned method is a proof of soundness.
I'm happy to take a stab at implementing @RustanLeino's suggested fix, and may be able to do some deeper soundness investigations a bit later.
Regarding Z3's default, here's my understanding: By default, Z3's auto-config
mode in on. In the auto-config mode, Z3 scrutinizes its input and determines what quantifier mode to use. In some cases, that mode may end up being MBQI. If you manually turn auto-config
off, then Z3 defaults to MBQI. So, if you really don't want MBQI, you have to turn off both auto-config
and smt.mbqi
. Boogie does that.
I also encourage the inventors of Boogie type encoding to think about whether other similar bugs are lurking in there. An old-fashioned method is a proof of soundness.
Hehe. Of the two things I mentioned here, the missing parameter(s) of the type constructor seems like a coding bug. (I didn't study the Boogie code just now.) @pruemmer and my paper has all the parameters of Ctor
(see section 2.0 of https://leino.science/papers/krml199.pdf).
The missing antecedent is a logic bug. Looking at the "Function axiom" paragraph of Section 3.0 of the paper, the bug seems to be that an \alpha
could occur more than once among s_1, ..., s_n
. Formula (0) fails to capture any correspondence between multiple occurrences of such an \alpha
, which is the problem here. That is, the TypeMap2Store
function should have signature
function TypeMap2Store<A, B>(m: [A]B, a: A, b: B): [A]B;
(Here, I declare TypeMap2Store
using Boogie syntax, even though this particular function is one that is built-in in Boogie. And, for clarify, I wrote A
and B
for instead of \alpha_1
and \alpha_2
.)
Ehm, z3 may not revolve around Boogie, but my world certainly revolves around Stan U R.
s that are inconsistent when MBQI (which is disabled by default) is enabled in the solve
oh that one. We hit this when trying Vampire many years ago.
An old-fashioned method is a proof of soundness.
There is another type encoding method that is triggered by -typeEncoding:arguments. Is it possible that this bug is lurking there also?
If -typeEncoding:arguments is not used by anybody, should be consider deleting it?
I have not been able to replicate this issue with -typeEncoding:arguments
, and in fact I've been considering making that encoding the default for Dafny in the near term because it seems to consistently perform better (except when using first-order provers like Vampire, of course, for which we'll need p
or m
). In the longer term, Dafny might choose between a
and m
, depending on the structure of the program.
The bottom line is that I think all the encoding schemes are worth keeping for now, unless we can be sure that we can translate any Boogie program to a monomorphic encoding.
As a side remark, our CAV 2021 paper (https://link.springer.com/chapter/10.1007/978-3-030-81688-9_33) generates proofs for a subset of polymorphic Boogie where -typeEncoding:predicates
is used (Viper also uses -typeEncoding:predicates
). The subset does not cover maps, but for the basic subset including type quantification and polymorphic functions the paper provides confidence that -typeEncoding:predicates
is sound (note that the paper does not give a once-and-for-all proof but generates proofs for successfuls runs of Boogie for the concrete input programs). One could use this work as a starting point to increase confidence of the map encoding.
Regarding typeEncoding:arguments
: I don't know the encoding in detail, but if I remember correctly, the encoding relies quite heavily on triggers. I'm not sure if the soundness also relies on triggers. If the soundness relies on triggers, then proving soundness of -typeEncoding:argument
seems hard, because the Boogie semantics would have to depend on triggers and as far as I know there has not been much work on such a semantics (in our paper, triggers do not affect the semantics).
Just playing around a bit with typeEncoding:arguments
I found one example, where typeEncoding:arguments
seems to have a bug (without using maps):
type Foo _;
type Bar _;
function elem<T>() : Foo T;
procedure p() {
var y1: Bar int, y2: Bar int;
//every type Foo T consists of exactly 1 element
assume (forall <T> x: Foo T :: (x == elem()));
//Bar int consists of at least two elements
assume y1 != y2;
/* we should not be able to prove false (there exists a universe that satisfies both assumptions) */
assert false;
}
The type quantifier in the SMTLIB2 file is given by: forall ((x@@8 T@U) (T T@T) ) (! (= x@@8 (elem T)
, which is essentially saying that every type has only one element (it ignores the Foo
type constructor). As a result, in theory the assert false
at the end could be proved, because there is a universe where Bar int
has at least two inhabitants (in practice Boogie cannot prove it). I am not sure if this is a bug or if I am misinterpreting something.
One relevant snippet from the original paper:
It has to be noted that this second translation [the argument-based translation] crucially depends on the usage of an SMT solver with e-matching: such solvers are not able to exploit missing type guards, because typing information is inserted in expressions in such a way that triggers can only match on expressions of the right type. The translation trades generality for performance: while it is not applicable with most first-order theorem provers (e.g., superposition provers), the experimental evaluation in Section 4 shows a clear performance gain compared to the type guard translation from the previous section.
To me this sounds like a completeness question rather than a soundness one.
Is there a model theory of a logic with triggers?
Is there a model theory of a logic with triggers?
I have not explored the formal details of triggers much, but there is the paper "Reasoning with triggers (Claire Dross, Sylvain Conchon, Johannes Kanig, and Andrei Paskevich; SMT@IJCAR 2012)", which defines a semantics that takes triggers into account.
@atomb : Can I close this issue?
Oh, yes. I meant to do it in the process of merging that PR. It includes only one of the two changes @RustanLeino suggested, but I think that's enough to fix the inconsistency.
@atomb : If you intend to implement the other change and wish to track its completion using an issue, you are welcome to reopen this issue or create another issue.
I think a different issue makes sense. I'll create one.
While investigating this Dafny issue, I discovered that Boogie seems to be generating map axioms when using the predicate-based encoding for polymorphic types that are inconsistent when MBQI (which is disabled by default) is enabled in the solver. Take the following Boogie program, distilled from the original Dafny example.
When running with the various type encoding flags, and MBQI enabled, I get the following.
Is there some reason it should be expected that these axioms are incompatible with MBQI? And is that why it's disabled by defalut?
Running various solvers on the generated SMT-Lib, several versions of Z3, CVC5, and Vampire all report
unsat
. This happens when enabling MBQI in Z3 or CVC5, and when using Vampire with its default flags. Using Z3 or CVC5 with MBQI disabled results inunknown
.