boogie-org / boogie

Boogie
http://research.microsoft.com/en-us/projects/boogie/
MIT License
504 stars 110 forks source link

Inconsistency in predicate-based map axioms when using MBQI #735

Closed atomb closed 1 year ago

atomb commented 1 year ago

While investigating this Dafny issue, I discovered that Boogie seems to be generating map axioms when using the predicate-based encoding for polymorphic types that are inconsistent when MBQI (which is disabled by default) is enabled in the solver. Take the following Boogie program, distilled from the original Dafny example.

type ref;
type Field _;
type Heap = [ref]<alpha>[Field alpha]alpha;
function $IsGoodHeap(Heap) : bool;
function $IsHeapAnchor(Heap) : bool;
var $Heap: Heap where $IsGoodHeap($Heap) && $IsHeapAnchor($Heap);

procedure P()
{
    assert false;
}

When running with the various type encoding flags, and MBQI enabled, I get the following.

$ boogie issue-4053-pruned-prefix.bpl /proverOpt:O:smt.mbqi=true /typeEncoding:p

Boogie program verifier finished with 1 verified, 0 errors
$ boogie issue-4053-pruned-prefix.bpl /proverOpt:O:smt.mbqi=true /typeEncoding:a
issue-4053-pruned-prefix.bpl(10,5): Error: this assertion could not be proved
Execution trace:
    issue-4053-pruned-prefix.bpl(10,5): anon0

Boogie program verifier finished with 0 verified, 1 error
$ boogie issue-4053-pruned-prefix.bpl /proverOpt:O:smt.mbqi=true /typeEncoding:m
issue-4053-pruned-prefix.bpl(10,5): Error: this assertion could not be proved
Execution trace:
    issue-4053-pruned-prefix.bpl(10,5): anon0

Boogie program verifier finished with 0 verified, 1 error
$

Is there some reason it should be expected that these axioms are incompatible with MBQI? And is that why it's disabled by defalut?

Running various solvers on the generated SMT-Lib, several versions of Z3, CVC5, and Vampire all report unsat. This happens when enabling MBQI in Z3 or CVC5, and when using Vampire with its default flags. Using Z3 or CVC5 with MBQI disabled results in unknown.

atomb commented 1 year ago

The comment from @RustanLeino here suggests how to fix the issue.

shazqadeer commented 1 year ago

Is there some reason it should be expected that these axioms are incompatible with MBQI?

I read the explanation by @RustanLeino. It sounds like the root cause is an inconsistency bug in the type encoding implemented in Boogie. The inconsistency remained hidden because of the incompleteness of quantifier reasoning in typical SMT solvers. Turning MBQI on reduces incompleteness in some examples and revealed the inconsistency.

And is that why it's disabled by default?

No. Z3 has many users. It's world does not revolve around Boogie and Dafny, as no doubt @NikolajBjorner will remind you. I suspect that the reason MBQI is turned off by default is that it is expensive.

shazqadeer commented 1 year ago

The comment from @RustanLeino here suggests how to fix the issue.

Great! I am glad a fix has been discovered. I encourage the Dafny community to put up a PR so we can land the fix pronto. I also encourage the inventors of Boogie type encoding to think about whether other similar bugs are lurking in there. An old-fashioned method is a proof of soundness.

atomb commented 1 year ago

Oh, I meant that Boogie disables it by default. Z3 doesn't.

I don't remember any significant discussion about whether it should be on or off by default. I have found MBQI to be expensive without yielding much in completeness for my Boogie applications. So I have not resisted turning it off by default in Boogie.

In general though, I have tried to move Boogie away from heavy customization of command-line options for SMT solvers, instead letting Boogie users drive such customization.

atomb commented 1 year ago

The comment from @RustanLeino here suggests how to fix the issue.

Great! I am glad a fix has been discovered. I encourage the Dafny community to put up a PR so we can land the fix pronto. I also encourage the inventors of Boogie type encoding to think about whether other similar bugs are lurking in there. An old-fashioned method is a proof of soundness.

I'm happy to take a stab at implementing @RustanLeino's suggested fix, and may be able to do some deeper soundness investigations a bit later.

RustanLeino commented 1 year ago

Regarding Z3's default, here's my understanding: By default, Z3's auto-config mode in on. In the auto-config mode, Z3 scrutinizes its input and determines what quantifier mode to use. In some cases, that mode may end up being MBQI. If you manually turn auto-config off, then Z3 defaults to MBQI. So, if you really don't want MBQI, you have to turn off both auto-config and smt.mbqi. Boogie does that.

RustanLeino commented 1 year ago

I also encourage the inventors of Boogie type encoding to think about whether other similar bugs are lurking in there. An old-fashioned method is a proof of soundness.

Hehe. Of the two things I mentioned here, the missing parameter(s) of the type constructor seems like a coding bug. (I didn't study the Boogie code just now.) @pruemmer and my paper has all the parameters of Ctor (see section 2.0 of https://leino.science/papers/krml199.pdf).

The missing antecedent is a logic bug. Looking at the "Function axiom" paragraph of Section 3.0 of the paper, the bug seems to be that an \alpha could occur more than once among s_1, ..., s_n. Formula (0) fails to capture any correspondence between multiple occurrences of such an \alpha, which is the problem here. That is, the TypeMap2Store function should have signature

function TypeMap2Store<A, B>(m: [A]B, a: A, b: B): [A]B;

(Here, I declare TypeMap2Store using Boogie syntax, even though this particular function is one that is built-in in Boogie. And, for clarify, I wrote A and B for instead of \alpha_1 and \alpha_2.)

NikolajBjorner commented 1 year ago

Ehm, z3 may not revolve around Boogie, but my world certainly revolves around Stan U R.

NikolajBjorner commented 1 year ago

s that are inconsistent when MBQI (which is disabled by default) is enabled in the solve

oh that one. We hit this when trying Vampire many years ago.

NikolajBjorner commented 1 year ago

An old-fashioned method is a proof of soundness.

you mean?

shazqadeer commented 1 year ago

There is another type encoding method that is triggered by -typeEncoding:arguments. Is it possible that this bug is lurking there also?

If -typeEncoding:arguments is not used by anybody, should be consider deleting it?

atomb commented 1 year ago

I have not been able to replicate this issue with -typeEncoding:arguments, and in fact I've been considering making that encoding the default for Dafny in the near term because it seems to consistently perform better (except when using first-order provers like Vampire, of course, for which we'll need p or m). In the longer term, Dafny might choose between a and m, depending on the structure of the program.

The bottom line is that I think all the encoding schemes are worth keeping for now, unless we can be sure that we can translate any Boogie program to a monomorphic encoding.

gauravpartha commented 1 year ago

As a side remark, our CAV 2021 paper (https://link.springer.com/chapter/10.1007/978-3-030-81688-9_33) generates proofs for a subset of polymorphic Boogie where -typeEncoding:predicates is used (Viper also uses -typeEncoding:predicates). The subset does not cover maps, but for the basic subset including type quantification and polymorphic functions the paper provides confidence that -typeEncoding:predicates is sound (note that the paper does not give a once-and-for-all proof but generates proofs for successfuls runs of Boogie for the concrete input programs). One could use this work as a starting point to increase confidence of the map encoding.

Regarding typeEncoding:arguments: I don't know the encoding in detail, but if I remember correctly, the encoding relies quite heavily on triggers. I'm not sure if the soundness also relies on triggers. If the soundness relies on triggers, then proving soundness of -typeEncoding:argument seems hard, because the Boogie semantics would have to depend on triggers and as far as I know there has not been much work on such a semantics (in our paper, triggers do not affect the semantics).

Just playing around a bit with typeEncoding:arguments I found one example, where typeEncoding:arguments seems to have a bug (without using maps):

type Foo _;
type Bar _;

function elem<T>() : Foo T;

procedure p() {
    var y1: Bar int, y2: Bar int;

    //every type Foo T consists of exactly 1 element
    assume (forall <T> x: Foo T :: (x == elem()));

    //Bar int consists of at least two elements
    assume y1 != y2;

    /* we should not be able to prove false (there exists a universe that satisfies both assumptions) */
    assert false; 
}

The type quantifier in the SMTLIB2 file is given by: forall ((x@@8 T@U) (T T@T) ) (! (= x@@8 (elem T), which is essentially saying that every type has only one element (it ignores the Foo type constructor). As a result, in theory the assert false at the end could be proved, because there is a universe where Bar int has at least two inhabitants (in practice Boogie cannot prove it). I am not sure if this is a bug or if I am misinterpreting something.

atomb commented 1 year ago

One relevant snippet from the original paper:

It has to be noted that this second translation [the argument-based translation] crucially depends on the usage of an SMT solver with e-matching: such solvers are not able to exploit missing type guards, because typing information is inserted in expressions in such a way that triggers can only match on expressions of the right type. The translation trades generality for performance: while it is not applicable with most first-order theorem provers (e.g., superposition provers), the experimental evaluation in Section 4 shows a clear performance gain compared to the type guard translation from the previous section.

To me this sounds like a completeness question rather than a soundness one.

NikolajBjorner commented 1 year ago

Is there a model theory of a logic with triggers?

gauravpartha commented 1 year ago

Is there a model theory of a logic with triggers?

I have not explored the formal details of triggers much, but there is the paper "Reasoning with triggers (Claire Dross, Sylvain Conchon, Johannes Kanig, and Andrei Paskevich; SMT@IJCAR 2012)", which defines a semantics that takes triggers into account.

shazqadeer commented 1 year ago

@atomb : Can I close this issue?

atomb commented 1 year ago

Oh, yes. I meant to do it in the process of merging that PR. It includes only one of the two changes @RustanLeino suggested, but I think that's enough to fix the inconsistency.

shazqadeer commented 1 year ago

@atomb : If you intend to implement the other change and wish to track its completion using an issue, you are welcome to reopen this issue or create another issue.

atomb commented 1 year ago

I think a different issue makes sense. I'll create one.