Allocation soundness issue

Dafny version

latest-nightly

Code to produce this issue

method Ooops()
  ensures false
{
  var x := new O();
  assert AllocationSoundness(x);
}

class O {
  var x: int
  constructor() {}
}

twostate predicate AllocationSoundness(o: O)
  ensures old(allocated(o))
{ true }

Command to run and resulting output

Paste in VSCode, this is verified.

What happened?

Twostate predicates can refer to non-allocated state. However, they assume everything is allocated, so we can extract this assumption and prove that objects in their old state were not allocated. Even if we did not have this assumption in twostate predicate, they could call predicates that would in turn assume allocatedness. Or lemmas that could assume allocatedness.

Really it all boils down to alloc being a field and assuming it in ghost contexts when the ghost context could actually refer to an object before it was allocated (through old@()), which is not possible for compiled contexts.

My advice is that that every time we call a twostate function or twostate lemma in a method, we need to check allocated() for all their arguments in the pre-state. In the example above, the proof obligation

assert old(allocated(x));

would have caught the issue upfront.

In the long-term, we might want to have twostate functions, ghost functions and lemmas that would not assume allocatedness in any way; however, they become a new color of functions/lemmas as they should not be able to call other ghost functions or lemmas that assume allocatedness without proving the allocatedness of all their arguments. Still, they would make it possible to implement the following currently impossible program in Dafny

class Context {
  var i: int
  method DoIt()
    modifies this
  method Monitor(test: Context ~> bool)
     requires test.requires(this) && test.reads(this) == {this}
     decreases *
     modifies this
     ensures test() // Currently error: function precondition could not be proved
  {
    DoIt();
    if !test(this) {
      Monitor(test);  // Currently error: function precondition could not be proved
      return;
    }
  }
}

function DoTest(c: Context) reads c requires true {
  c.i == 0
}

method Test(c: Context) {
  c.DoItAndCheck(DoTest);
}

To solve the above issue, we could introduce quantification over any heap:

     requires forallheap forall c: Context | allocated(c) :: test.requires(c) && test.reads(c) == {c}
     // forallheap can quantify over all heaps
     // which makes it possible to quantify over possibly yet unallocated references
     requires forall

Note that in the above case, if we removed the argument for the test, we would still be able to write this as this is always allocated at the beginning of methods

     requires forallheap test.requires() && test.reads() == {this}

and that would make our test to work

What type of operating system are you experiencing the problem on?

Windows

I think this issue could be solved with two things:

(minimal) Arguments of calls to twostate predicates should always be checked for allocatedness on the heap before. But that would prevent a lot of useful scenarios like being about to reason about multiple references being allocated or not.
Add an attribute like {:assume _allocated false} on ghost predicates and functions so that allocation is no longer assumed and object access needs to be guarded. Such functions can never call another function or lemma not marked with {:assume _allocated false}
- In this case, because allocatedness is state, we need a special reads clause saying our function depends on that state. Otherwise, the following would ensure false:
```
method Ooops()
ensures false
{
var o := new O();
assert WasNotAllocatedBefore(o);
label afterAlloc:
o.x := 1;
assert WasNotAllocatedBefore@afterAlloc(o); // No reads clause makes this function have the same value for both heaps!
// Contradiction
}
```

class O { var x: int constructor() {} }

{:assume_allocated false} twostate predicate WasNotAllocatedBefore(o: O) { !old(allocated(o)) }


Put together, we would have the following code that would be sound:

```dafny
method Finallyyyyy()
{
  var o := new O();
  assert WasNotAllocatedBefore(o);
  label afterAlloc:
  o.x := 1;
  assert WasNotAllocatedBefore@afterAlloc(o); // Error, because it reads the allocated bit which was modified for the first heap.
}

class O {
  var x: int
  constructor() {}
}

{:assume_allocated false}
twostate predicate WasNotAllocatedBefore(o: O)
  reads o`allocated // Need a reads clause like this
{ !old(allocated(o)) }

@MikaelMayer Thanks for the bug report. It is indeed a soundness issue.

The above analysis of the issue is not correct, however. Let me first comment on that, and also comment on the DoIt example. Then, I will describe the mysterious situation around the soundness issue.

Checking allocatedness at call sites

Dafny does check, at call sites, that non-new parameters of a twostate predicate are indeed allocated in the "previous state" on which the twostate predicate is invoked. But because of the soundness bug (which I'll describe later), this check is thwarted. Small changes to the example above show this. For example, if you remove the constructor from class O and correspondingly change new O(); to new O;, then you get:

test.dfy(14,36): Error: argument ('o') might not be allocated in the two-state function's previous state

As another example, if you add another parameter to the two-state predicate (like AllocationSoundness(n: int, o: O)) and pass in an argument (like 5) at the call site, then you also get the error about argument o not being allocated in the previous state.

In general, whenever a reference is dereferenced outside the current state, Dafny checks that the object really was allocated.

Functions that read the state

It is difficult to work with ~> functions, which read the state. Such functions seem to best used from methods that will pass them to another function just once, in which case one doesn't need to know what happens when the state changes. The DoIt example tries to call a ~> function in more than one state, and, as the example shows, that doesn't end well.

The DoIt example does reveal a completeness issue in Dafny. Beyond that, the example itself is on shaky grounds.

The completeness issue is that a call c.Monitor(DoTest); (where c is of type Context) fails to prove Monitor's precondition regarding the reads clause of Monitor's argument. This comes down to proving DoTest.requires(c) == {c} in the caller. Dafny doesn't manage to prove this. I didn't look into why, but this problem lies with Dafny.

The DoIt example as a whole is on shaky grounds, however. The problem is that there no reason for method Monitor to believe that a call to DoIt will not destroy the precondition of test. For illustration, I have removed Monitor's precondition about test.reads (since, as I just mentioned, that runs up again a completeness issue in Dafny). I also changed the body of function DoTest and added a body to method DoIt. Less importantly for the example, I also weakened the postcondition of Monitor.

class Context {
  var i: int

  method DoIt()
    modifies this
  {
    i := 16;
  }

  method Monitor(test: Context ~> bool)
     requires test.requires(this)
     modifies this
     ensures test.requires(this) ==> test(this)
     decreases *
  {
    DoIt();
    if !test(this) { // error: precondition violation
      Monitor(test);
      return;
    }
  }
}

function DoTest(c: Context): bool
  reads c
{
  c.i == 15
}

method Test(c: Context)
  modifies c
  decreases *
{
  c.i := 15;
  c.Monitor(DoTest);
}

In this example, Dafny correctly complains that calling test(this) after calling DoIt() may result in a violated precondition. Indeed, with my changes to DoTest and DoIt, there really is a problem.

So, I think the problem with this example application is that DoIt may indeed mess up the precondition of test. The solution, I think, is not to introduce a Dafny quantifier over all heaps, but to introduce yet one more arrow type, one that can read the heap, but that nevertheless is total. It is slightly unfortunate that Dafny's most general arrow type is written ~>, because, if we introduced that extra arrow type, then it would be nice if ~~> could take the place of the current ~> to mean "partial function that may read the heap", and let ~> mean "total function that may read the heap", in analogy to the current partial non-heap-reading --> and the current total non-heap-reading ->.

Back to the soundness issue detected by the Ooops example.

Root cause and fix

The root cause is that the Consequence Axiom for the two-state predicate AllocationSoundness is missing an antecedent that parameter o has to be allocated in the two-state predicate's "previous state". The quantifier in that axiom is currently generated as:

(forall $prevHeap: Heap, $Heap: Heap, n#0: int, o#0: ref :: 
    { _module.__default.AllocationSoundness($prevHeap, $Heap, n#0, o#0) } 
    _module.__default.AllocationSoundness#canCall($prevHeap, $Heap, n#0, o#0)
         || (1 < $FunctionContextHeight
           && 
          $IsGoodHeap($prevHeap)
           && $IsGoodHeap($Heap)
           && $HeapSucc($prevHeap, $Heap)
           && LitInt(0) <= n#0
           && $Is(o#0, Tclass._module.O()))
       ==> $IsAllocBox($Box(o#0), Tclass._module.O(), $prevHeap))

It says that, from rather general assumptions, and certainly no allocatedness assumptions about o#0, one can conclude that o#0 is allocated in the previous heap.

To fix the issue, the long part of the antecedent should also include

$IsAlloc(o#0, Tclass._module.O(), $prevHeap)

In fact, such an antecedent should be included for every non-new parameter of the two-state predicate, and the antecedent should also be used in the Definition Axioms for two-state predicates. Making those changes will fix the soundness bug.

Why so difficult to get bit by the bug?

The soundness bug is not easy to trigger. The Boogie encoding of the Dafny statement

assert AllocationSoundness(x);

is something like this (I'm leaving off some irrelevant parts):

    assert $IsAlloc(x#0, Tclass._module.O(), old($Heap)); // (*)
    ...
    assume AllocationSoundness#canCall(old($Heap), $Heap, x#0);
    assert {:subsumption 0}
        AllocationSoundness#canCall(old($Heap), $Heap, x#0)
         ==> AllocationSoundness(old($Heap), $Heap, x#0) || Lit(true);
    assume AllocationSoundness(old($Heap), $Heap, x#0);

The first assertion (which I marked with ()) is the one that checks the actual argument to be allocated in the state old($Heap). So, if the SMT solver is able to instantiate the unsound Consequence Axiom, then it can prove ().

We want all our axioms to be sound, even if matching patterns (triggers) are ignored. But since Z3 does use matching patterns, then how come it's able to use the Consequence Axiom whose trigger mentions AllocationSoundness(...) and the only information about AllocationSoundness(...) comes after assertion (*) in the Boogie program?

The first thing is that assertion (*) and the subsequent assumptions are in the same Boogie basic block. This goes against our mental model of Boogie programs (or at least my mental model of what that ought to be), but it has been reported before. Note, however, that the calCall assumption comes after (*), so Z3 will not be able to use it. This means that the only way Z3 can use the Consequence Axiom is by proving the "long" part of the antecedent.

The second mysterious thing is that the second assertion I show above matters. This by itself contains two mysteries. One is that, with the form ... ==> ... || true, the assertion is a tautology (well, except for that it needs to use the definition of Lit). So, it seems this line shouldn't matter. The other sub-mystery is that the assertion is marked with {:subsumption 0}, which is intended to say "check this property and then forget it". Apparently, this is not entirely forgotten, since this whole assertion plays a role in Z3 being able to discover the unsound axiom.

The fix

Again, the cause of the soundness bug is the missing antecedent about non-new parameters of the two-state function. This should be fixed in both the Consequence Axiom and the Definition Axioms.

The bug report also shows that, even if the appropriate conditions are checked at the call site, there may be a discrepancy in the corresponding axiom. The canCall mechanism was invented to try to increase performance. Unfortunately, we have never switched to relying solely on canCall, so the current encoding uses that disjunction of canCall and the "long" antecedent, which is probably worse for performance. This example shows that if we would do the extra work to remove the "long" antecedent altogether--and instead rely solely on canCall--then canCall can also help improve the soundness of our encoding. The reason for this is that then the only way to get to use a consequence/definition axiom is to have proved the appropriate things at the call site. That would eliminate the possibility that there could be a discrepancy between the call site and the axiom. (There could be other discrepancies, like between what's checked at the call site and what is assumed by the callee, but at least switching to antecedents with just canCall would eliminate one possible discrepancy--and one that mattered in this example.)

Wow, thanks for having written the time for such a detailed analysis.

For example, if you remove the constructor from class O and correspondingly change new O(); to new O;, then you get (...) As another example, if you add another parameter to the two-state predicate

Indeed, I see that the two variations show that the check was being performed. Good finding! You highlighted that the proof of false does not come from !old(allocated(o)) and old(allocated(o)), but from something else.

It is slightly unfortunate that Dafny's most general arrow type is written ~>, because, if we introduced that extra arrow type, then it would be nice if ~~>

I would have loved to have ~> mean total in my case, and the most general arrow function be ~~>. That would have solved the above problem. However, having these extra types seem to only defer the actual problem. How would you deal if the test itself was a partial function that reads the heap? There are many reasons why such a test would have preconditions, like there is a division:

class Context {
  var i: int

  method DoIt()
    modifies this
  {
    i := 16;
  }

  method Monitor(test: (Context, int) ~> bool)
     requires test.requires(this, 1000)
     modifies this
     ensures test.requires(this, 1000) ==> test(this)
     decreases *
  {
    DoIt();
    if !test(this, 1000) { // error: precondition violation
      Monitor(test);
      return;
    }
  }
}

function DoTest(c: Context, divider: int): bool
  reads c
  requires divider != 0
{
  c.i/divider == 15
}

method Test(c: Context)
  modifies c
  decreases *
{
  c.i := 15;
  c.Monitor(DoTest);
}

Above, the precondition of DoTest does not depend on the heap, so that this function is safe to call even if a method did modify the heap in between. I'm curious of your alternative of a forallheap which by the way does give the adequate expressibility and would not require to introduce the need of a new backward-incompatible arrow symbol.

To fix the issue, the long part of the antecedent should also include $IsAlloc(o#0, Tclass._module.O(), $prevHeap) (...) In fact, such an antecedent should be included for every non-new parameter of the two-state predicate,

I agree in all cases that we should add this requirement for every non-new parameter of the two-state predicate.

Root cause and fix

I see. So Z3 can use for triggers things that appear after an expression somehow. That makes sense since using this trigger makes it possible to prove the point.

I'd love to see a simpler version indeed with only cancall.

Extra comment

I would like us to think carefully of the following case. Mentally, in Dafny, every boolean expression can be refactored to a predicate or twostate predicate (except expressions referring to three different heaps). However, the postcondition of the method MaybeDuplicate below, although it does refer to only two heaps, cannot be refactored as-is, because any refactored predicate would require that the two objects be allocated in the previous state.

class O {
  var x: int
}
method MaybeDuplicate(o: O, create: bool) returns (o2: O)
  ensures !create ==> old(allocated(o2)) && old(o2.x) == o2.x
{
  if create {
    o2 := new O;
  } else {
    o2 := o;
  }
}

Of course, the fix is to strengthen the postcondition and state that !create ==> o2 == o so that we can get rid of the old(allocated(o2)), but I imagine we could have more complex scenarios.

A reply regarding the example in "Extra comment": Yes, you can extract that postcondition into a two-state predicate:

twostate predicate P(new o2: O)
  reads o2
{
  old(allocated(o2)) && old(o2.x) == o2.x
}

method MaybeDuplicate(o: O, create: bool) returns (o2: O)
  ensures !create ==> P(o2)

The new modifier on the parameter of P says that o2 only needs to be allocated in the "later" heap of the two-state predicate.

A reply regarding the example in "Extra comment": Yes, you can extract that postcondition into a two-state predicate:
twostate predicate P(new o2: O)
  reads o2
{
  old(allocated(o2)) && old(o2.x) == o2.x
}

method MaybeDuplicate(o: O, create: bool) returns (o2: O)
  ensures !create ==> P(o2)
The new modifier on the parameter of P says that o2 only needs to be allocated in the "later" heap of the two-state predicate.

Wow, that's great to know, thanks! I hadn't learn about "new" in parameters. Wonderful.

dafny-lang / dafny