Open hanno-becker opened 1 week ago
@remi-delmas-3000 @tautschnig Do you have ideas here?
Hi, I don't think its a problem with framing per se (as in checking that side effects are included in the assigns clauses, checking assigns clause inclusion, etc.). What seems to be happening is that solvers struggle with quantified proof obligations.
I encoded the contract transformation manually (to get rid of frame condition checking) which gives the following straight-line code:
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#define A 12
#define B 42
#define SZ 256
void harness(void)
{
int len, start, sz;
__CPROVER_assume(2 <= len && len <= SZ && (len & 1) == 0);
__CPROVER_assume(0 <= start && start < len);
__CPROVER_assume(1 <= sz && sz <= len / 2 && start + 2 * sz <= len);
int8_t *x = malloc(len * sizeof(int8_t));
__CPROVER_assume(x != NULL);
__CPROVER_assume(__CPROVER_forall {
int i;
0 <= i && i < start ==> x[i] <= B
});
__CPROVER_assume(__CPROVER_forall {
int i;
start <= i && i < len ==> x[i] <= A
});
/// inline call to bump_A_to_B(x + start, sz);
int8_t *call_arg1 = x + start;
int call_arg2 = sz;
{
int8_t *x = call_arg1;
int sz = call_arg2;
__CPROVER_assert(
__CPROVER_forall {
int i;
0 <= i && i < 2 * sz ==> x[i] <= A
},
"precondition replaced call");
__CPROVER_havoc_slice(x, 2 * sz);
__CPROVER_assume(__CPROVER_forall {
int i;
0 <= i && i < 2 * sz ==> x[i] <= B
});
}
#ifdef POST1
__CPROVER_assert(
__CPROVER_forall {
int i;
0 <= i && i< start + 2 * sz ==> x[i] <= B
},
"post condition 1");
#endif
#ifdef POST2
__CPROVER_assert(
__CPROVER_forall {
int i;
start + 2 * sz <= i && i<len ==> x[i] <= A
},
"post condition 2");
#endif
The I analyzed POST1 and POST2 separately and together, got these results:
| | z3 | bw |
| ------------ | ------------ | ---------- | ---------- |
| no contracts | post 1 only | 42m41.954s | 33m2.139s |
| | post 2 only | 3m1.328s | 0m14.932s |
| | post 1+2 | 25m34.864s | 17m30.813s |
| ------------ | ------------ | ---------- | ---------- |
| contracts | post 1 only | 19m14.907s | 19m40.051s |
| | post 2 only | 0m22.193s | 2m30.510s |
| | post 1+2 | 24m3.121s | 26m26.308s |
So using contracts vs the straigh-line code that does not do frame condition checking does not matter that much on the overall runtime.
What we see is that solvers struggle to prove the first post condition that says that x[i] <= B
holds in the extended range [0, start + 2 *sz)
;
[0, start)
you have to prove that havoc_slice
did not change the values in the array and then apply the precondition over range [0, start)
[start, start + 2*sz)
you have to use the post condition of bump_A_to_B
which tells you it holds on the havoced interval [0, 2*sz)
of the subarray that starts at offset start
in x
.If we split all the pre and post conditions over ranges [0, start)
, [start, start + 2 * sz)
and [start + 2 * sz, len)
we should make the proof easier because on [0, start)
and [start + 2 * sz, len)
that are not havoced the postconditions stem directly form the preconditions. And indeed that's what we see.
| | z3 | bw |
| ------------ | ------------ | ---------- | ---------- |
| no contracts | post 1 only | 0m16.543s | 2m29.401s |
| | post 2 only | 10m31.124s| 32m26.411s |
| | post 3 only | 0m15.207s | 2m11.520s |
| | all | 15m24.019s| 34m1.366s |
| ------------ | ------------ | ---------- | ---------- |
| contracts | post 1 only | 0m40.772s | 2m11.435s |
| | post 2 only | 20m37.789s| 14m32.638s |
| | post 3 only | 0m26.735s | 2m27.044s |
| | all | 26m12.638s| 19m34.666s |
| ------------ | ------------ | ----------| ---------- |
We see that POST1 on [0, start)
and POST3 on [start + 2 * sz, len)
are solved quickly, and POST 2 on [start, start + 2 * sz)
is hard to solve.
CBMC encodes bounded arrays in SMT as a pair of unbounded array and a length.
In this program we have two versions of array x
, x0
and x1
that represent the pre- and post- havoc_slice
states.
(Slightly simplified from the actual encoding)
(declare-const x0 (Array (_ BitVec 64) (_ BitVec 8)))
(declare-const x0_len (_ BitVec 64))
(declare-const x1 (Array (_ BitVec 64) (_ BitVec 8)))
(declare-const x1_len (_ BitVec 64))
The havoc_slice operation uses an intermediate nondeterministic array and a forall constraint to define x1
as equal to x0 in the range [0, start)
or [start + 2 * sz, x0_len)
(declare-const nondet_bytes (Array (_ BitVec 64) (_ BitVec 8)))
(declare-const nondet_bytes_len (_ BitVec 64))
(assert (= nondet_bytes_len (* 2 sz)))
;; havoc slice post condition
;; forall idx,
;; 0 <= idx && idx < x1_len =>
;; x1[idx] = (idx < start || idx >= start + 2*sz ) ? x0[idx] : nondet_bytes[idx - start]
(assert
(forall ((idx (_ BitVec 64)))
(=>
(and
(bvule (_ bv0 64) idx)
(bvult idx x1_len))
(=
(select x1 idx)
(ite
(or
(not (bvsge idx ((_ zero_extend 8) start)))
(bvuge idx
(bvadd
((_ zero_extend 8) start)
((_ sign_extend 32) (bvmul (_ bv2 32) sz)))))
(select x0 idx)
(select nondet_bytes (bvadd idx (bvneg ((_ zero_extend 8) start)))))))))
The post condition on the range [start, start+2*sz) is not expressed using a quantifier, instead, we introduce a free idx
constant and an assertion.
We see that the assert is guarded by a lengthy path condition (and true B149 B150 B151 B152 B153 B154 B155)
that encodes the reachability of that location under along all possible paths in the program.
(there are short-cutting operations and points of failure so there's many ways to reach the assert).
;; POST2 proof obligation: forall idx, idx1 < 0 || idx1 >= start + 2 * sz || !(x[idx + start] >= 43)
(declare-fun idx1 () (_ BitVec 32))
(define-fun B196 () Bool
;; path condition
(=> (and true B149 B150 B151 B152 B153 B154 B155)
(or
(not (bvsge idx1 (_ bv0 32)))
(bvsge idx1 (bvadd start (bvmul (_ bv2 32) sz)))
(not
(bvsge
((_ sign_extend 24)
(select
x1
(bvadd
((_ zero_extend 8) ((_ extract 55 0) start))
((_ sign_extend 32) idx1))))
(_ bv43 32))))))
(assert (not B196))
(check-sat)
So really both z3 and bitwuzla struggle to establish that (not B196)
is UNSAT by grounding the quantified axioms that encode the havoc operation by connecting x0
, nondet_bytes
and x1
and the post condition on a slice within x1
that's shifted right from start
bytes.
I've seen that the proof times are sensitive to the max array len
so I'm wondering wether the solvers end up proving it for each individual cell.
Maybe the encoding of the post condition could also use a quantifier instead of a ground constant for idx1
?
Any suggestions on how to improve the encoding ?
Hi, I played a little bit more with the encoding and was able to bring the runtime down to 1 second for bitwuzla, 20 seconds for z3 with some tweaks in the encoding, while scaling to the possible allocation size.
I was able to come up with this. The main changes are:
bump_A_B
on the nondet_byte
, before using it to havoc x
using __CPROVER_array_replace(x, nondet_bytes)
;#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <stdbool.h>
#define A 12
#define B 42
#define SZ (__CPROVER_max_malloc_size-1)
void harness() {
size_t len, start, sz;
__CPROVER_assume(2 <= len && len <= SZ);
__CPROVER_assume((len & 1) == 0);
__CPROVER_assume(0 <= start && start < len);
__CPROVER_assume(1 <= sz && sz <= len / 2);
__CPROVER_assume(start + 2 * sz <= len);
int8_t *x = malloc(len * sizeof(int8_t));
__CPROVER_assume(x != NULL);
__CPROVER_assume(__CPROVER_forall {
size_t i;
0 <= i && i < start ==> x[i] <= B
});
__CPROVER_assume(__CPROVER_forall {
size_t i;
start <= i && i < len ==> x[i] <= A
});
// inlined call arguments
int8_t *call_arg1 = x + start;
size_t call_arg2 = sz;
{
int8_t *x = call_arg1;
size_t sz = call_arg2;
__CPROVER_assert(
__CPROVER_forall {
size_t i;
0 <= i && i < 2 * sz ==> x[i] <= A
},
"precondition replaced call");
int8_t nondet_bytes[2*sz];
// assume the post condition directly on the havoced bytes
__CPROVER_assume(__CPROVER_forall {
size_t i;
(0 <= i && i < (2 * sz)) ==> nondet_bytes[i] <= B
});
__CPROVER_array_replace(x, nondet_bytes);
}
// assert post condition
bool post1 = __CPROVER_forall {
size_t i;
(0 <= i && i < (start + 2 * sz)) ==> x[i] <= B
};
assert(post1);
bool post2 = __CPROVER_forall {
size_t i;
((start + 2 * sz) <= i && i < len) ==> x[i] <= A
};
assert(post2);
}
** 0 of 36 failed (1 iterations)
VERIFICATION SUCCESSFUL
cbmc --function harness -DENC15 hanno_simpler.c --z3 -DMAX_SIZE 18.77s user 0.18s system 98% cpu 19.153 total
I think makes it easier for z3 to trigger quantifiers since guards match syntactically. Below I explain what I think are the problems with the original formulation.
The contract replacement of bump_A_B
is represented as
// inlined call to bump_A_B
int8_t *call_arg1 = x + start;
size_t call_arg2 = sz;
{
int8_t *x = call_arg1;
size_t sz = call_arg2;
assert( __CPROVER_forall {
size_t i;
0 <= i && i < 2 * sz ==> x[i] <= A
});
__CPOVER_havoc_slice(x, 2 * sz);
__CPROVER_assume(__CPROVER_forall {
size_t i;
(0 <= i && i < (2 * sz)) ==> x[i] <= B
});
}
__CPOVER_havoc_slice(x, 2*sz);
desugars to:
uint8_t nondet_bytes[2*sz];
__CPROVER_array_replace(x, nondet_bytes);
As described above, at the SMT-level, all arrays x
, nondet_bytes
, and the updated x
are represented using a pair formed of a functional SMT array and a size, and __CPROVER_array_replace(x,nondet_bytes)
turns into a quantified axiom linking all three arrays (using Int instead of BitVec
types here to simplify)
;; x before havoc_slice
(declare-const x!0 (Array Int Int))
(declare-const x_len!0 Int)
;; nondet byte array used by havoc_slice
(declare-const nondet_bytes (Array Int Int))
(declare-const nondet_bytes_len Int)
;; x after havoc_slice
(declare-const x!1 (Array Int Int))
(declare-const x_len!1 Int)
;; connect x!0, nondet_bytes, x!1 havoc_slice
(assert
(forall ((i Int))
(=> (and (<= 0 i) (< i x_len!1))
(= (select x!1 i)
(ite (and (<= start i) (< i (+ start (* 2 sz))))
(select nondet_bytes (- i start))
(select x!0 i))
The post condition of bump_A_B
is assumed on a slice of the array x!1
;; assumption post call
(assert
(forall ((i Int))
(=> (and (<= 0 i) (< i (* 2 sz)))
(<= (select x!1 (+ start i)) B)
The final proof obligation is expressed on x!1
and is almost the same as the post condition of bump_A_B
.
So we assumed something like:
;; bump_A_B post condition
__CPROVER_assume(__CPROVER_forall {
size_t i;
(0 <= i && i < (2 * sz)) ==> (x+start)[i] <= B
});
}
and we are trying to prove this
;; top level post condition
__CPROVER_assert(__CPROVER_forall {
size_t i;
(start <= i && i < start + (2 * sz)) ==> x[i] <= B
});
}
The start
offset moved from the RHS(x+start)[i]
LHS of the proof obligation (start <= i && i < start + (2 * sz))
, I think that's what prevents triggering and unification in the solver.
Moreover, instead of using the exact quantified formula we specified in the source, CBMC eliminates the quantifier by introducing a witness variable for the quantified index and grounding, while also losing the implicative structure.
(declare-const i Int)
(define-fun POST1 () Bool
(=> some_path_condition
(or (> start i) (>= i (+ start (* 2 sz)) (<= (select x!1 i) B)))
(assert (not POST1))
(check-sat)
Given that quantifier instantiation is really dependent on syntactic triggers I suspect that the z3 or bitwuzla cannot trigger anything anymore and are stuck enumerating counter examples and learning clauses one by one indirectly rather that discharging a forall by grouding a single forall.
int
instead of size_t
When using size_t
indices the runtime does not seem to increase with the max array size and is counted in seconds (even sub-second with bitwuzla).
SZ | bitwuzla | z3
--------|-----------|----------
256 | 0m0.807s |0m2.010s
512 | 0m0.828s |0m0.778s
1024 | 0m0.879s |0m2.765s
2048 | 0m0.875s |0m4.281s
4096 | 0m0.902s |0m30.581s
8192 | 0m1.017s |0m6.971s
16384 | 0m0.947s |0m2.056s
32768 | 0m1.007s |0m2.108s
65536 | 0m1.094s |0m2.388s
131072 | 0m1.032s |0m7.707s
262144 | 0m1.172s |0m27.049s
Using int
instead of size_t
sends the runtime to minutes. Using int
causes a lot more sign-extend, zero-extend, byte-extracts, ... in the formula which may make it harder to instantiate quantifiers properly.
The last observation I made is that z3 fares a little bit better with int
is used as index rather than size_t
. Some encoding patterns that benefited z3 were sometimes harming bitwuzla, so different solvers may require different encoding tricks.
I'm observing that wrapping a function operating on a buffer by another function which merely 'frames' it to operate on a slice of the buffer, leads to very slow proof times.
Here is a simple example, where the sub-function bumps a bound on the sub buffer. This is a minimal version of an example arising in practice when verifying the Number Theoretic Transform (https://github.com/pq-code-package/mlkem-native/pull/371).
I was expecting the proof to be almost instantaneous, but for
SZ==256
, which is the value I am working with in practice, I have not yet seen it terminate.The issue is the same between Z3 and Bitwuzla.
CBMC Version 6.3.1