Backwards propagation of equivalence conditions (or assumptions) involves binding the post-state of a CFAR to the pre-state variables of the equivalence condition for a subsequent CFAR. For simple predicates this is straightforward:
Problem
Consider the following CFARs:
A:
original:
r0 <- r1
goto B
patched:
r0 <- 3
goto B
B:
original:
put(r0)
patched:
put(2)
An equivalence condition for B would be: r0{O} == 2. If we propagate this backwards then we get an equivalence condition on A that says r1{O} == 2
When an equivalence condition contains additional "free" variables that are introduced as a result of an under-specified stub. These variables can't be meaningfully interpreted outside of the CFAR where they were introduced, and therefore it is not possible to generate sufficient pre-conditions in order to imply predicates that contain them. e.g. given an alternative definition for B:
Where read(i,x) := x <- read_val(i) and read_val(n) is a fresh, uninterpreted function that returns and n-width bitvector, with the intended semantics that each read pulls input from some external source (e.g. a device or filesystem).
The equivalence condition for B' would therefore be read_val(r0{0}) == read_val(2). The problem with this specification is that read_val cannot be meaningfully propagated back to A: doing this naively results in an equivalence condition
read_val(r1{0}) == read_val(2).
The problem with this condition is that it is unclear what read_val means outside of the scope of B', since indeed there may be other executions of read elsewhere in the program (or via alternative paths to B') which should be formally distinct as they each represent separate reads.
This is immediately apparent in #421: each time B' is symbolically executed a fresh read_val is introduced. The immediate consequence of this is that any assertions/assumptions added during one analysis step of B' will be meaningless on subsequent analysis steps. This can result in an infinite loop during propagation attempts, since the verifier can never prove that it has introduced a sufficient precondition to imply a desired postcondition.
Solution:
To handle this properly, each stub needs to be able to introduce ad-hoc global state that is tracked as part of the normal state propagation process. This is already implemented for malloc explicitly, which maintains an extra piece of ghost state representing the maximum number of allocations that have been made at any program point. To implement read properly we would therefore introduce an extra piece of ghost state representing an index into some uninterpreted stream of input. We could then define read as follows:
Where read_val is a globally-unique uninterpreted function, and read_global_index is an additional piece of global state that is tracked across CFARs (i.e. in the equivalence and value domains).
Backwards propagation of equivalence conditions (or assumptions) involves binding the post-state of a CFAR to the pre-state variables of the equivalence condition for a subsequent CFAR. For simple predicates this is straightforward:
Problem
Consider the following CFARs:
An equivalence condition for B would be:
r0{O} == 2
. If we propagate this backwards then we get an equivalence condition on A that saysr1{O} == 2
When an equivalence condition contains additional "free" variables that are introduced as a result of an under-specified stub. These variables can't be meaningfully interpreted outside of the CFAR where they were introduced, and therefore it is not possible to generate sufficient pre-conditions in order to imply predicates that contain them. e.g. given an alternative definition for B:
Where
read(i,x) := x <- read_val(i)
andread_val(n)
is a fresh, uninterpreted function that returns and n-width bitvector, with the intended semantics that eachread
pulls input from some external source (e.g. a device or filesystem).The equivalence condition for B' would therefore be
read_val(r0{0}) == read_val(2)
. The problem with this specification is thatread_val
cannot be meaningfully propagated back toA
: doing this naively results in an equivalence conditionread_val(r1{0}) == read_val(2)
.The problem with this condition is that it is unclear what
read_val
means outside of the scope ofB'
, since indeed there may be other executions ofread
elsewhere in the program (or via alternative paths toB'
) which should be formally distinct as they each represent separate reads.This is immediately apparent in #421: each time B' is symbolically executed a fresh
read_val
is introduced. The immediate consequence of this is that any assertions/assumptions added during one analysis step of B' will be meaningless on subsequent analysis steps. This can result in an infinite loop during propagation attempts, since the verifier can never prove that it has introduced a sufficient precondition to imply a desired postcondition.Solution:
To handle this properly, each stub needs to be able to introduce ad-hoc global state that is tracked as part of the normal state propagation process. This is already implemented for
malloc
explicitly, which maintains an extra piece of ghost state representing the maximum number of allocations that have been made at any program point. To implementread
properly we would therefore introduce an extra piece of ghost state representing an index into some uninterpreted stream of input. We could then defineread
as follows:Where
read_val
is a globally-unique uninterpreted function, andread_global_index
is an additional piece of global state that is tracked across CFARs (i.e. in the equivalence and value domains).