GaloisInc / pate

Patches Assured up to Trace Equivalence
Other
15 stars 3 forks source link

Allow stubs to declare global data that is propagated across CFARs in order to support condition propagation #435

Open danmatichuk opened 2 months ago

danmatichuk commented 2 months ago

Backwards propagation of equivalence conditions (or assumptions) involves binding the post-state of a CFAR to the pre-state variables of the equivalence condition for a subsequent CFAR. For simple predicates this is straightforward:

Problem

Consider the following CFARs:

A:
  original: 
    r0 <- r1
    goto B
  patched: 
    r0 <- 3
    goto B

B:
  original:
    put(r0)
  patched:
    put(2)

An equivalence condition for B would be: r0{O} == 2. If we propagate this backwards then we get an equivalence condition on A that says r1{O} == 2

When an equivalence condition contains additional "free" variables that are introduced as a result of an under-specified stub. These variables can't be meaningfully interpreted outside of the CFAR where they were introduced, and therefore it is not possible to generate sufficient pre-conditions in order to imply predicates that contain them. e.g. given an alternative definition for B:

B':
  original:
    read(r0,r1)
    put(r1)
  patched:
    read(2, r1)
    put(r1)

Where read(i,x) := x <- read_val(i) and read_val(n) is a fresh, uninterpreted function that returns and n-width bitvector, with the intended semantics that each read pulls input from some external source (e.g. a device or filesystem).

The equivalence condition for B' would therefore be read_val(r0{0}) == read_val(2). The problem with this specification is that read_val cannot be meaningfully propagated back to A: doing this naively results in an equivalence condition read_val(r1{0}) == read_val(2).

The problem with this condition is that it is unclear what read_val means outside of the scope of B', since indeed there may be other executions of read elsewhere in the program (or via alternative paths to B') which should be formally distinct as they each represent separate reads.

This is immediately apparent in #421: each time B' is symbolically executed a fresh read_val is introduced. The immediate consequence of this is that any assertions/assumptions added during one analysis step of B' will be meaningless on subsequent analysis steps. This can result in an infinite loop during propagation attempts, since the verifier can never prove that it has introduced a sufficient precondition to imply a desired postcondition.

Solution:

To handle this properly, each stub needs to be able to introduce ad-hoc global state that is tracked as part of the normal state propagation process. This is already implemented for malloc explicitly, which maintains an extra piece of ghost state representing the maximum number of allocations that have been made at any program point. To implement read properly we would therefore introduce an extra piece of ghost state representing an index into some uninterpreted stream of input. We could then define read as follows:

read(i,x) := 
    idx <- read_global_index
    x <- read_val(idx, n)
    read_global_index <- idx + 1

Where read_val is a globally-unique uninterpreted function, and read_global_index is an additional piece of global state that is tracked across CFARs (i.e. in the equivalence and value domains).