GaloisInc / saw-script

The SAW scripting language.
BSD 3-Clause "New" or "Revised" License
437 stars 63 forks source link

MIR override incorrectly rejected when applied to multiple `const` slice arguments #2064

Closed RyanGlScott closed 1 day ago

RyanGlScott commented 1 month ago

Consider this program:

// test.rs
const X: &[u32] = &[0; 4];
const Y: &[u32] = &[1; 4];

pub fn f(x: &[u32], y: &[u32]) -> bool {
    x[0] == y[0]
}

pub fn g() -> bool {
    f(X, Y)
}

I would expect SAW to be able to verify f and g against f_spec and g_spec below:

// test.saw
enable_experimental;

let f_spec = do {
  x_ref <- mir_alloc (mir_array 4 mir_u32);
  x_val <- mir_fresh_var "x" (mir_array 4 mir_u32);
  mir_points_to x_ref (mir_term x_val);

  y_ref <- mir_alloc (mir_array 4 mir_u32);
  y_val <- mir_fresh_var "y" (mir_array 4 mir_u32);
  mir_points_to y_ref (mir_term y_val);

  mir_execute_func [mir_slice_value x_ref, mir_slice_value y_ref];

  mir_return (mir_term {{ x_val@0 == y_val@0 }});
};

let g_spec = do {
  mir_execute_func [];

  mir_return (mir_term {{ False }});
};

m <- mir_load_module "test.linked-mir.json";

f_ov <- mir_verify m "test::f" [] false f_spec z3;
mir_verify m "test::g" [f_ov] false g_spec z3;

Surprisingly, however, SAW fails when using the f_ov override in a compositional proof of g_spec:

$ saw-rust test.rs
$ cabal run exe:saw -- test.saw
[17:07:05.306] Loading file "/home/ryanscott/Documents/Hacking/Haskell/sandbox/saw-script/test.saw"
[17:07:05.346] Verifying test/c3adb09d::f[0] ...
[17:07:05.360] Simulating test/c3adb09d::f[0] ...
[17:07:05.361] Checking proof obligations test/c3adb09d::f[0] ...
[17:07:05.377] Proof succeeded! test/c3adb09d::f[0]
[17:07:05.402] Verifying test/c3adb09d::g[0] ...
[17:07:05.411] Simulating test/c3adb09d::g[0] ...
[17:07:05.411] Matching 1 overrides of  test/c3adb09d::f[0] ...
[17:07:05.411] Stack trace:
"mir_verify" (/home/ryanscott/Documents/Hacking/Haskell/sandbox/saw-script/test.saw:27:1-27:11)
Symbolic execution failed.
Abort due to assertion failure:
  test.rs:10:10: 10:11: error: in test/c3adb09d::g[0]
  attempted to read empty mux tree

This is very strange. What's more, crux-mir is able to simulate the same program (using g as a #[crux::test] entrypoint), which suggests that the fault lies with SAW. Moreover, SAW is able to handle the following, simpler program without issues:

// works.rs
const X: &[u32] = &[0; 4];

pub fn f(x: &[u32]) -> u32 {
    x[0]
}

pub fn g() -> u32 {
    f(X)
}
// works.saw
enable_experimental;

let f_spec = do {
  x_ref <- mir_alloc (mir_array 4 mir_u32);
  x_val <- mir_fresh_var "x" (mir_array 4 mir_u32);
  mir_points_to x_ref (mir_term x_val);

  mir_execute_func [mir_slice_value x_ref];

  mir_return (mir_term {{ x_val@0 }});
};

let g_spec = do {
  mir_execute_func [];

  mir_return (mir_term {{ 0 : [32] }});
};

m <- mir_load_module "wat.linked-mir.json";

f_ov <- mir_verify m "wat::f" [] false f_spec z3;
mir_verify m "wat::g" [f_ov] false g_spec z3;

This suggests that the problem only arises when an override has multiple references as arguments. This led me to discover that if you comment out this line of code:

https://github.com/GaloisInc/saw-script/blob/45d64dd7e56c2879cd0a684be321d579899e3404/src/SAWScript/Crucible/MIR/Override.hs#L947

Then SAW will accept the specs in test.saw that it previously rejected. I'm not exactly sure what is going on, but this suggests that something is rather broken in the way that disjointness checking is currently implemented.

RyanGlScott commented 1 month ago

Some investigation reveals that the culprit is likely the fact that you cannot check Const_RefRoots for equality, nor can you check if they overlap. In another, more complicated test case (see https://github.com/GaloisInc/formal-verso/issues/4), we've managed to observe the Cannot compare Const_RefRoots error message that's used in these pieces of code. Note that we are using const string slices, which is one of the ways that a Const_RefRoot-based reference can arise in crucible-mir.

This limitation is unfortunate, and I'm unclear how to fix this issue without at least partially addressing this limitation. I decided to ask @spernsteiner what he thought:

Patches.md has some relevant notes:

... we don't have a good way to decide whether two Const_RefRoots are "the same object" or not. (We could assign them a Nonce for identity, but since the point of Const_RefRoot is to support muxing, we'd have to mux the nonces, which makes things much more complicated. And we can't simply declare that all Const_RefRoots containing the same value are the same object, because we don't have a generic equality test that covers all crucible types.)

It looks like Const_RefRoot is mostly produced by transConstVal. I wonder if we could get rid of Const_RefRoot entirely by bringing our const/static handling more in line with rustc. Specifically, now that rustc tracks explicit static allocations even for things like string literals, we could consider creating a Crucible global variable for each allocation and having transConstVal produce GlobalVar_RefRoot pointing to those allocations.

There's already a ConstStaticRef case in transConstVal, which would probably cover most cases. (It doesn't handle pointers into the interior of an allocation, but it looks like the other transConstVal cases don't currently handle that either.) The main change would probably be ensuring that mir-json emits all the right static allocations and uses ConstStaticRef instead of ConstSlice etc

RyanGlScott commented 1 month ago

Here is a rough outline for how we might go about fixing this:

spernsteiner commented 1 month ago
  • Anyway, about those mir-json code paths for constant slices. You can find them here and here. I suspect that we will want to change these so that we create top-level allocations for each constant slice

Ideally those str/bstr/slice special cases could all be deleted and all const allocs would be treated uniformly. Two things that might be blocking this:

  1. mir-json might not support fat pointer constants. From looking briefly at try_render_ref_opty, my guess is if you delete the special cases, it will output the pointer correctly but omit the metadata (slice/str length). The bstr case should just work, though (the length is statically known).
  2. IIRC crucible-mir has a special repr for pointers to slice/str, which might not be compatible with the standard struct/tuple reprs that already have working ConstVals.

I think ConstStr, ConstSlice, and ConstByteStr should ideally all be deleted, with the first two combined into a ConstPtrWithLen that takes a const pointer (usually ConstStaticRef, but it might sometimes be ConstRawPtr for empty/zst slices) and a usize length and combines them into the appropriate slice repr. Then mir-json can have a single case in try_render_ref_opty for "has length metadata" instead of separate slice and str cases (the bstr case I suspect can just be removed). This could potentially be expanded to cover custom DSTs (e.g. &RefCell<[T]>) and/or vtable metadata (&dyn Trait) in the future.

sauclovian-g commented 5 days ago

What I've done is:

I have kept ConstStrRef and ConstSliceRef separate because conflating string slices and ordinary array slices didn't seem like a good idea. They can definitely be folded together if you think that's safe/advisable/adequately future-proof.

There are still four other references to constMirRef so we can't get rid of Const_RefRoots entirely yet.