GaloisInc / saw-script

The SAW scripting language.
BSD 3-Clause "New" or "Revised" License
442 stars 63 forks source link

`llvm_compositional_extract`: surprising lack of return type detection #2091

Open RyanGlScott opened 3 months ago

RyanGlScott commented 3 months ago

Consider this simple C function, which does nothing but return the value 42:

// test.c
#include <stdint.h>

uint32_t f(void){
  return 42;
}

The goal is to extract f to a Term (which should also return the value 42) using the various LLVM-related extraction commands. Here is a script which demonstrates them all:

// test.saw
enable_experimental;

m <- llvm_load_module "test.bc";

print "";
print "llvm_extract";

f_extracted_term <- llvm_extract m "f";
print_term f_extracted_term;

print "";
print "llvm_compositional_extract (1)";

let f_spec_1 = do {
  llvm_execute_func [];

  llvm_return (llvm_term {{ 42 : [32] }});
};

llvm_compositional_extract m "f" "f_compositional_extracted_term_1" [] false f_spec_1 z3;
print_term (unfold_term ["f_compositional_extracted_term_1"] {{ f_compositional_extracted_term_1 }});

print "";
print "llvm_compositional_extract (2)";

let f_spec_2 = do {
  llvm_execute_func [];

  ret <- llvm_fresh_var "ret" (llvm_int 32);
  llvm_postcond {{ ret == 42 }};
  llvm_return (llvm_term ret);
};

llvm_compositional_extract m "f" "f_compositional_extracted_term_2" [] false f_spec_2 z3;
print_term (unfold_term ["f_compositional_extracted_term_2"] {{ f_compositional_extracted_term_2 }});

Surprisingly, not all of the f-related Terms return 42:

$ ~/Software/saw-1.1/bin/saw test.saw
[14:01:32.057] Loading file "/home/ryanscott/Documents/Hacking/SAW/test.saw"
[14:01:32.059] 
[14:01:32.059] llvm_extract
[14:01:32.060] bvNat 32 42
[14:01:32.060] 
[14:01:32.060] llvm_compositional_extract (1)
[14:01:32.086] Verifying f ...
[14:01:32.086] Simulating f ...
[14:01:32.087] Checking proof obligations f ...
[14:01:32.087] Proof succeeded! f
[14:01:32.139] (-empty-)
[14:01:32.139] 
[14:01:32.139] llvm_compositional_extract (2)
[14:01:32.174] Verifying f ...
[14:01:32.174] Simulating f ...
[14:01:32.175] Checking proof obligations f ...
[14:01:32.175] Proof succeeded! f
[14:01:32.239] bvNat 32 42

f_extracted_term (which is extracted using llvm_extract) and f_compositional_extracted_term_2 (which is extracted using llvm_compositional_extract) both return 42, as expected. f_compositional_extracted_term_1, on the other hand, returns () instead of 42! This is very surprising, considering that f_spec_1 is only a minor variation on f_spec_2, and it's arguably more natural to write f_spec_1 than f_spec_2.

SAW's :help output for the llvm_compositional_extract function has this to say:

    llvm_compositional_extract : LLVMModule -> String -> String -> [LLVMSpec] -> Bool -> LLVMSetup () -> ProofScript () -> TopLevel LLVMSpec

Translate an LLVM function directly to a Term. The parameters of the
Term are the input parameters of the LLVM function: the parameters
passed by value (in the order given by `llvm_exec_func`), then
the parameters passed by reference (in the order given by
`llvm_points_to`). The Term is the tuple consisting of the
output parameters of the LLVM function: the return parameter, then
the parameters passed by reference (in the order given by
`llvm_points_to`).

I suppose there is some ambiguity about what "return parameter" means in this context. Nevertheless, I would argue that llvm_return (llvm_term {{ 42 : [32] }}) should appear in the return type of the extracted Term. This would be far less surprising and would be more consistent with the behavior of llvm_extract.