GaloisInc / saw-script

The SAW scripting language.
BSD 3-Clause "New" or "Revised" License
442 stars 63 forks source link

SAW panics on implicit `char*` to `char` conversion #1851

Open bboston7 opened 1 year ago

bboston7 commented 1 year ago

Given the C program:

void test(void) {
  char c = "some string";
}

and SAW proof:

m <- llvm_load_module "test.bc";                                                   

let spec = do { 
  llvm_execute_func [];
};

llvm_verify m "test" [] true spec z3;

SAW panics with:

[21:17:31.450] You have encountered a bug in Crucible's implementation.
*** Please create an issue at https://github.com/GaloisInc/crucible/issues

%< --------------------------------------------------- 
  Revision:  f4145fbed96245f385a91ef3a32e6737df1075ff
  Branch:    HEAD
  Location:  MemModel.packMemValue
  Message:   Type mismatch when storing value.
             *** Expected storable type: bitvectorType 1
             *** Given crucible type: IntrinsicRepr LLVM_pointer [BVRepr 64]
CallStack (from HasCallStack):
  panic, called at src/Lang/Crucible/Panic.hs:11:9 in crucible-0.6-inplace:Lang.Crucible.Panic
  panic, called at src/Lang/Crucible/LLVM/MemModel.hs:1622:3 in crucible-llvm-0.4-inplace:Lang.Crucible.LLVM.MemModel
%< --------------------------------------------------- 

Note that the char c declaration is not a typo. Although Clang complains about the implicit conversion from char* to char, it does compile. I don't think SAW should panic on C code that compiles, so we should either support this conversion, or produce a useful error about the cast.

RyanGlScott commented 1 year ago

This isn't a SAW bug so much as it is a Crucible one. In fact, you can also trigger this panic using just crux-llvm if you tweak the code slightly:

int main(void) {
  char c = "some string";
  return 0;
}
$ crux-llvm -O0 test.c 
[Crux] Using pointer width: 64 for file crux-build/crux~test.bc
[Crux] Simulating function main
You have encountered a bug in Crucible's implementation.
*** Please create an issue at https://github.com/GaloisInc/crucible/issues

%< --------------------------------------------------- 
  Revision:  ad4a553487eeb5c6bbb5abf4bde26af905bf0254
  Branch:    master (uncommited files present)
  Location:  MemModel.packMemValue
  Message:   Type mismatch when storing value.
             *** Expected storable type: bitvectorType 1
             *** Given crucible type: IntrinsicRepr LLVM_pointer [BVRepr 64]
CallStack (from HasCallStack):
  panic, called at src/Lang/Crucible/Panic.hs:11:9 in crucible-0.6-inplace:Lang.Crucible.Panic
  panic, called at src/Lang/Crucible/LLVM/MemModel.hs:1622:3 in crucible-llvm-0.4-inplace:Lang.Crucible.LLVM.MemModel
%< ---------------------------------------------------

Here is the LLVM bitcode that results from this code:

define dso_local i32 @main() #0 {
entry:
  %retval = alloca i32, align 4
  %c = alloca i8, align 1
  store i32 0, i32* %retval, align 4
  store i8 ptrtoint ([12 x i8]* @.str to i8), i8* %c, align 1
  ret i32 0
}

Because of the use of ptrtoint, I suspect the culprit is this code:

    L.PtrToInt -> return x

When crucible-llvm translates ptrtoint <cexpr> in a constant expression, it simply returns <cexpr>, regardless of what type <cexpr> is. This feels extremely sketchy, and moreover, it performs fewer checks than the code that translates non-constant uses of ptrtoint:

    L.PtrToInt -> do
       llvmTypeAsRepr outty $ \outty' ->
         case (asScalar x, outty') of
           (Scalar _archProxy (LLVMPointerRepr w) _, LLVMPointerRepr w')
              | Just Refl <- testEquality w PtrWidth
              , Just Refl <- testEquality w' PtrWidth -> return x
           _ -> fail (unlines ["pointer-to-integer conversion failed", showI])

This actually checks to see if the type of the argument is a pointer of the same width as the output type, explicitly failing if this is not the case. And indeed, if you tweak the original example further:

int main(void) {
  char *c1 = "some string";
  char c2 = c1;
  return 0;
}

You will trigger this explicit failure:

$ crux-llvm -O0 test.c 
[Crux] Using pointer width: 64 for file crux-build/crux~test.bc
[Crux] Attempting to prove verification conditions.
[Crux] *** debug executable: results/test/debug-1
[Crux] *** break on line: 1
[Crux] Found counterexample for verification goal
[Crux]   :1:0: error: in _start
[Crux]   Failure encountered while generating a Crucible CFG: at test.c:3:13: pointer-to-integer conversion failed
[Crux]   ptrtoint i8* %4 to i8
[Crux] 
[Crux] Goal status:
[Crux]   Total: 1
[Crux]   Proved: 0
[Crux]   Disproved: 1
[Crux]   Incomplete: 0
[Crux]   Unknown: 0
[Crux] Overall status: Invalid.

Ideally, the original example would trigger a similar failure.

RyanGlScott commented 1 year ago

A relevant commit is https://github.com/GaloisInc/crucible/commit/65bda9d22c2b3755e6fff855bdb54224359d8337, which removes some checks during constant ptrtoint translation. (I'm not yet sure if the program above would have been rejected under these checks.)

RyanGlScott commented 1 year ago

A quick patch that suffices to reject the original program:

diff --git a/crucible-llvm/src/Lang/Crucible/LLVM/Translation/Constant.hs b/crucible-llvm/src/Lang/Crucible/LLVM/Translation/Constant.hs
index 895d6453..6cfc9625 100644
--- a/crucible-llvm/src/Lang/Crucible/LLVM/Translation/Constant.hs
+++ b/crucible-llvm/src/Lang/Crucible/LLVM/Translation/Constant.hs
@@ -852,7 +852,12 @@ evalConv expr op mt x = case op of
       -> return $ FloatConst f

     L.IntToPtr -> return x
-    L.PtrToInt -> return x
+    L.PtrToInt
+      |  IntType n <- mt
+      ,  n == natValue ?ptrWidth
+      -> return x
+      |  otherwise
+      -> throwError "Unequal sizes"

     _ -> badExp "unexpected conversion operation"

I'm not yet clear if this is sufficiently permissive to cover everything that we care about, but at the very least, the crux-llvm test suite continues to pass with this change.


As a side note, the LLVM Language Reference Manual entry for ptrtoint actually doesn't require that the size of the pointer be equal to the size of the int, as the instruction is meant to perform zero extension or truncation as necessary on the pointer argument. In this sense, crucible-llvm is more restrictive than LLVM is. Still, it might be better to reject such a program unless someone explicitly asks for it to work.