GaloisInc / saw-script

The SAW scripting language.
BSD 3-Clause "New" or "Revised" License
438 stars 63 forks source link

How to handle flexibly sized structs? #1081

Open RyanGlScott opened 3 years ago

RyanGlScott commented 3 years ago

Consider this program, inspired by libsignal-protocol-c:

#include <stdint.h>
#include <stdlib.h>

struct buffer {
    size_t len;
    uint8_t data[];
};

struct buffer *buffer_alloc(size_t len)
{
    struct buffer *buffer;

    buffer = malloc(sizeof(struct buffer) + (sizeof(uint8_t) * len));
    if(buffer) {
        buffer->len = len;
    }
    return buffer;
}

Is there a way to allow SAW to verify the use of malloc in buffer_alloc? Unlike many other uses of malloc, the amount of memory allocated depends on the value of an argument (len). This was my first attempt at doing so:

let buffer_alloc_spec (len : Int) : CrucibleSetup () = do {
  let lenval = {{ `(len) : [64] }};
  llvm_execute_func [llvm_term lenval];
  buf   <- llvm_alloc (llvm_array (eval_int {{(8 + `(len) : [64])}}) (llvm_int 8));
  datav <- llvm_fresh_var "data" (llvm_array len (llvm_int 8));
  llvm_points_to_untyped buf (llvm_struct_value [llvm_term lenval, llvm_term datav]);
  llvm_return buf;
};

m <- llvm_load_module "buffer.o";
llvm_verify m "buffer_alloc" [] false (buffer_alloc_spec 64) abc;

But that doesn't work:

$ clang -g -c -emit-llvm -o buffer.o buffer.c
$ saw buffer.saw

[17:02:44.326] Loading file "/home/rscott/Documents/Hacking/C/buffer.saw"
[17:02:44.349] Stack trace:
"llvm_verify" (/home/rscott/Documents/Hacking/C/buffer.saw:11:1-11:12):
"buffer_alloc_spec" (/home/rscott/Documents/Hacking/C/buffer.saw:11:40-11:57):
"llvm_points_to" (/home/rscott/Documents/Hacking/C/buffer.saw:6:3-6:17):
types not memory-compatible:
[72 x i8]
{ i64, [64 x i8] }

If I try llvm_points_to_untyped instead of llvm_points_to, I get:

$ saw buffer.saw

[17:02:52.125] Loading file "/home/rscott/Documents/Hacking/C/buffer.saw"
[17:02:52.149] Verifying buffer_alloc ...
[17:02:52.150] Simulating buffer_alloc ...
[17:02:52.150] Stack trace:
"llvm_verify" (/home/rscott/Documents/Hacking/C/buffer.saw:11:1-11:12):
at /home/rscott/Documents/Hacking/C/buffer.saw:6:3
error when loading through pointer that appeared in the override's points-to precondition(s):
Precondition:
  Pointer: concrete pointer: allocation = 6, offset = 0
  Pointee: {let { x@1 = Cryptol.TCNum 64
      }
   in Cryptol.ecNumber x@1 (Prelude.Vec 64 Prelude.Bool)
        (Cryptol.PLiteralSeqBool x@1) : [64], fresh:data#477 : [64][8]}

  Assertion made at: /home/rscott/Documents/Hacking/C/buffer.saw:6:3
Failure reason: 
  When reading through pointer: (6, 0x0:[64])
  in the  postcondition of an override
  Tried to read something of size: 72
  And type: { i64, [64 x i8] }
  Found 1 possibly matching allocation(s):
  - HeapAlloc 6 0x48:[64] Mutable 16-byte-aligned <malloc> buffer.c:13:14

I'm not sure how to figure out how to tell SAW what the appropriate size should be.

(I'm assigning this the "question" label because it's possible that I've simply overlooked something that makes it possible to do this.)

weaversa commented 3 years ago

llvm_points_to_at_type buf (llvm_type "{ i64, [64 x i8]* }") (llvm_struct_value [llvm_term lenval, llvm_term datav]); has the same problems as llvm_points_to_untyped. I haven't figured out a way to get this to work yet.

brianhuffman commented 3 years ago

While checking the postcondition, SAW is trying to read a 72-byte value from the buffer, but the C code has only initialized the first 8 bytes. That's why the read fails.

The error message could definitely be better.

weaversa commented 3 years ago

The call to malloc provides the shape (type) of the data returned. Even if the data isn't initialized (we can't say anything about its contents), it would still be nice to say that the amount of data being pointed to is what we expected. Is there some way to say that with saw?

robdockins commented 3 years ago

That information is captured in the llvm_alloc, I think...

buf   <- llvm_alloc (llvm_array (eval_int {{(8 + `(len) : [64])}}) (llvm_int 8));
RyanGlScott commented 3 years ago

For reference, here is a working version:

let buffer_alloc_spec (len : Int) : CrucibleSetup () = do {
  let lenval = {{ `(len) : [64] }};
  llvm_execute_func [llvm_term lenval];
  buf <- llvm_alloc (llvm_array (eval_int {{(8 + `(len) : [64])}}) (llvm_int 8));
  llvm_points_to_at_type (llvm_elem buf 0) (llvm_int 64) (llvm_term lenval);
  llvm_return buf;
};

m <- llvm_load_module "buffer.o";
llvm_verify m "buffer_alloc" [] false (buffer_alloc_spec 64) abc;
jldodds commented 3 years ago

Ideally the error message would say why the read failed. I think we may have fixed our spec if it read

When reading through pointer Tried to read It was allocated but not initialized

Unless there's something else to take from this issue we can close it and open a new one about the error message

brianhuffman commented 3 years ago

Well, it's a bit more complicated than "allocated but not initialized", because the memory was partially initialized. It seems like we might need to have some kind of structured datatype for memory-read failure reasons, and augment the Crucible-LLVM memory model to return one whenever a read fails.

robdockins commented 3 years ago

We have more detailed accounting for read failures; it's turned off in SAW because it consumes too much memory :-( I'm pretty sure you'd get a much more detailed error message from a similar situation in crux (where the more detailed errors are enabled).

brianhuffman commented 3 years ago

We should probably allow users to turn on detailed read failure messages as an option, and have the default memory read error message suggest turning it on if it's turned off.

robdockins commented 3 years ago

It'd not 100% the same error (it's hard to get C to actually do a bulk load of an array), but here's the output you'd have gotten from crux.

#include <stdint.h>
#include <stdlib.h>
#include <assert.h>

struct buffer {
    size_t len;
    uint8_t data[];
};

struct buffer *buffer_alloc(size_t len)
{
    struct buffer *buffer;

    buffer = malloc(sizeof(struct buffer) + (sizeof(uint8_t) * len));
    if(buffer) {
        buffer->len = len;
    }
    return buffer;
}

int main() {
  struct buffer* buf = buffer_alloc( 64 );

  assert( buf->len == 64 );

  for( int i=0; i<64; i++ ) {
    assert( buf->data[i] != 0 );
  }

}
[Crux] Found counterexample for verification goal
buffer.c:27:5: error: in main
Error during memory load
  No previous write to this location was found
    Attempting load at type: i8
    Via pointer: (10, 0x8:[64])
  Performing overall load at type: i8
    Via pointer: (10, 0x8:[64])
  In memory state:
    Stack frame main
      Allocations:
        HeapAlloc 10 0x48:[64] Mutable 16-byte-aligned <malloc> buffer.c:14:14
      Writes:
        Indexed chunk:
          10 |->   *(10, 0x0:[64]) := 0x40:[64]
    Base memory
      Allocations:
        GlobalAlloc 9 0x12:[64] Immutable 1-byte-aligned [global variable  ] .str.2
        GlobalAlloc 8 0xf:[64] Immutable 1-byte-aligned [global variable  ] .str.1
        GlobalAlloc 7 0x9:[64] Immutable 1-byte-aligned [global variable  ] .str
        GlobalAlloc 6 0x5:[64] Immutable 1-byte-aligned [global variable  ] __func__.main
        GlobalAlloc 5 0x0:[64] Immutable 1-byte-aligned [defined function ] main
        GlobalAlloc 4 0x0:[64] Immutable 1-byte-aligned [defined function ] buffer_alloc
        GlobalAlloc 3 0x0:[64] Immutable 1-byte-aligned [external function] __assert_rtn
        GlobalAlloc 2 0x0:[64] Immutable 1-byte-aligned [external function] malloc
        GlobalAlloc 1 0x0:[64] Immutable 1-byte-aligned [external function] llvm.dbg.value
      Writes:
        Indexed chunk:
          6 |->   *(6, 0x0:[64]) := "main\NUL"
          7 |->   *(7, 0x0:[64]) := "buffer.c\NUL"
          8 |->   *(8, 0x0:[64]) := "buf->len == 64\NUL"
          9 |->   *(9, 0x0:[64]) := "buf->data[i] != 0\NUL"