GaloisInc / crucible

Crucible is a library for symbolic simulation of imperative programs
677 stars 44 forks source link

`lax-loads-and-stores` fails with `Undefined comparison between pointer and integer` #846

Open RyanGlScott opened 3 years ago

RyanGlScott commented 3 years ago

The aws-c-common/aws_hash_iter_delete_harness.i SV-COMP benchmark fails in the unreach-call category. Here is a minimal example:

// test.c
#include <stdlib.h>

typedef void(aws_hash_callback_destroy_fn)(void *key_or_value);
struct hash_table_state {
  aws_hash_callback_destroy_fn *destroy_key_fn;
};

void hash_proof_destroy_noop(void *p) {}

int main() {
  struct hash_table_state *hts = malloc(sizeof (struct hash_table_state));
  return hts->destroy_key_fn == hash_proof_destroy_noop;
}
-- test.config
lax-loads-and-stores: yes
$ crux-llvm test.c --config test.config
[Crux] Using pointer width: 64 for file crux-build/crux~test.bc
[Crux] Simulating function main
[Crux] Attempting to prove verification conditions.
[Crux] *** debug executable: results/test/debug-13
[Crux] *** break on line: 13
[Crux] Found counterexample for verification goal
[Crux]   test.c:13:30: error: in main
[Crux]   Undefined comparison between pointer and integer
[Crux] Goal status:
[Crux]   Total: 1
[Crux]   Proved: 0
[Crux]   Disproved: 1
[Crux]   Incomplete: 0
[Crux]   Unknown: 0
[Crux] Overall status: Invalid.

I've estimated that this is responsible for 320 of crux-llvm's failures in the unreach-call category.

robdockins commented 3 years ago

Ugh. More uninitialized data nonsense.

robdockins commented 3 years ago

The question of pointer equality is absolutely fraught with danger. The C spec indicates that it is UB to compare pointers two pointer values unless both are live pointers (or NULL). Pointers can even compare different if their bit patterns are equal (!) if they come from different allocations (EDIT actually, IIUC, it looks like C11 clarified this point and makes it so this shouldn't happen any more).

I really don't know what to do with this one.

travitch commented 3 years ago

We already added the dirty laxPointerOrdering option (https://github.com/GaloisInc/crucible/blob/master/crucible-llvm/src/Lang/Crucible/LLVM/MemModel/Options.hs#L25) - is this case any worse?

robdockins commented 3 years ago

On a closer reading, I think C11 has weaker restrictions than I remember. I don't know if I am recalling incorrectly, or if this changed from C99. For reference, the C11 says (section 6.5.9)

Para 2:

One of the following shall hold:
— both operands have arithmetic type;
— both operands are pointers to qualified or unqualified versions of compatible types;
— one operand is a pointer to an object type and the other is a pointer to a qualified or unqualified version of void; or
— one operand is a pointer and the other is a null pointer constant.

Para 3:

The == (equal to) and != (not equal to) operators are analogous to the relational operators except for their lower
precedence.108) Each of the operators yields 1 if the specified relation is true and 0 if it is false. The result has type
int. For any pair of operands, exactly one of the relations is true.

I read the final clause as saying that the situation in hand isn't actually undefined.

robdockins commented 3 years ago

We already added the dirty laxPointerOrdering option (https://github.com/GaloisInc/crucible/blob/master/crucible-llvm/src/Lang/Crucible/LLVM/MemModel/Options.hs#L25) - is this case any worse?

I think it is worse, but only because pointer equality tests happen a lot more than pointer order tests. It interacts very oddly with magic uninitialized data also, which (for now) we have decided always results in non-pointer bitvectors.

If we just did the obvious thing (pointers are different from nonzero bitvectors), we would always interpret this pointer equality test as false, which is probably unsound.

robdockins commented 3 years ago

To really do this right, I think we'll have to maintain an (abstract) mapping from block allocation numbers into the address space which would let us assign actual bitpatterns to each pointer value. Then, we'd have to assert, for every pair of objects that are live at the same time, that their ranges don't overlap. This means that allocation can't be a total function anymore.

On the plus side, we could actually handle (in principle) any kinds of nasty pointer bit-twiddly tricks people want to do and reason soundly up to all arrangements of allocation blocks in memory... on the negative side, it would be a complicated undertaking and would mostly produce a LOT of bookeeping assertions that would make solving a lot harder.