More precise handling of stack

To reproduce the results discussed here, use commit c0f3626c60114863b4d23dae53ee9d0f1517971f and programtargets commit 41765a26fd93c58bd4f0c34233c2f2b29ba96f9e.

$ cabal run pate -- -o ~/code/programtargets/test-binaries/test2/orig.exe \
      -p ~/code/programtargets/test-binaries/test2/patch.exe \
      -V Debug --log-file=pate.log --strongestpost

ConcreteAddress 0x10084 observable sequences disagree
Original sequence:
  Syscall At: 0x10098 0x10098: SVC_A1(xxxxxxxx.xxxxxxxx.xxxxxxxx.xxxx1111) cond 14, imm24 0
  0x0:[32]
Patched sequence:
  Syscall At: 0x10098 0x10098: SVC_A1(xxxxxxxx.xxxxxxxx.xxxxxxxx.xxxx1111) cond 14, imm24 0
  0x80000000:[32]
ConcreteAddress 0x100d0 program control flow desynchronized
  Original: 0x0 function return Just (0x100f8,"0x100f8: B_A1(xxxxxxxx.xxxxxxxx.xxxxxxxx.xxxx1010) cond 13, imm24 16777204")
  Patched:  0x100d0 branch Just (0x100f0,"0x100f0: B_A1(xxxxxxxx.xxxxxxxx.xxxxxxxx.xxxx1010) cond 13, imm24 16777206")
Overall verification verdict: Inequivalent

This example involves a simple C program compiled with -O0 (so all locals are spilled into the stack) where we hoist a store out of a loop. The original function is:

int f (int x) {
  int i = 0;
  int v = x;
  int ret = v;
  for(; i<10; i++) {
    x++;
    ret = x;
  }
  ret = x;
  return ret;
}

The patched program simply deletes the ret = x; line from the body of the loop. From examining the program, we can relatively easily see that these stores are redundant (under reasonable assumptions about concurrent actors not reading our local stack).

The SP verifier will cut this program into two chunks because of the loop back-edge: the first (call it chunk 1) starting at the program entry and executing one iteration of the loop before terminating at the loop header (this chunk will not exit and return because it knows the loop variable starts at 0); and the second chunk (chunk 2) starting at the loop header, executing one loop iteration and either returning to the loop header or exiting the loop and returning.

In both chunks we will learn that the slot corresponding to ret will be different along control paths returning into the loop. However, these will be relative to different abstract stack pointers, because each chunk invents its own. Once this happens, the verifier still needs to reach fixpoint, and will reexamine chunk 2. Now, it traverses the loop body, but it now doesn't know how the two abstract stack pointers relate to each other, so it believes that any of the stack slots it accesses in chunk 2 could alias with the ret slot it learned about from the chunk 1. As a result, it decides all the local variables in the function may diverge, including the loop counter i. Once again, we have to reexamine chunk 2, and at this point we cannot guarantee that control flow remains synchronized because we lost track of the loop counter. This results in the "desync" report seen above, where control flow might either branch backward back into the loop, or return from the function.

Because we lost track of the entire stack, the return value is also considered divergent, and this return value eventually flows into a system call, which results in the observable divergence report.

Doing better here will require a more careful tracking of the state of the stack across chunk boundaries.

GaloisInc / pate

More precise handling of stack #278