GaloisInc / pate

Patches Assured up to Trace Equivalence
Other
14 stars 3 forks source link

Add a test suite for the Strongest Postcondition verifier #279

Closed travitch closed 2 years ago

travitch commented 2 years ago

Dan: To track which events are observable, the suggested approach was to define a data region which represents "observable global memory", and mark which tests are expected to have observable differences with respect to this region, and which are observably equivalent.

robdockins commented 2 years ago

While developing the SP verifier, there were a few test binaries I developed that should probably be included in the test suite. These can be found in the programtargets repository in the test-binaries directory. The results listed below are current as of 0d80f8dbe336d47a056a83567de1fe0ceae9b741.

$ cabal run pate -- -o ~/code/programtargets/test-binaries/test1/orig.exe -p ~/code/programtargets/test-binaries/test1/patch.exe -V Debug --log-file=pate.log --strongestpost

ConcreteAddress (segment1+0x0) observable sequences disagree
Original sequence:
  Syscall At: segment1+0xc segment1+0xc: SVC_A1(xxxxxxxx.xxxxxxxx.xxxxxxxx.xxxx1111) cond 14, imm24 0
  0x2a:[32]
Patched sequence:
  Syscall At: segment1+0xc segment1+0xc: SVC_A1(xxxxxxxx.xxxxxxxx.xxxxxxxx.xxxx1111) cond 14, imm24 0
  0x22b:[32]
Overall verification verdict: Inequivalent

This is the expected result.

$ cabal run pate -- -o ~/code/programtargets/test-binaries/test2/orig.exe \
      -p ~/code/programtargets/test-binaries/test2/patch.exe \
      -V Debug --log-file=pate.log --strongestpost

ConcreteAddress 0x100ac program control flow desynchronized
  Original: 0x100d0 branch Just (0x100f8,"0x100f8: B_A1(xxxxxxxx.xxxxxxxx.xxxxxxxx.xxxx1010) cond 13, imm24 16777204")
  Patched:  0x100d0 branch Just (0x100f0,"0x100f0: B_A1(xxxxxxxx.xxxxxxxx.xxxxxxxx.xxxx1010) cond 13, imm24 16777206")
Overall verification verdict: Inequivalent
cabal run pate -- -o ~/code/programtargets/test-binaries/test3/orig.exe \
      -p ~/code/programtargets/test-binaries/test3/patch.exe \
      -b ~/code/programtargets/test-binaries/test3/test3.toml \
      -V Debug --log-file=pate.log --strongestpost

ConcreteAddress 0x100d4 observable sequences disagree
Original sequence:
  Write 0x2012c:[32] <- 0x1:[32]
Patched sequence:
ConcreteAddress 0x100d4 program control flow desynchronized
  Original: 0x100e8 branch Just (0x10114,"0x10114: B_A1(xxxxxxxx.xxxxxxxx.xxxxxxxx.xxxx1010) cond 13, imm24 16777203")
  Patched:  0x100e8 branch Just (0x10108,"0x10108: B_A1(xxxxxxxx.xxxxxxxx.xxxxxxxx.xxxx1010) cond 13, imm24 16777206")
Overall verification verdict: Inequivalent

The observable sequence disagreement is expected (given that we declared the global variable to be observable memory), but the control-flow desync is not.

robdockins commented 2 years ago

If we back up to commit c0f3626c60114863b4d23dae53ee9d0f1517971f, we get results from test2 that I remember, which is a result of the stack handling problems I mentioned:

$ cabal run pate -- -o ~/code/programtargets/test-binaries/test2/orig.exe \
      -p ~/code/programtargets/test-binaries/test2/patch.exe \
      -V Debug --log-file=pate.log --strongestpost

ConcreteAddress 0x10084 observable sequences disagree
Original sequence:
  Syscall At: 0x10098 0x10098: SVC_A1(xxxxxxxx.xxxxxxxx.xxxxxxxx.xxxx1111) cond 14, imm24 0
  0x0:[32]
Patched sequence:
  Syscall At: 0x10098 0x10098: SVC_A1(xxxxxxxx.xxxxxxxx.xxxxxxxx.xxxx1111) cond 14, imm24 0
  0x80000000:[32]
ConcreteAddress 0x100d0 program control flow desynchronized
  Original: 0x0 function return Just (0x100f8,"0x100f8: B_A1(xxxxxxxx.xxxxxxxx.xxxxxxxx.xxxx1010) cond 13, imm24 16777204")
  Patched:  0x100d0 branch Just (0x100f0,"0x100f0: B_A1(xxxxxxxx.xxxxxxxx.xxxxxxxx.xxxx1010) cond 13, imm24 16777206")
Overall verification verdict: Inequivalent

After debugging this for awhile, I determined that the desync here is due to the verifier deciding that the loop counter may differ between the two runs; this in turn happens because the verifier doesn't understand that the local storage for the hoisted temporary variable doesn't alias with the local storage for the loop counter.

robdockins commented 2 years ago

For test3, if we back up to c0f3626c60114863b4d23dae53ee9d0f1517971f, we get results I expect, which show the distinguished global variable (which has been configured as observable) written at different times in the two programs.

$ cabal run pate -- -o ~/code/programtargets/test-binaries/test3/orig.exe \
      -p ~/code/programtargets/test-binaries/test3/patch.exe \
      -b ~/code/programtargets/test-binaries/test3/test3.toml \
      -V Debug --log-file=pate.log --strongestpost

ConcreteAddress 0x100d4 observable sequences disagree
Original sequence:
  Write 0x2012c:[32] <- 0x1:[32]
Patched sequence:
ConcreteAddress 0x100e8 observable sequences disagree
Original sequence:
  Write 0x2012c:[32] <- 0x1:[32]
Patched sequence:
Overall verification verdict: Inequivalent
robdockins commented 2 years ago

Overall, it seems like #271 actually caused a regression in control-flow recognition for these programs.

travitch commented 2 years ago

Based on your experience trying to debug these desync issues - is there any tooling we need to build to make that easier to track down?

robdockins commented 2 years ago

Based on your experience trying to debug these desync issues - is there any tooling we need to build to make that easier to track down?

When trying to figure out what was going on with test2, I spent a fair amount of time looking at the way that the abstract domains for the loop header changed over the run. Some way to get targeted reporting on points of interest would probably be helpful; I spent a lot of time recompiling to add/remove log file entries and scrolling through information I didn't care about. I also had to spend a lot of time scratching my head to figure out where the local variables I was interested in corresponded to stack slots. I guess that information could be included in DWARF metadata, but I don't know how to surface that to the user in a way that would have helped with my debugging.

robdockins commented 2 years ago

CF #278 for detailed post-mortem of the test2 example.