draperlaboratory / cbat_tools

Program analysis tools developed at Draper on the CBAT project.
MIT License
101 stars 14 forks source link

WIP: Smtlib Prelude Injection #340

Open philzook58 opened 3 years ago

philzook58 commented 3 years ago

Initial pass at adding a smtlib prelude. Ready for discussion at least I think. Currently on the test example, it prepends the following string to both the precond and postcond

(define-sort pointer () (_ BitVec 64))
(define-sort memsort () (Array (_ BitVec 64) (_ BitVec 8)))
(define-sort BV8 () (_ BitVec 8))
(define-sort BV16 () (_ BitVec 16))
(define-sort BV32 () (_ BitVec 32))
(define-sort BV64 () (_ BitVec 64))
(define-sort char () (_ BitVec 8))
(define-sort byte () (_ BitVec 8))
(define-sort int16 () (_ BitVec 16))
(define-sort int32 () (_ BitVec 32))
(define-sort int64 () (_ BitVec 64))
(define-fun init-mem-equal () Bool (= init_mem_orig init_mem_mod))
(define-fun mem-equal () Bool (= mem_orig mem_mod))
(define-fun init_main_argc_orig () (_ BitVec 64) init_RDI0)
(define-fun init_main_argv_orig () (_ BitVec 64) init_RSI0)
(define-fun main_argv_orig () (_ BitVec 64) RSI0)
(define-fun main_result_orig () (_ BitVec 64) RAX0)
(define-fun init_main_argc_mod () (_ BitVec 64) init_RDI047)
(define-fun init_main_argv_mod () (_ BitVec 64) init_RSI049)
(define-fun main_argv_mod () (_ BitVec 64) RSI049)
(define-fun main_result_mod () (_ BitVec 64) RAX043)

Note that bap prepends the name of the subroutine to the variable names. Perhaps we want to trim this off. We also want to inject the prelude for single binary mode.

Features that could still be added to the prelude:

(define-fun store8 ((mem memsort) (p pointer) (v ( BitVec 8))) (store mem p v)) (define-fun store16 ((mem memsort) (p pointer) (v ( BitVec 8))) (store8 (store mem p (extract 7 0 v)) (bvadd p (_ bv1 64)) (extract 15 8 v)) )

- struct accessor or offset synonyms. This information is accessible from the Args.t also in the presence of header files.
- pto-stack and pto-heap predicates. We should reuse the functions already in env.ml. It may end up looking something like:

(define-fun pto-stack ((addr pointer)) Bool
(and (bvule addr #x0000000040000000) (bvuge addr #x000000003F800000)))

(define-fun pto-heap ((addr pointer)) Bool (bvule addr (bvsub #x000000003F800000 #x0000000000000100)))

- Maybe generic arg0 arg1 arg2, etc for the typical calling convention of the abi detected.
- Other useful pre-canned predicates 

(define-fun args-equal () Bool (and (= arg0_orig arg0_mod) (= arg1_orig arg1_mod))); and so on (define-fun same-env () Bool (and mem-equal args-equal)) (define-fun retval-equal () Bool (= retval_orig retval_mod))


- We could perhaps replace our current strategy of replacing the _init _orig _mod variables with their actual names using this system as well. I chose not to do this at the moment since it does not enhance anything, although I think it would be a bit more elegant
philzook58 commented 3 years ago
fortunac commented 3 years ago
* The prelude is added if the user uses custom pre/post conds. I am not currently replicating the assumption that registers are equal, in fact I add no assuptions. The user needs to `(assert library-defined-thing)` in order for anything to change at all. I kind of think the default behavior of assuming registers equal should be turned off personally and instead create a library predicate called `init-registers-equal` if you want this behavior.

* Hyphens vs underscores. Lisp convention is hyphens, our convention for init_ mod_ is underscores. That is the difference. I tried to stick to hyphens for anything that wasn't init_ _mod _orig. main_argv is a bap generated name for which I kind of think we might like to strip the `main_`.

* pto is short for "points to" I'm channeling a bit this: https://cvc4.github.io/separation-logic.html  I thought that the original name of `on-stack` was a bit confusing since a local variable is "on the stack" but really the predicate is about pointers that point to the stack. I could expand the name to `points-to-stack`. Perhaps the clarity is worth the verbosity

Cool! This answers all of my questions. I think we should document when the prelude gets added and a list of the pre-defined predicates somewhere.

codyroux commented 2 years ago

I'm good with this, as soon as it's rebased.