BinaryAnalysisPlatform / bap

Binary Analysis Platform
MIT License
2.05k stars 273 forks source link

Division by zero by Bil.Apply.binop returns result instead of failing #1029

Closed mjambon closed 4 years ago

mjambon commented 4 years ago

With bap 2.0, Bil.Apply.binop uses an implementation of division and remainder that no longer raises the exception Division_by_zero but returns a result.

open Bap.Std

let zero32 = Word.zero 32
let one32 = Word.one 32

let div1 () =
  Bil.Apply.binop Bil.DIVIDE zero32 one32

let div0 () =
  Bil.Apply.binop Bil.DIVIDE one32 zero32

let mod0 () =
  Bil.Apply.binop Bil.MOD one32 zero32

Of the functions above, div0 and mod0 are expected to signal an error via an exception or something, but we get this:

utop # div0 ();;
- : word = 0xFFFFFFFF
utop # mod0 ();;
- : word = 1

Bil.Apply.binop uses the Bitvec library and this is its documented behavior. Is it intentional, and if so, is there a better way to work around this than checking the value of the second operand ourselves?

ivg commented 4 years ago

Division by zero is a well-defined operation in the Core Theory. We followed Z3 and other SMTLIB implementations, which fixed the semantics of the division operation and came to the common ground solution (to which we adhered).

Please note, that this change doesn't change the semantics of programs, the lifters will still reflect the program semantics correctly, e.g., a division by zero instruction will raise a CPU exception. Only the semantics of bitvector operations is changed (which now relies on a new and more efficient Bitvec library).

mjambon commented 4 years ago

Thank you. So, we're going to wrap our call to Bil.Apply.binop so as to check for division by zero. I would suggest adding a note to the documentation for this function, specifying what semantics it implements and that it started with bap 2.0. An example of how to deal with division by zero would be ideal.

ivg commented 4 years ago

So, we're going to wrap our call to Bil.Apply.binop so as to check for division by zero.

Are you sure it is a good idea? First of all, the Division_by_zero was an undocumented side-effect, so it is strange that you were relying on it, if you were. The BIL semantics was never saying anything about division by zero, especially since this is the meta-language exception, not the BIL exception.

To summarizes, the failure of div0 () and mod0 () was a bug in the old versions of BAP, and reimplementation of a bug doesn't sound like a good idea to me :)

Concerning the semantics, it is preserved. The x86/arm or any other code will still behave the same. The Primus Machine will still raise the exception when the division by zero occurs. So nothing has changed.

Finally, the new semantics is synchronized with the Z3 semantics and semantics of other smtlib-compatible solvers, specifically to reduce the impedance mismatch between SMTLIB and BIL. So that the glue code could be simplified and optimized. In general, the semantics of the Bitvec libary closely follows the semantics of BV theory in SMTLIB, unlike the semantics of Zarith library.

Therefore, trying to restore the old behavior would be just counterproductive and error-prone. I would suggest you just to drop the offending tests, as they just encode an old buggy behavior for no good reasons.

mjambon commented 4 years ago

We use the Bil.Apply.* functions in our code to enumerate values that can be produced by all possible instantiations of variables. For example, our code would try to produce up to 256 values for the expression x / (x - y) (8-bit variables). Instantiations such as {x=42, y=42} are invalid since they lead to a division by zero. We clearly don't want a bogus value like 0xff when evaluating 42 / (42 - 42), we want to skip it.

mjambon commented 4 years ago

Note that the current behavior seems perfectly reasonable to me, and we have our own uncomplicated workaround. Seeing a note in the description of Bil.Apply.binop would have saved time.

ivg commented 4 years ago

We clearly don't want a bogus value like 0xff when evaluating 42 / (42 - 42), we want to skip it.

Well, this bogus value would be the value that z3 and any other modern solver will provide you for this expression, e.g.,

(set-logic QF_BV)
(set-option :produce-models true)

(declare-const x (_ BitVec 8))
(declare-const y (_ BitVec 8))
(assert (= x y))
(assert (= #xFF (bvudiv x (bvsub x y))))
(check-sat)
(get-model)
(exit)

gives us

z3 -smt2 division-by-zero.lis
sat
(model 
  (define-fun y () (_ BitVec 8)
    #x00)
  (define-fun x () (_ BitVec 8)
    #x00)
)

That's just to warn you, that your workaround may just hide the problem, you will be still getting this model from z3 :)

Seeing a note in the description of Bil.Apply.binop would have saved time.

Definitely, and thanks for raising this issue, now it is at least documented in the issue tracker. I will make sure that this is communicated through documentation. It's just we changed so long time ago (like two years ago), that I really forgot about it.

mjambon commented 4 years ago

Ah, yes, that's a great point. I'll keep it in mind. We might have to sprinkle constraints like denominator != 0 in some places.

ivg commented 4 years ago

FYI, the BIL code produced by a lifter already contains this checks,


$ bap-mc 'f7 7d f8' --show-bil --show-insn=asm
idivl -0x8(%rbp)
{
  #2 := extend:64[mem[RBP - 8, el]:u32]
  #1 := low:32[RDX].low:32[RAX]
  if (#2 = 0) {
    cpuexn (0)
  }
  else {
    #3 := #1 /$ #2
    #4 := #1 %$ #2
    if (0x7FFFFFFFFFFFFFFF <$ #3 | #3 <$ 0x8000000000000000) {
      cpuexn (0)
    }
    else {
      #5 := low:32[#4].low:32[#3]
      RAX := pad:64[extract:31:0[#5]]
      RDX := pad:64[extract:63:32[#5]]
    }
  }
  CF := unknown[bits]:u1
  OF := unknown[bits]:u1
  SF := unknown[bits]:u1
  ZF := unknown[bits]:u1
  AF := unknown[bits]:u1
  PF := unknown[bits]:u1
}

So theoretically, you shall never ever be able to get a division by zero if you execute code from the lifters.