runtimeverification / wasm-semantics

A Prototype Formal Semantics of WebAssembly in K
Other
78 stars 23 forks source link

Machine integers or "True" integers? #184

Open MrChico opened 5 years ago

MrChico commented 5 years ago

In K's builtin domains.k library we find a suite of functions implementing machine integers of arbitrary bit width represented in 2's complement. For example:

  /*@
   * Addition, subtraction, and multiplication are the same for signed and
   * unsigned integers represented in 2's complement
   */
  syntax MInt ::= addMInt(MInt, MInt)   [function, hook(MINT.add), smt-hook(bvadd)]
                | subMInt(MInt, MInt)   [function, hook(MINT.sub), smt-hook(bvsub)]
                | mulMInt(MInt, MInt)   [function, hook(MINT.mul), smt-hook(bvmul)]

Currently, this repo is not using these builtins, but rather, similar to evm-semantics, uses "True" integer representations + modulo operations.

I'm curious what the best approach is here. In writing proofs of EVM programs, a lot of effort goes in into writing lemmas that deals with quite simple arithmetic, which can otherwise be proven directly using the right SMT definitions. On the other hand, SMT theories dealing with large bitvectors (such as the 256-bit words of the EVM) quickly start being more difficult to reason with than modulo integer arithmetic.

There might also be speedups of concrete executions to consider in this choice. I expect the MINT library to be faster than custom made extra rules.

I'm tentatively in favor of replacing the custom rules dealing with machine integers in numeric.md in favor of the MINT module of domains.k, but I'm curious of others people's thoughts here. @ehildenb @hjorthjort @dwightguth

ehildenb commented 5 years ago

The MInt module hasn't been tested much in the verification setting. Certainly we shouldn't try with KWasm first, instead should try with KEVM and check that the same proofs go through.

ehildenb commented 5 years ago

Although, note that KWasm basically gives a specialization of MInt for the i32 and i64 case. Instead of storing the width, we store the type, which I would argue is a better approach overall. We can also hook it to the correct SMT-lib libraries if we want bit-vector reasoning, though my understanding was that bit-vector reasoning is usually slower, esp if in normal operation you're usually doing arithmetic that doesn't overflow, and only occasionally do something where the overflow conditions kick in.