diffblue / cbmc

C Bounded Model Checker
https://diffblue.github.io/cbmc
Other
829 stars 262 forks source link

[Question] Generate logic equations from C code? #7803

Open PeterMacGonagan opened 1 year ago

PeterMacGonagan commented 1 year ago

Hello,

Is it possible to obtain the logic equations of a C code with CBMC? If so, how? Here's an simple example:

#ifndef CBMC
    #include <stdlib.h>
    #include <assert.h>
    #include <stdio.h>
    #define nondet_int() 0
#else
    //...
#endif

#define BOOL short
#define false 0
#define true 1

int main()
{
    BOOL ok;
    int a, b, c;

#ifdef CBMC
    a = nondet_int();
    b = nondet_int();
#else
    //possible solution...
    a = 3;
    b = 7;
#endif

    c = a + b;
    ok = (c == 10);
    assert(ok == false); //find a condition to prove that c==a+b (can be equal to...)
}

From what I've read, goto-instrument would be able to do this, but I haven't found any more information.

Thank you :)

martin-cs commented 1 year ago

Yes. What format do you want them in?

Human readable : use --show-vcc (I would suggest against trying to parse this) DIMACS : use --dimacs --outfile - SMT2 : use --smt2 --outfile - (If you want to parse the formulae, I would recommend this)

Alternatively, if you want to use the equations directly in your code, you can integrate it as a new back-end and save yourself the work of parsing, building expression representations, etc.

I'm sure you already realise this but just for anyone who finds this in the future, what is generated are the equations describing the unrolled / symbolically executed program to a given limit (--unwind 10 or --depth 20), not the program itself.

PeterMacGonagan commented 1 year ago

Thank you for your answer. I'm sorry but I can't get the SSA using --smt2 or --show-vcc.

I only get these results for show-vcc by example: file cbmcexample.c line 42 function main assertion (signed int)ok == 0 {-1} __CPROVER_dead_object#1 = NULL {-2} __CPROVER_deallocated#1 = NULL {-3} __CPROVER_max_malloc_size#1 = 36028797018963968 {-4} __CPROVER_memory_leak#1 = NULL {-5} __CPROVER_rounding_mode!0#1 = 0 {-6} main::1::a!0@1#2 = nondet_symbol identifier="symex::nondet0" {-7} main::1::b!0@1#2 = nondet_symbol identifier="symex::nondet1" {-8} main::1::cref!0@1#2 = nondet_symbol identifier="symex::nondet2" {-9} main::1::c!0@1#2 = main::1::a!0@1#2 + main::1::b!0@1#2 {-10} main::1::ok!0@1#2 = (main::1::c!0@1#2 = main::1::cref!0@1#2 ? 1 : 0) ├────────────────────────── {1} cast(main::1::ok!0@1#2, signedbv[32]) = 0

Note: it's a little bit modified version of my c code.

martin-cs commented 1 year ago

I feel like I may not understand what you are looking for. What --show-vcc generates is a form of SSA; that's why there are multiple copies of each variable with numbers to distinguish which versions they are. If you have multiple paths you will see the ITEs which are effectively \phi nodes.

When you say "logic equations" -- what format / semantics are you looking for?

PeterMacGonagan commented 1 year ago

My goal: I've got "complicated" C-code and I would like to generate the boolean equations. (or I would like to get the SSA to generate these boolean equations)

By example, I would like to get the boolean equations of an addition of 2 bytes:

A      + B      = R 
8 bits + 8bits  = 8bits + 1bit carry/ignored.

with:

 A= {A7, ... A0}
 B= {B7, ... B0}
 R= {R7, ... R0}

I already know the result (full adder equations):

CIN0 = 0
R0 = CIN0 XOR A0 XOR B0
COUT0 = A0 . B0 + CIN0 . (A0 XOR B0)

CIN1=COUT0
R1 = CIN1 XOR A1 XOR B1
COUT1 = A1 . B1 + CIN1 . (A1 XOR B1)

...

CIN6=COUT5
R6 = CIN6 XOR A6 XOR B6
COUT6 = A6 . B6 + CIN6 . (A6 XOR B6)

CIN7=COUT6
R7 = CIN7 XOR A7 XOR B7
COUT7 = ignored (or overflow)

I suppose that CBMC has these equations just before translate them with Tseytin transformation (I suppose).

martin-cs commented 1 year ago

If you want the equations over bits then --dimacs is what you are looking for and it is given in CNF (in the standard form used by SAT solvers). If you want gates you could recover them from this or you could resurrect the AIG back-end ( removed in https://github.com/diffblue/cbmc/pull/2681 ) and have that dump them out directly.

Does that help?

thomasspriggs commented 1 year ago

The original question has been answered. I am going to close this out, due to a lack of further responses from @PeterMacGonagan Please feel free to re-open, if you wish to discuss this further.

PeterMacGonagan commented 1 year ago

Thank you very much for your answer. I did a lot of research and I tried a lot of softwares: sis, abc, ... Vivaldi, etc. But I don't really find a solution for my 'problematic' (but it was very interesting).

Could you point me where in the code of cbmc the 'SSA' are translated into cnf, please?

Thank you in advance.

martin-cs commented 1 year ago

The conversion code is in src/solvers and there is some documentation for how it works and the key interfaces here : https://github.com/diffblue/cbmc/blob/develop/src/solvers/README.md

src/solvers/flattening and boolbvt are probably a reasonably good place to start looking. You might also be interested in bv_utilst.