Improve symbolic execution performance when running against concrete values

celinval commented 1 year ago

Requested feature: When running Kani assess, many unit tests take a long time to run. Since these tests run with concrete inputs, I would expect the performance to be somewhat similar to MIRI. Use case: Running tests to find undefined behavior + potential improvement to harnesses. Link to relevant documentation (Rust reference, Nomicon, RFC):

Test case:

// TODO

zhassan-aws commented 1 year ago

Here is one such example:

#[kani::proof]
#[kani::unwind(7)]
#[kani::solver(cadical)]
fn main() {
    let s = "Mary had a little lamb";
    let v: Vec<&str> = s.split(' ').collect();
    assert_eq!(v, ["Mary", "had", "a", "little", "lamb"]);
}

This takes ~77 seconds with cadical:

$ kani split.rs
...
Runtime Symex: 21.1936s
size of program expression: 583537 steps
slicing removed 412796 assignments
Generated 39013 VCC(s), 26663 remaining after simplification
Runtime Postprocess Equation: 0.695186s
Passing problem to propositional reduction
converting SSA
Runtime Convert SSA: 8.50542s
Running propositional reduction
Post-processing
Runtime Post-process: 20.7397s
Solving with CaDiCaL sc2021
3893095 variables, 33892927 clauses
SAT checker: instance is SATISFIABLE
Runtime Solver: 32.2184s
Runtime decision procedure: 40.7989s
Running propositional reduction
Solving with CaDiCaL sc2021
3893095 variables, 33892928 clauses
SAT checker: instance is UNSATISFIABLE
Runtime Solver: 0.626649s
Runtime decision procedure: 0.744567s
...
SUMMARY:
 ** 0 of 773 failed (14 unreachable)

VERIFICATION:- SUCCESSFUL
Verification Time: 76.816055s

~150 seconds with kissat:

SUMMARY:
 ** 0 of 773 failed (14 unreachable)

VERIFICATION:- SUCCESSFUL
Verification Time: 151.37282s

and ~635 seconds with minisat:


SUMMARY:
 ** 0 of 773 failed (14 unreachable)

VERIFICATION:- SUCCESSFUL
Verification Time: 634.53174s

MIRI takes 0.2 seconds on this example.

tautschnig commented 1 year ago

Given

Runtime Symex: 21.1936s
size of program expression: 583537 steps

we seem to manage to do ~27.5k (complex) instructions per second, which is certainly not below what we usually see. But: we can and should still seek to improve on that, so I'll happily take up that challenge. (Also, the number of instructions does not seem excessive: perf stat -- ./2363 reports 1540867 concrete machine instructions on my system for this example.

celinval commented 1 year ago

@tautschnig do you know why this is invoking the SAT solver? I don't know much about the symbolic execution engine, but I was hoping that it would be able to solve these cases on its own.

tautschnig commented 1 year ago

@tautschnig do you know why this is invoking the SAT solver? I don't know much about the symbolic execution engine, but I was hoping that it would be able to solve these cases on its own.

That's a good point that needs digging in a bit more: I believe that Kani's reachability assertions will mean we invoke the solver in some way, but then all the queries should be trivial (because the path conditions for those assertions should trivially be TRUE or FALSE, and the solver should have no clauses in the input formula).

zhassan-aws commented 1 year ago

I tried it without reachability checks as well, and it was slightly faster. It also performed a single SAT query:

$ kani split.rs --no-assertion-reach-checks
Runtime Symex: 20.8828s
size of program expression: 579251 steps
slicing removed 412796 assignments
Generated 34727 VCC(s), 22377 remaining after simplification
Runtime Postprocess Equation: 0.672573s
Passing problem to propositional reduction
converting SSA
Runtime Convert SSA: 8.46135s
Running propositional reduction
Post-processing
Runtime Post-process: 20.7269s
Solving with CaDiCaL sc2021
3889146 variables, 33877177 clauses
SAT checker: instance is UNSATISFIABLE
Runtime Solver: 38.9186s
Runtime decision procedure: 47.4509s
...
SUMMARY:
 ** 0 of 773 failed

VERIFICATION:- SUCCESSFUL
Verification Time: 72.50889s

zhassan-aws commented 1 year ago

Disabling all CBMC checks cuts the time in half:

$ kani split.rs --no-assertion-reach-checks --no-default-checks
...
Runtime Symex: 19.7691s
size of program expression: 559970 steps
slicing removed 412673 assignments
Generated 7491 VCC(s), 4215 remaining after simplification
Runtime Postprocess Equation: 0.575882s
Passing problem to propositional reduction
converting SSA
Runtime Convert SSA: 7.36501s
Running propositional reduction
Post-processing
Runtime Post-process: 0.398001s
Solving with CaDiCaL sc2021
3150503 variables, 22299522 clauses
SAT checker: instance is UNSATISFIABLE
Runtime Solver: 0.71818s
Runtime decision procedure: 8.14251s
...
SUMMARY:
 ** 0 of 152 failed

VERIFICATION:- SUCCESSFUL
Verification Time: 32.27915s

tautschnig commented 1 year ago

We need to look into overriding align_offset to always produce zero instead of doing arithmetic over pointers in a way that doesn't make much sense given CBMC does not actually produce a concrete heap layout.

tautschnig commented 1 year ago

With the fix from #2396 the "Verification Time" is now 1.9 seconds. When adding in #2395 the time decreases to 0.9 seconds.

zhassan-aws commented 1 year ago

Another example from #2235:

fn main() {
    let x = String::new().repeat(1);
    assert!(x.chars().nth(1).is_none());
}

This runs for ~~several minutes~~ more than two hours even if I turn off all optional checks --no-default-checks.

zhassan-aws commented 1 year ago

Another case is #2517.

model-checking / kani

Improve symbolic execution performance when running against concrete values #2363