CBMC keeps running forever when there is a loop without a bound

feliperodri commented 1 year ago

CBMC version: N/A. Operating system: N/A.

// test.c
int main()
{
  int a;
  for (int i = 0; i < a; i++)
    {
      a++;
    }

}

Exact command line resulting in the issue: cbmc test.c What behaviour did you expect: CBMC would throw and error for loops without a bound instead of keeping running forever. What happened instead: CBMC keeps running forever.

tautschnig commented 1 year ago

I am wondering what the "bug" here is. Is it that the above loop is bounded, it's just that the bound is INT_MAX, and CBMC will still continue beyond this bound? (Might be easier to experiment with when using char instead of int.)

feliperodri commented 1 year ago

Instead of using the *_MAX for a type, why not provide an error back to the user? I think it's more painful to get a CBMC run that goes on forever, than an error message for a loop where CBMC would add an arbitrary large bound.

tautschnig commented 1 year ago

What are the exact conditions under which CBMC should produce an error message? The question whether a loop is (un)bounded is undecidable itself.

feliperodri commented 1 year ago

In the unwindsett class, we consider 3 limits:

class unwindsett
{
public:
  // We have
  // 1) a global limit
  // 2) a limit per loop, all threads
  // 3) a limit for a particular thread.
  // We use the most specific of the above.

I'd expect that if none of them are provided, instead of considering a global limit of *_MAX, we could report error. Would that avoid making CBMC "running forever" when trying to unwind a loop?

feliperodri commented 1 year ago

@zhassan-aws

zhassan-aws commented 1 year ago

IMHO, the case that is more relevant is that in which loops have bounds, but running CBMC without --uwnind causes it to run forever, even though we can get a "full" proof (i.e. the unwinding assertions pass) with an --unwind value specified, e.g.:

#include <assert.h>

int main() {
    int coin;
    int y = coin ? 10 : 20;
    int sum = 0;
    int i = 1;
    while (sum < y) {
        sum += i;
        ++i;
    }
    assert(sum == 10 || sum == 21);
    return 0;
}

$ cbmc iter.c
CBMC version 5.71.0 (cbmc-5.71.0) 64-bit x86_64 linux
Parsing iter.c
Converting
Type-checking iter
Generating GOTO Program
Adding CPROVER library (x86_64)
Removal of function pointers and virtual functions
Generic Property Instrumentation
Running with 8 object bits, 56 offset bits (default)
Starting Bounded Model Checking
Unwinding loop main.0 iteration 1 file iter.c line 8 function main thread 0
Unwinding loop main.0 iteration 2 file iter.c line 8 function main thread 0
Unwinding loop main.0 iteration 3 file iter.c line 8 function main thread 0
Unwinding loop main.0 iteration 4 file iter.c line 8 function main thread 0
...
Unwinding loop main.0 iteration 2077 file iter.c line 8 function main thread 0
Unwinding loop main.0 iteration 2078 file iter.c line 8 function main thread 0
Unwinding loop main.0 iteration 2079 file iter.c line 8 function main thread 0
Unwinding loop main.0 iteration 2080 file iter.c line 8 function main thread 0

whereas if we run it with --unwind 7 we get a full proof:

$ cbmc iter.c --unwind 7 --unwinding-assertions
CBMC version 5.71.0 (cbmc-5.71.0) 64-bit x86_64 linux
Parsing iter.c
Converting
Type-checking iter
Generating GOTO Program
Adding CPROVER library (x86_64)
Removal of function pointers and virtual functions
Generic Property Instrumentation
Running with 8 object bits, 56 offset bits (default)
Starting Bounded Model Checking
Unwinding loop main.0 iteration 1 file iter.c line 8 function main thread 0
Unwinding loop main.0 iteration 2 file iter.c line 8 function main thread 0
Unwinding loop main.0 iteration 3 file iter.c line 8 function main thread 0
Unwinding loop main.0 iteration 4 file iter.c line 8 function main thread 0
Unwinding loop main.0 iteration 5 file iter.c line 8 function main thread 0
Unwinding loop main.0 iteration 6 file iter.c line 8 function main thread 0
Not unwinding loop main.0 iteration 7 file iter.c line 8 function main thread 0
Runtime Symex: 0.00179222s
size of program expression: 92 steps
simple slicing removed 5 assignments
Generated 2 VCC(s), 2 remaining after simplification
Runtime Postprocess Equation: 1.065e-05s
Passing problem to propositional reduction
converting SSA
Runtime Convert SSA: 0.000384178s
Running propositional reduction
Post-processing
Runtime Post-process: 2.4181e-05s
Solving with MiniSAT 2.2.1 with simplifier
443 variables, 315 clauses
SAT checker: instance is UNSATISFIABLE
Runtime Solver: 0.000271092s
Runtime decision procedure: 0.000719273s

** Results:
iter.c function main
[main.unwind.0] line 8 unwinding assertion loop 0: SUCCESS
[main.assertion.1] line 12 assertion sum == 10 || sum == 21: SUCCESS

** 0 of 2 failed (1 iterations)
VERIFICATION SUCCESSFUL

This is a case that would benefit from doing incremental unwinding by default.

tautschnig commented 1 year ago

I do wonder how many cases we might be able to catch if we kept around an interval abstract domain element for each non-constant integer. The above example is not the first of such a kind.

remi-delmas-3000 commented 1 year ago

How about adding a mode of operation where symex would interleave satisfiability checks and unwinding ?

martin-cs commented 1 year ago

@tautschnig : we can do an arbitrary amount of compute during unwinding, I guess the question is how much is useful beyond what constant folding / expression forwarding in symex.

/unhelpful/ We could just replace the pointer analysis with a pass with VSD which would give you this too! /unhelpful/

martin-cs commented 1 year ago

@remi-delmas-3000 : in principle it is a good idea, this is what the incremental modes do. The problem is that if your program isn't:

doStuff();
while (condition) {
  everythingImportant();
}

then you have to do the symex phase because of things like:

int p = &a;
for (int i = 0; i < BIG_COUNTER; ++i) {
  whatever();
  if (i == ARBITRARY) {
    p = &b;
  }
}

while (*p != '\0') {
  doStuff(*p);
  ++p;
}

Unrolling loops in the middle of a program may change the SSA in an awkward way. This means that you can't reuse the solver and learnt information.

With the current symex engine it would be need to be implemented at quite a high level of control. It would work but might be unpleasantly slow for a default use. A tighter and more elegant implementation is possible but requires ... some big changes to symex.

diffblue / cbmc

CBMC keeps running forever when there is a loop without a bound #7406