diffblue / cbmc

C Bounded Model Checker
https://diffblue.github.io/cbmc
Other
847 stars 262 forks source link

`natural_loop`s not computed correctly in presence of function pointers [blocks #5530] #6168

Closed SaswatPadhi closed 3 years ago

SaswatPadhi commented 3 years ago

CBMC version: 556b4325 Operating system: Mac OS 10.15.7

Test case:

#include <assert.h>
#include <stdlib.h>

static int adder(const int *a, const int *b)
{
  return (*a + *b);
}

int main()
{
  int x = 1024;

  int (*local_adder)(const int *, const int *) = adder;

  while(x > 0)
  __CPROVER_loop_invariant(1 == 1)
  {
    x += local_adder(&x, &x); // loop detection fails
    //x += adder(&x, &x);       // works fine
  }
}

Exact command line resulting in the issue:

$ goto-cc test.c -o test.gb
$ goto-instrument --enforce-all-contracts test.gb test.2.gb
$ cbmc test.2.gb

What behaviour did you expect:

The loop invariant would be used and unwinding wouldn't be necessary.

What happened instead:

The loop invariant is not used because the loop in main is not recognized.

Note that if we use adder in place of local_adder within the loop, then the loop is detected and the loop invariant is processed as expected.

Additional context:

If we use adder in place of local_adder above, i.e. call the function directly as opposed to using a function pointer, then loop is detected and invariant is processed as expected.

The following instrumentation code:

for(auto &goto_function : goto_model.goto_functions.function_map) {
  natural_loops_mutablet natural_loops(goto_function.second.body);
  log.warning() << goto_function.first.c_str() << " has "
                << natural_loops.loop_map.size() << " loops."
                << messaget::eom;
}

produces the following output when using local_adder function pointer

Reading GOTO program from 'test.gb'
main has 0 loops.
adder has 0 loops.
__CPROVER__start has 0 loops.
__CPROVER_initialize has 0 loops.
Writing GOTO program to 'test.2.gb'

and produces the following output when using adder directly

Reading GOTO program from 'test.gb'
main has 1 loops.
adder has 0 loops.
__CPROVER__start has 0 loops.
__CPROVER_initialize has 0 loops.
Writing GOTO program to 'test.2.gb'
SaswatPadhi commented 3 years ago

cc: @feliperodri @tautschnig

feliperodri commented 3 years ago

cc. @chrisr-diffblue

martin-cs commented 3 years ago

That's horrible :-\

Conjecture : your problem is here:

https://github.com/diffblue/cbmc/blob/0ae6010a4e3977e591d9f4dd6029f6cfc6dc4508/src/goto-programs/cfg.h#L397

cfgt is called by cfg_dominatorst which is called by natural_loops. It fails to detect an edge and thus it is not a loop.

Fixing it at the source is ... going to be unpleasant, ironically for the same reason that this bug happened in the first place.

It is not clear what the "correct" or "normal" form(s) for `goto_modelt` are. Various passes make assumptions about what other passes have or haven't been run. These are not documented or enforced and mainly result in wrong results rather than errors. Cleaning this up is going to be a big job. The changes to `process_goto_program` were to try to normalise this a bit but did result in a couple of bits of weird behaviour. I haven't started on `goto-instrument` yet because it probably has the largest amount of 'variety' in what does and doesn't get run. Like fixing how function pointers are handled, it's on my TODO list but unless someone wants to pay me, it's not going to happen immediately.

My suggested work-around is to insert calls to goto_instrument_parse_optionst::do_indirect_call_and_rtti_removal before the natural loop and contract code.

tautschnig commented 3 years ago

Joining this conversation with some delay. @martin-cs Many thanks for root causing this! While I agree that "normal" form(s) aren't properly defined for now, we should review why cfg_dominatorst even is using an inter-procedural CFG. I would like to think that procedure-local CFGs should be good enough for both dominators and natural loops? I'll take a proper look at the code.

martin-cs commented 3 years ago

My guess (not recalling that I had worked on this!) is that the edge computation is general purpose and interprocedural and used for many things and natural loop computation just uses that without really thinking about the consequences.

NlightNFotis commented 3 years ago

It looks like the issue should be worked around using @martin-cs' suggestion after the merge of #6340 .

Let us know if this has been fixed adequately for your use case, or if we need to be doing something more.

jimgrundy commented 3 years ago

@SaswatPadhi, can you confirm this works now. And, if so, close the ticket.

SaswatPadhi commented 3 years ago

The fix seems to work for the loop contracts I added to s2n_set_binary_search, so I am closing this issue now.

Thanks @NlightNFotis and @martin-cs!