`--object-bits` and `--unwind` flag missing at checking safety properties and coverage step

model-checking / cbmc-starter-kit

The CBMC starter kit makes it easy to add CBMC verification to a software project.

https://model-checking.github.io/cbmc-starter-kit/

MIT No Attribution

44 stars 21 forks source link

`--object-bits` and `--unwind` flag missing at checking safety properties and coverage step #209

Open QinyuanWu opened 3 months ago

QinyuanWu commented 3 months ago

I have an issue with the make pipeline for running a single proof: although I've specified CBMC_OBJECT_BITS in the makefile but it did not get translated into the --object-bits flag during the pipeline and the checking safety properties step fails due to insufficient object bits. The --unwind flag is also missing when I turn on --dfcc and --apply-loop-contract. Is this a bug with the pipeline or am I missing something? The same issue happens at the calculating coverage step. Appreciate any guidance.

[14/16] SymCryptMd2: checking safety properties                                                                                                                    FAILED: /home/jiggly/SymCrypt-CBMC/CBMC/proofs/SymCryptMd2/logs/result.xml /tmp/litani/runs/b68774f7-53d7-40d9-a60e-2ebbde6cc678/status/3bc78467-10f9-4ad0-8fe4-e60b42f060dc.json 

/usr/libexec/litani/litani exec --command 'cbmc  --flush --unwinding-assertions  --bounds-check           --trace --xml-ui /home/jiggly/SymCrypt-CBMC/CBMC/proofs/SymCryptMd2/gotos/SymCryptMd2_harness.goto' --pipeline-name SymCryptMd2 --ci-stage test --job-id 3bc78467-10f9-4ad0-8fe4-e60b42f060dc --stdout-file /home/jiggly/SymCrypt-CBMC/CBMC/proofs/SymCryptMd2/logs/result.xml --stderr-file /home/jiggly/SymCrypt-CBMC/CBMC/proofs/SymCryptMd2/logs/result-err-log.txt --description 'SymCryptMd2: checking safety properties' --timeout 21600 --status-file /tmp/litani/runs/b68774f7-53d7-40d9-a60e-2ebbde6cc678/status/3bc78467-10f9-4ad0-8fe4-e60b42f060dc.json --profile-memory-interval 10 --inputs /home/jiggly/SymCrypt-CBMC/CBMC/proofs/SymCryptMd2/gotos/SymCryptMd2_harness.goto --outputs /home/jiggly/SymCrypt-CBMC/CBMC/proofs/SymCryptMd2/logs/result.xml --ignore-returns 10 --tags 'stats-group:safety checks'

[15/16] SymCryptMd2: calculating coverage                                                                                                                                                                                                                                                      FAILED: /home/jiggly/SymCrypt-CBMC/CBMC/proofs/SymCryptMd2/logs/coverage.xml /tmp/litani/runs/e72e57eb-8aa7-4fbd-af07-2c66148fab37/status/47a40d29-4740-432d-94d1-b134df518d55.json 

/usr/libexec/litani/litani exec --command 'cbmc  --flush  --cover location --xml-ui /home/jiggly/SymCrypt-CBMC/CBMC/proofs/SymCryptMd2/gotos/SymCryptMd2_harness.goto' --pipeline-name SymCryptMd2 --ci-stage test --job-id 47a40d29-4740-432d-94d1-b134df518d55 --stdout-file /home/jiggly/SymCrypt-CBMC/CBMC/proofs/SymCryptMd2/logs/coverage.xml --stderr-file /home/jiggly/SymCrypt-CBMC/CBMC/proofs/SymCryptMd2/logs/coverage-err-log.txt --description 'SymCryptMd2: calculating coverage' --timeout 21600 --status-file /tmp/litani/runs/e72e57eb-8aa7-4fbd-af07-2c66148fab37/status/47a40d29-4740-432d-94d1-b134df518d55.json --profile-memory-interval 10 --inputs /home/jiggly/SymCrypt-CBMC/CBMC/proofs/SymCryptMd2/gotos/SymCryptMd2_harness.goto --outputs /home/jiggly/SymCrypt-CBMC/CBMC/proofs/SymCryptMd2/logs/coverage.xml --ignore-returns 10 --tags 'stats-group:coverage computation'

Makefile for SymCryptMd2, Makefile-project-define CBMC version: 6.0.1 Running platform: WSL ubuntu

QinyuanWu commented 3 months ago

I saw the following comment in the Makefile:

# The default unwind should only be used in DFCC mode without loop contracts.
# When loop contracts are applied, we only unwind specified loops.
# If any loops remain after loop contracts have been applied, CBMC might try
# to unwind the program indefinetly, because we do not pass default unwind
# (i.e., --unwind 1) to CBMC when in DFCC mode.
# We must not use a default unwind command in DFCC mode, because contract instrumentation
# introduces loops encoding write set inclusion checks that must be dynamically unwound during
# symex.
ifneq ($(strip $(USE_DYNAMIC_FRAMES)),)
ifneq ($(strip $(APPLY_LOOP_CONTRACTS)),)
  UNWIND_0500_FLAGS=$(CBMC_UNWINDSET) $(CBMC_CPROVER_LIBRARY_UNWINDSET) $(CBMC_FLAG_UNWINDING_ASSERTIONS)
  UNWIND_0500_DESC="$(PROOF_UID): unwinding specified subset of loops"
else
  UNWIND_0500_FLAGS=$(CBMC_UNWINDSET) $(CBMC_CPROVER_LIBRARY_UNWINDSET) $(CBMC_DEFAULT_UNWIND) $(CBMC_FLAG_UNWINDING_ASSERTIONS)
  UNWIND_0500_DESC="$(PROOF_UID): unwinding all loops"
endif
endif

@qinheping since we are using both --dfcc and--apply-loop-contractand we also want to have --unwind 50, is there a way we can resolve this conflict?

qinheping commented 3 months ago

@qinheping since we are using both --dfcc and--apply-loop-contractand we also want to have --unwind 50, is there a way we can resolve this conflict?

Could you try to identify loops beside those in the dfcc built-in function with the CBMC flag --show-loops and set loop bounds for them with CBMC_UNWINDSET.

tautschnig commented 3 months ago

I suspect that your use of ?= in all the variable assignments causes some of the variables to be left empty instead. I suggest that your Makefile use = instead.

QinyuanWu commented 3 months ago

@qinheping currently there's no non-built-in loops left, the only one not related to dfcc is memcpy but I suppose the CBMC version of it one should be unbounded already?

QinyuanWu commented 3 months ago

@tautschnig for the --object-bits I realized Makefile.common by default had commented out CBMCFLAGS += --object-bits $(CBMC_OBJECT_BITS) so uncomment that line resolved the issue. I also observe that using = instead of ?= in the individual proof Makefile will not be able to overwrite the flag for some reason so ?= is necessary.

@qinheping I ended up modifying the analysis command in Makefile.common to add the --unwind flag and the pipeline was able to generate the report:

# Targets to run the analysis commands

ifdef CBMCFLAGS
  ifeq ($(strip $(CODE_CONTRACTS)),)
    CBMCFLAGS += $(CBMC_UNWINDSET) $(CBMC_CPROVER_LIBRARY_UNWINDSET) $(CBMC_DEFAULT_UNWIND) $(CBMC_OPT_CONFIG_LIBRARY)
  else ifeq ($(strip $(USE_DYNAMIC_FRAMES)),)
    CBMCFLAGS += $(CBMC_CPROVER_LIBRARY_UNWINDSET) $(CBMC_OPT_CONFIG_LIBRARY) $(CBMC_DEFAULT_UNWIND) #added $(CBMC_DEFAULT_UNWIND) here
  endif
endif

Is this okey practice or would you foresee any issue with adding this flag for other proofs that uses loop contract and dfcc? I still don't know why we need --unwind 50 but without it the program will not stop.

qinheping commented 3 months ago

@qinheping currently there's no non-built-in loops left, the only one not related to dfcc is memcpy but I suppose the CBMC version of it one should be unbounded already?

All loops except for memcmp in the image are loops introduced by dfcc loop contracts, so you don't need to set bounds for them. Unlike memcpy and memset, memcmp is not unbounded.

QinyuanWu commented 3 months ago

Is the unbounded version of memcmp something that's in development or is there any issue with applying loop contract to it?

qinheping commented 3 months ago

Is the unbounded version of memcmp something that's in development or is there any issue with applying loop contract to it?

We don't have any plan for unbounded version of memcmp currently. You can stub it out and apply loop contracts in the stubbed version. Here is an example of memcmp annotated with loop contracts for proving memory safety.

QinyuanWu commented 3 months ago

Thank you! I was attempting the loop contract myself and came up with this version:

#ifndef __CPROVER_STRING_H_INCLUDED
#include <string.h>
#define __CPROVER_STRING_H_INCLUDED
#endif

#undef memcmp

inline int memcmp(const void *s1, const void *s2, size_t n)
{
  __CPROVER_HIDE:;
  int res=0;
  #ifdef __CPROVER_STRING_ABSTRACTION
  __CPROVER_precondition(__CPROVER_buffer_size(s1)>=n,
                         "memcmp buffer overflow of 1st argument");
  __CPROVER_precondition(__CPROVER_buffer_size(s2)>=n,
                         "memcmp buffer overflow of 2nd argument");
  #else
  __CPROVER_precondition(__CPROVER_r_ok(s1, n),
                         "memcmp region 1 readable");
  __CPROVER_precondition(__CPROVER_r_ok(s2, n),
                         "memcpy region 2 readable");

  const unsigned char *sc1=s1, *sc2=s2;
  for(; n!=0; n--)
  __CPROVER_assigns(n, res, sc1, sc2)
  __CPROVER_loop_invariant(n >= 0 && n <= __CPROVER_loop_entry(n))
  __CPROVER_loop_invariant(__CPROVER_same_object(sc1, __CPROVER_loop_entry(sc1)))
  __CPROVER_loop_invariant(__CPROVER_same_object(sc2, __CPROVER_loop_entry(sc2)))
  __CPROVER_loop_invariant(__CPROVER_POINTER_OFFSET(sc1) <= __CPROVER_loop_entry(n) 
      && __CPROVER_POINTER_OFFSET(sc2) <= __CPROVER_loop_entry(n))
  __CPROVER_loop_invariant(__CPROVER_POINTER_OFFSET(sc1) == __CPROVER_POINTER_OFFSET(sc2) 
      && __CPROVER_POINTER_OFFSET(sc1) == __CPROVER_loop_entry(n) - n)
  __CPROVER_decreases(n)
  {
    res = (*sc1++) - (*sc2++);
    if (res != 0)
      return res;
  }
  #endif
  return res;
}

int main() {
  unsigned str1_len;
  unsigned str2_len;
  __CPROVER_assume(str1_len <= str2_len);

   char str1[str1_len];
   char str2[str2_len];

   unsigned len;
   __CPROVER_assume(len <= str1_len);

   memset(str1, "a", str1_len);
   memset(str2, "a", str2_len);

   memcmp(str1, str2, len);
   return 0;
}

I was originally missing the __CPROVER_same_object and the last loop invariant before but after adding these the verification passed. Another thing I find is that __CPROVER_loop_invariant(res == 0) did not change the verification result and since loop invariant is also checked after the last iteration of the loop, would __CPROVER_loop_invariant(res == 0) potentially cause the invariant to not hold if the loop did not immediate return when res != 0? In other words, if a loop returns or breaks in the middle of an iteration, does CBMC not check the invariant for this last iteration?

qinheping commented 3 months ago

In other words, if a loop returns or breaks in the middle of an iteration, does CBMC not check the invariant for this last iteration?

No, CBMC doesn't check if the loop invariant hold after break or return.

QinyuanWu commented 3 months ago

Got it. When I was experimenting with the loop contract, I found that if we switched the last invariant clause to the commented line then the loop would fail to be inductive. I'm curious to understand what makes the loop invariant fail with the commented one.

  const unsigned char *sc1=s1, *sc2=s2;
  for(; n!=0; n--)
  __CPROVER_assigns(n, res, sc1, sc2)
  __CPROVER_loop_invariant(n >= 0 && n <= __CPROVER_loop_entry(n))
  __CPROVER_loop_invariant(__CPROVER_same_object(sc1, __CPROVER_loop_entry(sc1)))
  __CPROVER_loop_invariant(__CPROVER_same_object(sc2, __CPROVER_loop_entry(sc2)))
  //__CPROVER_loop_invariant(__CPROVER_POINTER_OFFSET(sc1) <= __CPROVER_loop_entry(n) 
      // && __CPROVER_POINTER_OFFSET(sc2) <= __CPROVER_loop_entry(n)) why is this not inductive?
  __CPROVER_loop_invariant(__CPROVER_POINTER_OFFSET(sc1) == __CPROVER_POINTER_OFFSET(sc2) 
      && __CPROVER_POINTER_OFFSET(sc1) == __CPROVER_loop_entry(n) - n) // this makes the loop inductive
  __CPROVER_decreases(n)
  {
    res = (*sc1++) - (*sc2++);
    if (res != 0)
      return res;
  }

error message when using the commented out clause:

[memcmp.loop_invariant_step.1] line 108 Check invariant after step for loop memcmp.0: FAILURE
[memcmp.pointer_dereference.41] line 118 dereference failure: pointer outside object bounds in *tmp_post_sc1: FAILURE
[memcmp.pointer_dereference.47] line 118 dereference failure: pointer outside object bounds in *tmp_post_sc2: FAILURE

I also tried to switch the last loop invariant to __CPROVER_loop_invariant(__CPROVER_POINTER_OFFSET(sc1) == __CPROVER_POINTER_OFFSET(sc2) && __CPROVER_POINTER_OFFSET(sc2) <= __CPROVER_loop_entry(n)) but it also fails with the same errors. What's the key piece that's missing here? I have a feeling that it has something to do with <= __CPROVER_loop_entry(n) but I can't reason why.

error trace:

Violated property:
  file playground.c function memcmp line 119 thread 0
  dereference failure: pointer outside object bounds in *tmp_post_sc1
  (unsigned __CPROVER_bitvector[65])__CPROVER_OBJECT_SIZE(tmp_post_sc1) >= (unsigned __CPROVER_bitvector[65])__CPROVER_POINTER_OFFSET(tmp_post_sc1) + 1

Violated property:
  file playground.c function memcmp line 119 thread 0
  dereference failure: pointer outside object bounds in *tmp_post_sc2
  (unsigned __CPROVER_bitvector[65])__CPROVER_OBJECT_SIZE(tmp_post_sc2) >= (unsigned __CPROVER_bitvector[65])__CPROVER_POINTER_OFFSET(tmp_post_sc2) + 1

State 773 file playground.c function memcmp line 108 thread 0
----------------------------------------------------
  set=&__write_set_loop_0!0@1 (00000111 00000000 00000000 00000000 00000000 00000000 00000000 00000000)

State 774 file playground.c function memcmp line 108 thread 0
----------------------------------------------------
  ptr=(const void *)&n!0@1 (00010000 00000000 00000000 00000000 00000000 00000000 00000000 00000000)

State 775 file playground.c function memcmp line 108 thread 0
----------------------------------------------------
  size=8ul (00000000 00000000 00000000 00000000 00000000 00000000 00000000 00001000)

Violated property:
  file playground.c function memcmp line 108 thread 0
  Check invariant after step for loop memcmp.0
  n >= (unsigned long int)0 && n <= tmp_cc$1 && __CPROVER_POINTER_OBJECT((const void *)sc1) == __CPROVER_POINTER_OBJECT((const void *)tmp_cc$2) && __CPROVER_POINTER_OBJECT((const void *)sc2) == __CPROVER_POINTER_OBJECT((const void *)tmp_cc$3) && __CPROVER_POINTER_OFFSET((const void *)sc1) == __CPROVER_POINTER_OFFSET((const void *)sc2) && __CPROVER_POINTER_OFFSET((const void *)sc2) <= tmp_cc$1

QinyuanWu commented 3 months ago

Okey after some brainstorming, I had the following thoughts: In the case __CPROVER_POINTER_OFFSET(sc2) <= __CPROVER_loop_entry(n), sc2 may havoc to a value of __CPROVER_loop_entry(n) before the execution of the loop, in which after the execution it will become __CPROVER_POINTER_OFFSET(sc2) > __CPROVER_loop_entry(n) due to *sc2++ which fails the invariant. Same case for sc1. If we then change this invariant clause to __CPROVER_POINTER_OFFSET(sc2) < __CPROVER_loop_entry(n), after the last iteration it should always be the case where __CPROVER_POINTER_OFFSET(sc2) == __CPROVER_loop_entry(n) which fails the invariant.

Could you please confirm if my reasoning is correct?

qinheping commented 3 months ago

Yes, you are correct. If you look at the trace, the last assignment to sc2 in the trace will confirm your thought.

QinyuanWu commented 3 months ago

Great thank you very much!

QinyuanWu commented 3 months ago

I realized an issue with using __CPROVER_POINTER_OFFSET(sc1) == __CPROVER_POINTER_OFFSET(sc2) && __CPROVER_POINTER_OFFSET(sc1) == __CPROVER_loop_entry(n) - n instead of sc1 -(const unsigned char*)s1 == sc2 - (const unsigned char*)s2 && sc1 -(const unsigned char*)s1 == __CPROVER_loop_entry(n) - n is that the invariant would break if the pointer passed into memcmp has moved past the beginning of the array since then the offset would not be equal to begin with. While I used the alternative invariant, however, it gave the error of pointer relation for this invariant(__CPROVER_loop_invariant(sc1 <= s1 + __CPROVER_loop_entry(n) && sc2 <= s2 + __CPROVER_loop_entry(n)):

[memcmp.pointer_arithmetic.95] line 117 pointer relation: pointer outside object bounds in sc1: FAILURE
[memcmp.pointer_arithmetic.107] line 118 pointer relation: pointer outside object bounds in sc2: FAILURE

const unsigned char *sc1=s1, *sc2=s2;
  for(; n!=0; n--)
  __CPROVER_assigns(n, res, sc1, sc2)
  __CPROVER_loop_invariant(n >= 0 && n <= __CPROVER_loop_entry(n))
  __CPROVER_loop_invariant(__CPROVER_same_object(sc1, __CPROVER_loop_entry(sc1)))
  __CPROVER_loop_invariant(__CPROVER_same_object(sc2, __CPROVER_loop_entry(sc2)))
  __CPROVER_loop_invariant(sc1 <= s1 + __CPROVER_loop_entry(n)) //pointer relation: pointer outside object bounds in sc1: FAILURE
  __CPROVER_loop_invariant(sc2 <= s2 + __CPROVER_loop_entry(n)) //pointer relation: pointer outside object bounds in sc2: FAILURE
  __CPROVER_loop_invariant(sc1 - (const unsigned char*)s1 == sc2 - (const unsigned char*)s2
      && sc1 -(const unsigned char*)s1 == __CPROVER_loop_entry(n) - n)
  __CPROVER_decreases(n)
  {
    res = (*sc1++) - (*sc2++);
    if (res != 0)
      return res;
  }
  #endif
  return res;

qinheping commented 3 months ago

If theses two failures are the only failures, I suppose the reason could be n less than the readable length of s1 and s2. Could you provide the full harness that lead to the failures? I

QinyuanWu commented 3 months ago

Full implementation and harness:

#ifndef __CPROVER_STRING_H_INCLUDED
#include <string.h>
#define __CPROVER_STRING_H_INCLUDED
#endif

#undef memcmp

inline int memcmp(const char *s1, const char *s2, size_t n)
{
  __CPROVER_HIDE:;
  int res=0;
  #ifdef __CPROVER_STRING_ABSTRACTION
  __CPROVER_precondition(__CPROVER_buffer_size(s1)>=n,
                         "memcmp buffer overflow of 1st argument");
  __CPROVER_precondition(__CPROVER_buffer_size(s2)>=n,
                         "memcmp buffer overflow of 2nd argument");
  #else
  __CPROVER_precondition(__CPROVER_r_ok(s1, n),
                         "memcmp region 1 readable");
  __CPROVER_precondition(__CPROVER_r_ok(s2, n),
                         "memcpy region 2 readable");

  const char *sc1=s1, *sc2=s2;
  for(; n!=0; n--)
  __CPROVER_assigns(n, res, sc1, sc2)
  __CPROVER_loop_invariant(n >= 0 && n <= __CPROVER_loop_entry(n))
  __CPROVER_loop_invariant(__CPROVER_same_object(sc1, __CPROVER_loop_entry(sc1)))
  __CPROVER_loop_invariant(__CPROVER_same_object(sc2, __CPROVER_loop_entry(sc2)))
  __CPROVER_loop_invariant(sc1 <= s1 + __CPROVER_loop_entry(n))
  __CPROVER_loop_invariant(sc2 <= s2 + __CPROVER_loop_entry(n))
  __CPROVER_loop_invariant(sc1 - s1 == sc2 - s2 && sc1 - s1 == __CPROVER_loop_entry(n) - n)
  __CPROVER_decreases(n)
  {
    res = (*sc1++) - (*sc2++);
    if (res != 0)
      return res;
  }
  #endif
  return res;
}

int main() {
  unsigned str1_len;
  unsigned str2_len;
  __CPROVER_assume(str1_len <= str2_len);

  unsigned char *str1 = malloc(str1_len);
  unsigned char *str2 = malloc(str2_len);

   unsigned len;
   __CPROVER_assume(len <= str1_len);
   __CPROVER_assume(str1 != NULL);
   __CPROVER_assume(str2 != NULL);

   memset(str1, "a", str1_len);
   memset(str2, "a", str2_len);
   memcmp(str1, str2, len);
   return 0;
}

qinheping commented 3 months ago

The issue here is that sc1 in the loop invariants are checked if it is in the bound before we assume that it is in the bound. Replace the two predicates

  __CPROVER_loop_invariant(sc1 <= s1 + __CPROVER_loop_entry(n))
  __CPROVER_loop_invariant(sc2 <= s2 + __CPROVER_loop_entry(n))

with

  __CPROVER_loop_invariant(__CPROVER_POINTER_OFFSET(sc1) <= __CPROVER_POINTER_OFFSET(s1) + __CPROVER_loop_entry(n))
  __CPROVER_loop_invariant(__CPROVER_POINTER_OFFSET(sc2) <= __CPROVER_POINTER_OFFSET(s2) + __CPROVER_loop_entry(n))

will resolve the failures.

The failures doesn't appear in the CBMC regression tests because we run CBMC with --no-standard-checks, which disabled the pointer checks. I will fix the regression test.

QinyuanWu commented 3 months ago

Thank you for the insight. To confirm my understanding, is it correct that sc1 is havocked to a random value before assuming the bound and that's why CBMC detected the failure? And __CPROVER_POINTER_OFFSET can work here because it always returns a deterministic absolute value of the offset?

qinheping commented 3 months ago

The GOTO program after applying loop contracts will look like

assert( sc1 < bound) // check for the base case
havoc(sc1)
// here sc1 can be any value
assume(sc1 < bound)  // inductive assumption
// here sc1 can be only less than bound

By default, CBMC will instrument the pointer checks to check if pointers can be out of the bound in pointer arithmetic (plus, minus, comparison between pointers). So after the instrumentation, the GOTO program will be

assert( sc1 < bound) // check for the base case
havoc(sc1)
// here sc1 can be any value
assert(sc1 < bound)  // instrumented pointer check
assume(sc1 < bound)  // inductive assumption
// here sc1 can be only less than bound

Since the instrumented checks are before the assumption, sc1 can be any value and hence violate the check.

Using __CPROVER_POINTER_OFFSET will make the predicate __CPROVER_loop_invariant(__CPROVER_POINTER_OFFSET(sc1) <= __CPROVER_POINTER_OFFSET(s1) + __CPROVER_loop_entry(n)) contain no pointer arithmetic, and hence not trigger the pointer checks instrumentation---we don't check if sc1 is in the bound for such predicate.