diffblue / cbmc

C Bounded Model Checker
https://diffblue.github.io/cbmc
Other
847 stars 262 forks source link

CBMC 5.84 unsound in presence of undefined behaviour #7732

Closed rod-chapman closed 4 months ago

rod-chapman commented 1 year ago

I find CBMC 5.84 reports an unsound result for the following test case:

#include <stdio.h>

#ifdef CBMC
#else
#define __CPROVER_assert(...)
#endif

int main(void) {
  volatile int x = 0;
  int y;

  // This statement exhibits both unspecified behaviour on evalation order,
  // and also Undefined Behaviour, owing to more than one side-effect to
  // a single object before a single sequence point
  y = (x = 2) + (x = 1);

  __CPROVER_assert(y == 0, "Oops0..."); // FAILURE
  __CPROVER_assert(y == 1, "Oops1..."); // FAILURE
  __CPROVER_assert(y == 2, "Oops2..."); // SUCCESS ?!?!?
  __CPROVER_assert(y == 3, "Oops3..."); // FAILURE
  printf ("y is %d\n", y); // Prints "y is 3" with clang 14.0.3

}

An attempt to verify shows that CBMC thinks that the final value of y is 2:

rodchap@f4d4889dcf6d try % cbmc --version
5.84.0 (cbmc-5.84.0)
rodchap@f4d4889dcf6d try % cbmc -DCBMC rogu.c
CBMC version 5.84.0 (cbmc-5.84.0) 64-bit arm64 macos
Parsing rogu.c
Converting
Type-checking rogu
Generating GOTO Program
Adding CPROVER library (arm64)
Removal of function pointers and virtual functions
Generic Property Instrumentation
Running with 8 object bits, 56 offset bits (default)
Starting Bounded Model Checking
Runtime Symex: 0.000527709s
size of program expression: 44 steps
simple slicing removed 21 assignments
Generated 4 VCC(s), 3 remaining after simplification
Runtime Postprocess Equation: 5.792e-06s
Passing problem to propositional reduction
converting SSA
Runtime Convert SSA: 5.75e-05s
Running propositional reduction
Post-processing
Runtime Post-process: 2.5e-06s
Solving with MiniSAT 2.2.1 with simplifier
64 variables, 0 clauses
SAT checker: instance is SATISFIABLE
Runtime Solver: 1.5708e-05s
Runtime decision procedure: 8.6625e-05s

** Results:
rogu.c function main
[main.assertion.1] line 17 Oops0...: FAILURE
[main.assertion.2] line 18 Oops1...: FAILURE
[main.assertion.3] line 19 Oops2...: SUCCESS
[main.assertion.4] line 20 Oops3...: FAILURE

** 3 of 4 failed (2 iterations)
VERIFICATION FAILED

Compling with clang 14.0.3 (on Apple M1 macOS 13.3.1):

rodchap@f4d4889dcf6d try % gcc --version     
Apple clang version 14.0.3 (clang-1403.0.22.14.1)
Target: arm64-apple-darwin22.4.0
Thread model: posix
InstalledDir: /Library/Developer/CommandLineTools/usr/bin
rodchap@f4d4889dcf6d try % gcc -o rogu rogu.c
rogu.c:15:10: warning: multiple unsequenced modifications to 'x' [-Wunsequenced]
  y = (x = 2) + (x = 1);
         ^         ~
1 warning generated.
rodchap@f4d4889dcf6d try % ./rogu
y is 3

I would hope that CBMC would detect the undefined behaviour (as clang does) and reject the program.

tautschnig commented 1 year ago

I agree that we should aim to expand the coverage of undefined behaviour. That said:

I am not sure what it would mean to "sound" in presence of undefined behaviour? CBMC does not, and does not claim to, detect all kinds of undefined behaviour. CBMC is able to synthesize assertions for some classes of undefined behaviour, and CBMC seeks to be sound for all input programs that do not exhibit undefined behaviour.

rod-chapman commented 1 year ago

"Sound in the presence of undefined behaviour" is a meaningless claim I think. If CBMC does not detect and prevent all classes of UB, where is that documented? My copy of WG14's n1570 document (roughly "C11") Annex J.2 lists 203 cases of UB. Could you document exactly which of these are prevented by CBMC and which ones I still have to be worried about?

kroening commented 1 year ago

We should definitively avoid any claims for programs with undefined behaviour. We should strive to extend our checks for UBs, with the ultimate goal being that we cover them all.

@tautschnig: This one is easy when done in/near the C front-end, but hard when done late on the goto-program. This really needs to get done, and I'll start looking into it.

rod-chapman commented 1 year ago

@kroening - in my experience, all C programs are littered with UB (every single signed integer operator has potential for UB for a start...), and just one instance of UB in a program is sufficient to nuke every other proof.

I can imagine CBMC having a "strict" mode where it really does get rid of all UB, either by subsetting or analysis. The user has to pay the price of some programs being rejected that lie outside of the analysable subset. This is how it works in Frama-C, SPARK, and probably ASTREE.

Please start by documenting exactly where we are now - a table of all 203+ cases would suffice.

For this particular instance, I am encouraged that clang manages to emit a warning. I am not sure, though, if clang warns for all such instances, or only some of them...

rod-chapman commented 1 year ago

Can someone add the "Soundness" label to this issue please?

TGWDB commented 1 year ago

Can someone add the "Soundness" label to this issue please?

I've added the soundness label. That said, I agree with the above sentiments that it's hard to define "sound" and "unsound" for a behaviour that is "undefined". From the above it appears that a "fix"[?] for this "bug" may be to at least document (if not also warn) on undefined behaviours and how CBMC handles them. You mentioned a list of 203+, I don't suppose you have that list on hand or a link to it?

rod-chapman commented 1 year ago

For me, any undefined behaviour always risks unsoundness. A correct (and possibly only) "fix" is to reject any and all UB. The problem is coming up with fast (and also sound!) algorithms to detect the UB in the first place without onerously subsetting the language in a way that just annoys users. The list of 200+ UB issues is in the ISO C standard (Appendix J last time I looked...)

rod-chapman commented 1 year ago

Yeah... C standards are on the WG14 document log. For example https://www.open-std.org/jtc1/sc22/wg14/www/docs/n3054.pdf is the current draft of the C'23 standard. See Annex J.2

kroening commented 1 year ago

Please take a look at https://github.com/diffblue/cbmc/pull/7880 -- this is an attempt to track down all UB (according to C11) systematically.

kroening commented 1 year ago

The goal would be to have checks for all dynamic UB (I'd probably declare the preprocessor-related ones outside of the scope of CBMC).

kroening commented 8 months ago

Note #8226.

tautschnig commented 4 months ago

Closing as with CBMC 6 we have all supported undefined-behaviour checks enabled by default. We still have more work to do as documented in doc/C/c11-undefined-behavior.html.

rod-chapman commented 4 months ago

I am not convinced. "All supported UB checks" is a cop-out. What you're really saying is "the tool is still unsound for a bunch of things are are't supported." For this test case in particular: https://github.com/rod-chapman/cbmc-examples/blob/main/ub/ub1.c CBMC 6.0.1 is still demonstrably unsound, so I don't think you can claim victory just yet...

tautschnig commented 4 months ago

Happy to re-open this issue, but it will come pretty close to "won't fix": I don't see us rolling our own preprocessor anytime soon, which would be among the changes required to be able to tick all items on that list.

rod-chapman commented 4 months ago

Well... do you have an analysis of which ones can and should be fixed? I am happy that a great many are out of scope and/or solved by some very simple subsetting.