Closed zhassan-aws closed 2 years ago
With 1eb9643d8e98c97dffbfc3f73dfa02abd68974ba, verification completes successfully on this harness with the MIR linker enabled.
This behavior is introduced in 5539c0de6e771bd10c2b692cd6d62f0abba98144. In the previous commit (f3b5a9777c535f386949cfabf55db820b73837e0), the run completes in a few seconds with the MIR linker.
Is it possible that the model is just too big for cbmc to handle? Have you tried setting the unwind loop to 1?
Yes, with an unwind of 1, it finishes quickly, and an unwinding assertion fails:
$ cargo kani --tests --harness round_trip --unwind 1 --enable-unstable --mir-linker
<snip>
SUMMARY:
** 1 of 4507 failed (4506 undetermined)
Failed Checks: unwinding assertion loop 1
File: "/home/ubuntu/.rustup/toolchains/nightly-2022-09-13-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/core/src/ops/range.rs", line 815, in <(std::collections::Bound<&u8>, std::collections::Bound<&u8>) as std::ops::RangeBounds<u8>>::contains::<u8>
VERIFICATION:- FAILED
[Kani] info: Verification output shows one or more unwinding failures.
[Kani] tip: Consider increasing the unwinding value or disabling `--unwinding-assertions`.
Verification Time: 2.66511s
Any unwind value above 1 causes it to unwind forever.
Here's an independent test that reproduces the same behavior:
pub enum Error {
Err1(usize),
Err2,
}
use std::fmt;
impl fmt::Display for Error {
fn fmt(&self, f: &mut fmt::Formatter) -> fmt::Result {
match self {
Self::Err1(len) => write!(f, "err1: {}", len),
Self::Err2 => write!(f, "err2"),
}
}
}
fn encode(x: i32) -> Result<Vec<u8>, ()> {
if x > 10 {
Ok(vec![1])
} else {
Err(())
}
}
fn decode_inner(v: &Vec<u8>) -> Result<(), Error> {
let x = match v.len() {
1 => Err(Error::Err1(3)),
2 => Err(Error::Err2),
_ => Ok(()),
};
x
}
fn decode(v: Vec<u8>) -> Result<(), String> {
decode_inner(&v).map_err(|err| err.to_string())?;
Ok(())
}
#[kani::proof]
#[kani::unwind(2)]
fn main() {
let v = encode(kani::any()).unwrap();
let _ = decode(v);
}
$ kani err_vec.rs --enable-unstable --mir-linker
Unwinding loop _ZN4core10intrinsics23is_aligned_and_not_null17h6eae2af2c254ed47E.1 iteration 1 file /home/ubuntu/.rustup/toolchains/nightly-2022-09-13-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/core/src/intrinsics.rs line 2204 column 5 function std::intrinsics::is_aligned_and_not_null::<u8> thread 0
Unwinding loop _ZN4core10intrinsics23is_aligned_and_not_null17h6eae2af2c254ed47E.2 iteration 1 file /home/ubuntu/.rustup/toolchains/nightly-2022-09-13-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/core/src/intrinsics.rs line 2204 column 5 function std::intrinsics::is_aligned_and_not_null::<u8> thread 0
Unwinding loop _RINvNvNtCs4IvNUtTnEFD_4core10intrinsics19copy_nonoverlapping7runtimehECs7nqv5Ecv62r_7err_vec.4 iteration 1 file /home/ubuntu/.rustup/toolchains/nightly-2022-09-13-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/core/src/intrinsics.rs line 2329 column 13 function std::intrinsics::copy_nonoverlapping::runtime::<u8> thread 0
Not unwinding loop _RINvNvNtCs4IvNUtTnEFD_4core10intrinsics19copy_nonoverlapping7runtimehECs7nqv5Ecv62r_7err_vec.0 iteration 2 file /home/ubuntu/.rustup/toolchains/nightly-2022-09-13-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/core/src/intrinsics.rs line 2329 column 13 function std::intrinsics::copy_nonoverlapping::runtime::<u8> thread 0
Unwinding loop _RINvNvNtCs4IvNUtTnEFD_4core10intrinsics19copy_nonoverlapping7runtimehECs7nqv5Ecv62r_7err_vec.1 iteration 1 file /home/ubuntu/.rustup/toolchains/nightly-2022-09-13-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/core/src/intrinsics.rs line 2329 column 13 function std::intrinsics::copy_nonoverlapping::runtime::<u8> thread 0
Unwinding loop _RINvNvNtCs4IvNUtTnEFD_4core10intrinsics19copy_nonoverlapping7runtimehECs7nqv5Ecv62r_7err_vec.5 iteration 1 file /home/ubuntu/.rustup/toolchains/nightly-2022-09-13-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/core/src/intrinsics.rs line 2329 column 13 function std::intrinsics::copy_nonoverlapping::runtime::<u8> thread 0
aborting path on assume(false) at file /home/ubuntu/.rustup/toolchains/nightly-2022-09-13-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/core/src/intrinsics.rs line 2328 column 9 function std::intrinsics::copy_nonoverlapping::runtime::<u8> thread 0
aborting path on assume(false) at file /home/ubuntu/.rustup/toolchains/nightly-2022-09-13-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/core/src/result.rs line 853 column 15 function std::result::Result::<std::ptr::NonNull<[u8]>, std::alloc::AllocError>::map_err::<std::collections::TryReserveError, [closure@alloc::raw_vec::finish_grow<std::alloc::Global>::{closure#1}]> thread 0
Unwinding loop _ZN4core6result19Result$LT$T$C$E$GT$7map_err17h0793994c3d33a0c8E.0 iteration 1 file /home/ubuntu/.rustup/toolchains/nightly-2022-09-13-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/core/src/result.rs line 857 column 5 function std::result::Result::<std::ptr::NonNull<[u8]>, std::alloc::AllocError>::map_err::<std::collections::TryReserveError, [closure@alloc::raw_vec::finish_grow<std::alloc::Global>::{closure#1}]> thread 0
...
Can also be reproduced without Vec
:
pub enum Error {
Err1(usize),
Err2,
}
use std::fmt;
impl fmt::Display for Error {
fn fmt(&self, f: &mut fmt::Formatter) -> fmt::Result {
match self {
Self::Err1(len) => write!(f, "{}", len),
Self::Err2 => write!(f, ""),
}
}
}
fn encode(x: i32) -> Result<[u8; 1], ()> {
if x > 10 {
Ok([1])
} else {
Err(())
}
}
fn decode_inner(v: &[u8]) -> Result<(), Error> {
let x = match v.len() {
1 => Err(Error::Err1(3)),
2 => Err(Error::Err2),
_ => Ok(()),
};
x
}
fn decode(v: [u8; 1]) -> Result<(), String> {
decode_inner(&v).map_err(|err| err.to_string())?;
Ok(())
}
#[kani::proof]
#[kani::unwind(2)]
fn main() {
let v = encode(kani::any()).unwrap();
let _ = decode(v);
}
Does this involve function pointers of any kind? Do you perhaps end up with an extremely broad computation tree?
Possibly, through the fmt::Display
trait. Is there a way to check in the goto binary/symbol table file?
If you do goto-instrument --show-goto-functions
on the final goto binary you should get to witness this, but also --gen-c
would show a large number of function calls.
Shrinked even further and removed all non-determinism:
pub enum Error {
Err1(usize),
}
use std::fmt;
impl fmt::Display for Error {
fn fmt(&self, f: &mut fmt::Formatter) -> fmt::Result {
match self {
Self::Err1(len) => write!(f, "{}", len),
}
}
}
fn decode_inner() -> Result<(), Error> {
Err(Error::Err1(3))
}
fn decode() -> Result<(), String> {
decode_inner().map_err(|err| err.to_string())?;
Ok(())
}
#[kani::proof]
#[kani::unwind(2)]
fn main() {
let _ = decode();
}
Here's the output of goto-instrument --show-goto-functions
on the shrinked test:
goto-functions.txt
The Display trait is implemented everywhere in the std. I'm not surprised that blows up. Maybe --restrict-vtable
can help
--restrict-vtable
didn't make any difference.
BTW, I'm using the following example:
pub struct Error(usize);
use std::fmt;
impl fmt::Display for Error {
fn fmt(&self, f: &mut fmt::Formatter) -> fmt::Result {
write!(f, "{}", self.0)
}
}
#[kani::proof]
#[kani::unwind(2)]
fn main() {
let s = Error(0).to_string();
assert!(s.len() > 0);
}
@tautschnig Does cbmc have any nicer way to debug loop unwinding such as a graphical representation of the call graph?
This is the output from goto-instrument --dump-c
on the reduced testcase (had to use CBMC latest to include the fix for https://github.com/diffblue/cbmc/issues/7158):
@celinval tracked this down to the formatting code (specifically the write_fmt
function called by the write!
macro).
@camshaft, a workaround is to replace the fmt
code under kani:
diff --git a/common/s2n-codec/src/decoder/mod.rs b/common/s2n-codec/src/decoder/mod.rs
index de45b163..45f8dbe1 100644
--- a/common/s2n-codec/src/decoder/mod.rs
+++ b/common/s2n-codec/src/decoder/mod.rs
@@ -424,14 +424,18 @@ pub enum DecoderError {
use core::fmt;
impl fmt::Display for DecoderError {
fn fmt(&self, f: &mut fmt::Formatter) -> fmt::Result {
- match self {
- Self::UnexpectedEof(len) => write!(f, "unexpected eof: {}", len),
- Self::UnexpectedBytes(len) => write!(f, "unexpected bytes: {}", len),
- Self::LengthCapacityExceeded => write!(
- f,
- "length could not be represented in platform's usize type"
- ),
- Self::InvariantViolation(msg) => write!(f, "{}", msg),
+ if cfg!(kani) {
+ Ok(())
+ } else {
+ match self {
+ Self::UnexpectedEof(len) => write!(f, "unexpected eof: {}", len),
+ Self::UnexpectedBytes(len) => write!(f, "unexpected bytes: {}", len),
+ Self::LengthCapacityExceeded => write!(
+ f,
+ "length could not be represented in platform's usize type"
+ ),
+ Self::InvariantViolation(msg) => write!(f, "{}", msg),
+ }
}
}
}
This makes the round_trip
proof go through quickly.
Here's another minimal reproducer:
use std::fmt::Write;
#[kani::proof]
#[kani::unwind(2)]
fn main() {
let mut s = String::new();
let _ = write!(&mut s, "{}", 5);
}
@tautschnig Does cbmc have any nicer way to debug loop unwinding such as a graphical representation of the call graph?
You can use goto-instrument --dot
to obtain a diagram (but I guess on this occasion you've managed to track down the problem already?).
but I guess on this occasion you've managed to track down the problem already?
Not really. We only know it's related to the formatting code, but not sure exactly why the unwinding is not terminating.
Yet another minimal reproducer:
#[kani::proof]
#[kani::unwind(2)]
fn main() {
let a = std::fmt::format(format_args!("{}", 5));
}
I've extracted the relevant code from std, and created a test that pretty much doesn't use any of the formatting code in std
, but still reproduces the unbounded unwinding behavior:
pub trait Trait {
fn f(&self, c: char);
}
struct Foo {}
impl Trait for Foo {
fn f(&self, c: char) {
let _ = c.encode_utf8(&mut [0; 4]);
}
}
pub struct Formatter {}
fn nn(_x: &u8, _f: &Formatter) {}
pub struct ArgumentV1<'a> {
value: &'a u8,
formatter: fn(&u8, &Formatter) -> (),
}
impl<'a> ArgumentV1<'a> {
pub fn new<'b>(x: &u8, f: fn(&u8, &Formatter) -> ()) -> ArgumentV1<'b> {
unsafe {
ArgumentV1 {
formatter: std::mem::transmute(f),
value: std::mem::transmute(x),
}
}
}
}
pub struct Arguments<'a> {
args: &'a [ArgumentV1<'a>],
}
impl<'a> Arguments<'a> {
pub const fn new_v1(args: &'a [ArgumentV1<'a>]) -> Arguments<'a> {
Arguments { args }
}
}
pub fn write(_output: &mut dyn Trait, args: Arguments<'_>) {
let formatter = Formatter {};
for arg in args.args {
let _ = (arg.formatter)(arg.value, &formatter);
}
}
#[cfg_attr(kani, kani::proof, kani::unwind(2))]
fn main() {
let mut f = Foo {};
let a = [ArgumentV1::new(&5, nn)];
let args = Arguments::new_v1(&a);
write(&mut f, args);
}
Reduced the above test further to:
pub trait Trait {
fn f(&self);
}
struct Foo {}
impl Trait for Foo {
fn f(&self) {
let _ = 'x'.encode_utf8(&mut [0; 4]);
}
}
pub struct Formatter {}
fn nn(_x: &u8, _f: &Formatter) {}
pub struct ArgumentV1 {
formatter: fn(&u8, &Formatter) -> (),
}
impl ArgumentV1 {
pub fn new() -> ArgumentV1 {
unsafe {
ArgumentV1 {
formatter: std::mem::transmute(nn as fn(_, _)),
}
}
}
}
#[cfg_attr(kani, kani::proof, kani::unwind(2))]
fn main() {
let f = Foo {};
let a = [ArgumentV1::new()];
let _output = &f as &dyn Trait;
let formatter = Formatter {};
for arg in &a {
let _ = (arg.formatter)(&5, &formatter);
}
}
CBMC stays in the unrolling phase for more than 10 minutes without terminating.
Function pointer removal as taking place in goto-instrument
will replace the arg.formatter
pointer by a number of candidate functions. The set of "candidate" functions is determined by looking at matching (with very coarse over-approximation) signatures. See also https://diffblue.github.io/cbmc/man/goto-instrument.html, search for the documentation of --remove-function-pointers
. With --mir-linker
, the pool of candidate functions is much larger. The problem then really is the functions called from these candidate functions. In many cases, the constant propagation during symbolic execution may alleviate this problem when the actual value of the function pointer can be precisely determined. It seems that's not the case here, AFAIK owing to the loop iteration (actually whatever iterators do under the hood). It maybe be worth diving further why we can't constant-propagate this here. So you end up with a number of functions being called when it should really just be nn
(if I understand the Rust code correctly). You might have to use --restrict-vtable
on this example.
@tautschnig and I managed to reduce it further to:
pub trait Trait {
fn f(&self);
}
struct Foo {}
impl Trait for Foo {
fn f(&self) {
let _ = 'x'.encode_utf8(&mut [0; 4]);
}
}
pub struct Formatter {}
fn nn(_x: &u8, _f: &Formatter) {}
pub struct ArgumentV1 {
formatter: fn(&u8, &Formatter) -> (),
}
#[cfg_attr(kani, kani::proof, kani::unwind(2))]
fn main() {
let f = Foo {};
let a = [ArgumentV1 { formatter: nn, }];
let _output = &f as &dyn Trait;
let formatter = Formatter {};
let mut iter = a.iter();
let _ = (iter.next().unwrap().formatter)(&5, &formatter);
}
https://github.com/diffblue/cbmc/pull/7243 fixes this particular example as constant propagation can then figure out during symbolic execution that nn
is the only possible value for the function pointer.
(Some more information from the analysis of the above example with @zhassan-aws: let _output = &f as &dyn Trait;
adds f
to the candidate list of functions being considered by CBMC's function pointer removal as f
's address is taken at that point. This, in turn, will result in encode_utf8
being symbolically executed, and that function just appears to be large (or itself be calling many other functions). Also, it seemed that it was required to both take a slice and use an iterator to take constant propagation off track.)
That's awesome! Thanks for investigating and figuring out a fix @tautschnig!
One thing that is weird though is that encode_utf8
does not seem to be an issue in itself. For example, this small program:
fn foo(c: char, buf: &mut [u8]) {
c.encode_utf8(buf);
}
#[kani::proof]
fn main() {
let mut buf = [0; 4];
foo('x', &mut buf);
assert_eq!(buf, [120, 0, 0, 0]);
}
is verified in a fraction of a second:
$ kani encode.rs --enable-unstable --mir-linker
<snip>
SUMMARY:
** 0 of 91 failed (7 unreachable)
VERIFICATION:- SUCCESSFUL
Verification Time: 0.17814663s
Even if I hide the function behind dyn trait
to mimic the other program, it still finishes quickly:
trait T {
fn foo(&mut self, c: char, buf: &mut [u8]);
}
struct Foo {}
impl T for Foo {
fn foo(&mut self, c: char, buf: &mut [u8]) {
c.encode_utf8(buf);
}
}
fn bar(x: &mut dyn T) {
let mut buf = [0; 4];
x.foo('x', &mut buf);
assert_eq!(buf, [120, 0, 0, 0]);
}
#[kani::proof]
#[kani::unwind(5)]
fn main() {
let mut f = Foo {};
bar(&mut f);
}
$ kani encode2.rs --enable-unstable --mir-linker
<snip>
SUMMARY:
** 0 of 104 failed (7 unreachable)
VERIFICATION:- SUCCESSFUL
Verification Time: 0.18649799s
So I'm not sure why it's causing issues in the other program.
Following up on https://github.com/model-checking/kani/issues/1767#issuecomment-1279423415: it seems that working through the Formatter
with perhaps a lack of constant propagation is what takes forever. Logs kindly shared by @zhassan-aws from an older version of CBMC that didn't have the necessary improvements include lines like:
Unwinding recursion <std::fmt::Arguments<'_> as std::fmt::Display>::fmt iteration 1
and almost all the instructions symbolically executed stemming from either lib/rustlib/src/rust/library/core/src/fmt/mod.rs
or lib/rustlib/src/rust/library/core/src/fmt/num.rs
.
One of the harnesses in s2n-quic (
round_trip
) that was previously solved fairly quickly now unwinds forever.Steps to reproduce
git clone https://github.com/aws/s2n-quic
cd s2n-quic/quic/s2n-quic-core
git checkout 80e5a554fea9f5ca5ed9befa8012170eaecfdd62
cargo kani --tests --harness round_trip --unwind 5 --enable-unstable --mir-linker
CBMC keeps unwinding for more than 2 minutes and never gets to the solving stage:
Kani Version: af6c93eb74520c9d7eeda3b7e7f28220d617b9cf
If I don't enable the MIR linker, verification completes successfully: