model-checking / kani

Kani Rust Verifier
https://model-checking.github.io/kani
Apache License 2.0

RFC: New default output for Kani #1690

Open tedinski opened 2 years ago

tedinski commented 2 years ago

This issue is a task to write/propose an RFC for a new default output format for Kani.

There are a number of needs here:

  1. #1194

  2. #1653 (closed but has good discussion)

  3. #598

  4. #1675

  5. #1460

  6. We should give better output for unbounded unwinding than the raw CBMC output.
  7. We need to think about the output for a parallel harness runner. (e.g. unwinding needs a solution because we currently rely on printing CBMC output as it streams, which only works while we run one harness at a time. We should also think about what status information the runner should give. Perhaps a gradle-style console view showing what's running on each thread, with completed output scrolling above it.)
  8. https://github.com/model-checking/kani-github-action/issues/10
  9. https://github.com/model-checking/kani/issues/1713
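
The streaming problem in item 7 could be solved by buffering: if each worker captures its harness's output and hands the whole buffer to a coordinator only when the run completes, parallel runs never interleave on the console. A minimal sketch under that assumption (the `HarnessResult` type and `run_harnesses` runner are hypothetical names, and the CBMC invocation is stubbed out):

```rust
use std::sync::mpsc;
use std::thread;

// Hypothetical per-harness result: the name plus the fully buffered
// output, delivered to the coordinator only once the run completes.
pub struct HarnessResult {
    pub name: String,
    pub output: String,
}

// Run each harness on its own thread. Instead of streaming CBMC
// output as it arrives (which only works single-threaded), each
// worker buffers its output and sends it whole, so the coordinator
// can print one harness's result atomically without interleaving.
pub fn run_harnesses(harnesses: Vec<String>) -> Vec<HarnessResult> {
    let (tx, rx) = mpsc::channel();
    let mut workers = Vec::new();
    for name in harnesses {
        let tx = tx.clone();
        workers.push(thread::spawn(move || {
            // Stand-in for invoking CBMC and capturing its stdout.
            let output = format!("VERIFICATION SUCCESSFUL: {name}");
            tx.send(HarnessResult { name, output }).unwrap();
        }));
    }
    drop(tx); // close our clone so the receiver sees end-of-stream
    let results: Vec<HarnessResult> = rx.into_iter().collect();
    for w in workers {
        w.join().unwrap();
    }
    results
}
```

The coordinator could then print results in completion order (as nextest does) or in submission order; either way the per-harness output stays contiguous.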

Being able to introduce a parallel runner is my focus, but the rest should be answered/solved through this RFC as well.

The goal should be (I think):

  1. Default output
  2. Verbose output
  3. Quiet output (probably nothing: just exit status)
  4. Plus: possible differences for being interactive (terminal) versus batch (redirected to file / CI). (Colors + live status info as considerations)
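
These modes could reduce to a simple selection function, with a separate terminal check gating colors and live status. A hedged sketch of that shape (the enum and function names are invented, not Kani's actual types):

```rust
// Hypothetical output-mode selection covering the three levels above.
#[derive(Debug, PartialEq)]
pub enum OutputMode {
    Quiet,   // nothing: just exit status
    Default, // one line per harness plus a summary
    Verbose, // full CBMC output
}

// Quiet wins over verbose if both flags are somehow passed.
pub fn select_mode(quiet: bool, verbose: bool) -> OutputMode {
    match (quiet, verbose) {
        (true, _) => OutputMode::Quiet,
        (_, true) => OutputMode::Verbose,
        _ => OutputMode::Default,
    }
}

// Colors and a live, redrawn status line only make sense when
// stdout is an interactive terminal, not a file or CI log.
pub fn use_live_status(is_terminal: bool, mode: &OutputMode) -> bool {
    is_terminal && *mode != OutputMode::Quiet
}
```

The interactive/batch split then becomes orthogonal to verbosity: batch output keeps the same lines but drops the progress bar and ANSI colors.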

I think I will designate as out-of-scope:

  1. #942

There are likely a lot of decisions to be made here. I'm not even going to start the list of questions yet.

tedinski commented 2 years ago

Some notes on inspiration for a new default output:

cargo test looks like this:

running 28 tests
test args::tests::check_abs_type ... ok
test args_toml::tests::check_merge_args_with_only_config_kani_args ... ok
[..snip..]
test call_cbmc::tests::check_resolve_unwind_value ... ok

test result: ok. 28 passed; 0 failed; 0 ignored; 0 measured; 0 filtered out; finished in 0.01s

cargo build gives a status like this:

   Compiling indexmap v1.9.1
   Compiling rayon v1.5.3
   [..snip..]
   Compiling object v0.29.0
    Building [===================>     ] 132/165: clap, object, regex, num-bigint, syn, rayon

This nicely shows parallelism (clap, object, regex, num-bigint, syn, rayon) on just one line. Meanwhile, you can scroll the output above it.

I think a hybrid might be nice, augmented with more statistics:

   Compiling crate v0.0.0
    Finished dev [unoptimized + debuginfo] target(s) in 0.08s
    Verified crate::harness ... ok, 0.95s, 355 props
    Verified crate::other_harness ... ok, 3.42s, 1660 props 
   Verifying [===================>     ] 2/4: crate::mod::harness, crate::mod::more_harness
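That single updating status line can be redrawn in place with a carriage return, the way cargo does. A small sketch of just the rendering (the function name and layout are illustrative; a real implementation would also need terminal-width handling):

```rust
// Render a cargo-style progress line for `done` of `total` harnesses,
// listing the harnesses currently in flight. The caller would print
// this prefixed with "\r" to redraw the line in place on a terminal.
pub fn status_line(done: usize, total: usize, in_flight: &[&str], width: usize) -> String {
    // Fill the bar proportionally to completed harnesses.
    let filled = if total == 0 { 0 } else { done * width / total };
    let bar = "=".repeat(filled) + ">";
    format!("   Verifying [{bar:<width$}] {done}/{total}: {}", in_flight.join(", "))
}
```

In batch mode (output redirected to a file or CI) the line would simply be omitted, since `\r` redraws turn into garbage in a log.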

tedinski commented 2 years ago

Just looked at cargo-nextest's output, here's an excerpt (of batch output, not interactive terminal):

    Starting 119 tests across 11 binaries
...
        LEAK [   0.107s]                      nextest-tests::basic test_subprocess_doesnt_exit
        PASS [   0.008s]          nextest-tests::bin/nextest-tests tests::bin_success
        PASS [   0.004s]                  nextest-tests::bin/other tests::other_bin_success
        PASS [   0.014s]                      nextest-tests::other other_test_success
        PASS [   0.007s]      nextest-tests::example/nextest-tests tests::example_success
        PASS [   0.005s]              nextest-tests::example/other tests::other_example_success
     SIGSEGV [   0.552s]                   nextest-tests::segfault test_segfault
[..SNIP failure output..]
------------
     Summary [  73.845s] 119 tests run: 115 passed, 4 failed, 0 skipped
        FAIL [   7.640s]               cargo-nextest tests_integration::test_relocated_run
        FAIL [   6.782s]               cargo-nextest tests_integration::test_run
        FAIL [   6.881s]               cargo-nextest tests_integration::test_run_after_build
        FAIL [   6.953s]               cargo-nextest tests_integration::test_run_from_archive

tedinski commented 2 years ago

When a run fails, we should ensure we conclude with a command that would re-run just the failing harnesses. This gives a tighter loop locally, and makes it easy to go from a CI failure to reproducing that failure locally.
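
As a sketch of that hint, the failing harness names could be folded into a copy-pastable command. This assumes Kani's existing `--harness` filter flag and that it can be repeated to select several harnesses; the function name and message wording are invented:

```rust
// Build a copy-pastable command that re-runs only the failed
// harnesses, for printing at the end of a failed run.
pub fn rerun_hint(failed: &[&str]) -> Option<String> {
    if failed.is_empty() {
        return None; // nothing failed, no hint needed
    }
    let flags: Vec<String> = failed
        .iter()
        .map(|h| format!("--harness {h}"))
        .collect();
    Some(format!(
        "To re-run just the failures: cargo kani {}",
        flags.join(" ")
    ))
}
```

Printing this last keeps it visible in a scrolled CI log, right next to the failure summary.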

celinval commented 2 years ago

I created #1899 to track consistent info/warn/error messages, which I believe is related to this.