move-language / move

Apache License 2.0
2.24k stars 675 forks source link

[Bug][move-prover] Undefined behavior provides issues with baseline tests #452

Open gregnazario opened 1 year ago

gregnazario commented 1 year ago

🐛 Bug

When running baseline tests locally, the loop tests provide a different output than the build servers. This causes there to be a lot of wasted time trying to figure out why these tests fail due to code changes, when in fact it has nothing to do with them.

To reproduce

Code snippet to reproduce

UB=1 cargo test

Stack trace/error message

https://github.com/move-language/move/runs/8222133907?check_suite_focus=true#step:6:3678

test prover unit[default]::functional/loops.move ... FAILED
Error: 
New output differs from baseline!
Call this test with env variable UPBL=1 to regenerate or remove old baseline files.
Then use your favorite changelist diff tool to verify you are good with the changes.

Or check the rudimentary diff below:

= Move prover returns: exiting with verification errors
= ... (101 lines)
=     =     at tests/sources/functional/loops.move:191: loop_with_two_back_edges_incorrect
-     =     at tests/sources/functional/loops.move:192: loop_with_two_back_edges_incorrect
-     =         y = <redacted>
-     =     at tests/sources/functional/loops.move:193: loop_with_two_back_edges_incorrect
+     =     at tests/sources/functional/loops.move:195: loop_with_two_back_edges_incorrect
+     =     at tests/sources/functional/loops.move:196: loop_with_two_back_edges_incorrect
+     =         x = <redacted>
+     =     at tests/sources/functional/loops.move:197: loop_with_two_back_edges_incorrect
=     =     at tests/sources/functional/loops.move:189: loop_with_two_back_edges_incorrect
= ... (64 lines)
=     =     at tests/sources/functional/loops.move:119: nested_loop_outer_invariant_incorrect
test prover unit[default]::functional/invariants_resources.move ... ok

Expected Behavior

The prover needs to be predictable or:

  1. Tests need to not be this specific in baselining
  2. These tests are changed accordingly to provide better signal

System information

Please complete the following information:

Additional context

Add any other context about the problem here.

wrwg commented 1 year ago

Thanks for reporting. There seems to be some new non-determinism upstream causing this, because we have already multiple measures in place to ensure output is deterministic.

You can work around this for now by adding pragma verify = false; to the spec loop_with_two_back_edges_incorrect block. Please also point via a TODO at the pragma to this issue.

junkil-park commented 1 year ago

Yeah, the test case can have two ways to fail for the inductive case. One is by assigning x, and another is assigning y. Perhaps, we need to update the test case so that its counterexample can be deterministic.

public fun loop_with_two_back_edges_incorrect(x: u64, y: u64) {
        spec {
            assume x != y;
        };
        loop {
            spec {
                invariant x != y;
            };
            if (x > y) {
                y = y + 1;
                continue
            };
            if (y > x) {
                x = x + 1;
                continue
            };
            break
        };
    }

@gregnazario can work around the issue by the way @wrwg describes.