dafny-lang / dafny

Dafny is a verification-aware programming language
https://dafny.org
Other
2.84k stars 254 forks source link

Verification ignores resource limits #5525

Open hmijail opened 4 weeks ago

hmijail commented 4 weeks ago

Dafny version

4.6.0+da8b84f362bf482160ae2604e4ae6d3c62231dae

Code to produce this issue

See attached file, which is the result of "preprocessing" a file in my repo [1] with

dafny resolve WrappedEther.dfy/src/weth_0_name.dfy --print preprocessed.dfy --print-mode Everything 

This is just for convenience of reporting this issue; the same behavior happens in the original file. 

[1] https://github.com/Consensys/WrappedEther.dfy/blob/69552a4c87ab752fdc90474b86f4ddf3236acab5/src/weth_0_name.dfy#L400

Command to run and resulting output

dafny verify preprocessed.dfy --resource-limit 10000000 --progress --filter-symbol block_0_0x0128 --isolate-assertions

What happened?

Verification of the given file/symbol seems to ignore resource limits and takes a very long time to time out, if at all. This makes it difficult to work with that file. I thought this could be a z3 bug, so I tried various z3 versions, but it's hard to see a pattern.

With --isolate-assertions --progress, the problem seems to happen in assertion 43/43. The behavior changes with the resource limit argument given.

  1. --resource-limit 1000000 : assertion 43/43 verifies successfully
  2. --resource-limit 4000000 : assertion 43/43 takes minutes to fail This happens with all versions of z3 I tried, from 4.8.5 to 4.13.

Similar behavior happens without --isolate-assertions:

  1. --resource-limit 1000000 : verification runs out of resources in ~5 seconds
  2. --resource-limit 4000000 : verification keeps going for over 70 seconds, but the reported time and resource count seem disproportionally small
  3. --resource-limit 10000000 : verification keeps going for over 200 seconds, still with reported time and resource count too small

However, by changing to z3 versions other than the default 4.12.1 (e.g. 4.13 and 4.8.5), the 2nd and 3rd case finish in ~5 seconds.

preprocessed.dfy.zip

What type of operating system are you experiencing the problem on?

Mac

atomb commented 4 weeks ago

I've seen behavior like this before, and I believe it's a Z3 behavior. My understanding is that, while Z3 is executing, certain operations cause the resource counter to be incremented but that other potentially very expensive operations can happen in between increments. If one of those operations takes a very long time, it can throw things off quite a bit. I believe the same thing happens with time limits: Z3 doesn't interrupt itself when a timer goes off, but instead does synchronous checks at certain key points and exits before a goal has been solved if the current time exceeds the limit. Because of the potential for expensive operations between checks, it may exceed the limit by a lot.

We've considered adding checks within Dafny that the reported time or resource count are actually below the specified limit, and marking things as timeouts if they aren't, though that has its own downsides (e.g., if something successfully proved, but took a little too long, maybe you'd still want to know that it's correct).

hmijail commented 4 weeks ago

My understanding is that, while Z3 is executing, certain operations cause the resource counter to be incremented but that other potentially very expensive operations can happen in between increments. If one of those operations takes a very long time, it can throw things off quite a bit.

That would make sense... but the time and resource costs reported at the end seem off. For example:

$  dafny verify preprocessed.dfy  --resource-limit 10000000 --progress --filter-symbol block_0_0x0128

...

Verified 0/1 symbols. Waiting for name.block_0_0x0128 to verify.
Verification part 1/2 of name.block_0_0x0128, on line 250, verified successfully (time: 0:00:00.2921759, resource count: 3529921)
Verification part 2/2 of name.block_0_0x0128, on line 250, could not prove all assertions (time: 0:03:31.5808713, resource count: 14056825)
...

(sidenote: dafny 4.6 had worse problems with the reporting, which seem fixed in the nightly: the time it reported for part 2 was less than 1 sec, even though stdout took the same 210 sec as the nightly)

Related questions:

Right now what I do is kill dafny if it runs for too long, but this is hard to automate because it's hard to say what is "too long", plus the log file ends up empty. So I retry with other random seed or z3 version until one run manages to finish by itself in a "reasonable" time. The problem being how to know what is a reasonable time, of course; so this is a subjective and manually intensive process. Is there a better option?

hmijail commented 3 weeks ago

I managed to isolate the combination of dafny args that make z3 run "forever", and extracted the z3 input with --solver-log.

So this is now reproducible with z3-4.13.0/bin/z3 1717648676-Impl__name.__default.block__0__0x0128_split20.smt rlimit=6900000

In contrast, reducing slightly the rlimit to 6700000 makes z3 finish in less than 1 sec. This also applies to previous z3 versions.

z3-4.12.1/bin/z3 1717648676-Impl__name.__default.block__0__0x0128_split20.smt rlimit=6700000
unknown
(:reason-unknown "canceled")
(:rlimit 6722629)

Is this useful? ... probably I should rather report this to the z3 people? (either as bug report or for advice on how to stop z3 when rlimit won't)

1717648676-Implname.default.block00x0128_split20.smt.zip

hmijail commented 3 weeks ago

I opened the corresponding Z3 issue here: https://github.com/Z3Prover/z3/issues/7247

hmijail commented 2 weeks ago

The z3 issue has been fixed, and when I use that z3 with a dafny nightly, the problem I reported in this issue is gone.

keyboardDrummer commented 2 weeks ago

Reopening because we should let Dafny use the upcoming Z3 with that fix, to resolve this problem on the Dafny side.