Open thpani opened 2 years ago
When calling Z3 through the API, it's internal statistics (z3solver.getStatistics
) show non-deterministic behavior in both :time
and :restarts
.
On 5 Apalache runs, Z3 :restarts
range from 26 to 41. :time
for the second-to-last SMT call ranges from 327 to 1811 seconds:
apalache run 1 | apalache run 2 | apalache run 3 | apalache run 4 | apalache run 5 |
---|
:time sat call 1 | 144.80 | 143.26 | 143.26 | 144.91 | 142.78 sat call 2 | 8.24 | 8.29 | 8.29 | 8.37 | 8.21 sat call 3 | 73.59 | 42.20 | 42.20 | 42.36 | 71.57 sat call 4 | 2.91 | 1.80 | 1.80 | 1.83 | 2.90 sat call 5 | 2.13 | 1.30 | 1.30 | 1.92 | 91.86 sat call 6 | 1811.14 | 417.34 | 417.34 | 327.39 | 784.24 sat call 7 | 12.74 | 10.71 | 10.71 | 7.41 | 8.93 :restarts sat call 1 | 10 | 10 | 10 | 10 | 10 sat call 2 | 10 | 10 | 10 | 10 | 10 sat call 3 | 14 | 13 | 13 | 13 | 14 sat call 4 | 15 | 13 | 13 | 13 | 15 sat call 5 | 16 | 13 | 13 | 13 | 17 sat call 6 | 29 | 25 | 25 | 25 | 40 sat call 7 | 29 | 26 | 26 | 26 | 41
OTOH, running Z3 from the shell on the dumped log0.smt
looks to be completely deterministic:
$ for i ($(seq 1 14)) ; do
z3 -st sat.random_seed=0 smt.random_seed=0 fp.spacer.random_seed=0 sls.random_seed=0 ~/log0.smt | egrep 'time|restarts'
done
:restarts 31
:time 11.35
:total-time 585.41)
:restarts 31
:time 11.16
:total-time 584.54)
:restarts 31
:time 11.10
:total-time 581.18)
:restarts 31
:time 11.08
:total-time 568.50)
:restarts 31
:time 11.10
:total-time 571.24)
z3 binary matches our z3-turnkey dependency version 4.8.17
.
Even the offline solver context suffers from this issue.
Also, looks like the random seeds are set by default, so I doubt that is the issue:
$ python3 -c "import z3; print(z3.get_param('smt.random_seed'))"
0
I verified that the SMT constraints produced (using Z3Solver.toString()
) are identical across different runs of Apalache.
Can it be that the memory layout of different runs affects some heuristics of z3? I guess, when you run it in CLI, you always get the same memory allocator. When z3 is run as a library inside JVM, malloc
is probably less deterministic.
I think it's down to tactics or something related.
AFAICT, the Z3 shell constructs the solver instance lazily only once it encounters a (sat)
, (push)
or (reset)
:
The API invokes the solver factory much earlier, as early as the first Z3_solver_assert()
:
Still present if we use mkSimpleSolver
instead of mkSolver
.
Seems like it's our constraint output after all! We sometimes permute disjunctions:
--- bench.smt 2022-08-31 08:51:17.000000000 +0200
+++ _apalache-out/Termination.tla/2022-09-02T15-49-04_14311296710629792029/log0.smt 2022-09-02 19:12:59.000000000 +0200
; [START] Caching equality constraints for a sequence: List(($C$14,$C$246), ($C$13,$C$246), ($C$12,$C$246), ($C$16,$C$246), ($C$11,$C$246), ($C$15,$C$246))
; [DONE] Caching equality constraints
-;; assert (IF (($C$243 = $C$215) ∨ ($C$243 = $C$209) ∨ ($C$243 = $C$233) ∨ ($C$243 = $C$239) ∨ ($C$243 = $C$221) ∨ ($C$243 = $C$227)) THEN ($C$246 = $C$14) ELSE TRUE) ∧ (IF (($C$243 = $C$220) ∨ ($C$243 = $C$208) ∨ ($C$243 = $C$214) ∨ ($C$243 = $C$226) ∨ ($C$243 = $C$238) ∨ ($C$243 = $C$232)) THEN ($C$246 = $C$13) ELSE TRUE) ∧ (IF (($C$243 = $C$225) ∨ ($C$243 = $C$219) ∨ ($C$243 = $C$231) ∨ ($C$243 = $C$207) ∨ ($C$243 = $C$213) ∨ ($C$243 = $C$237)) THEN ($C$246 = $C$12) ELSE TRUE) ∧ (IF (($C$243 = $C$235) ∨ ($C$243 = $C$241) ∨ ($C$243 = $C$223) ∨ ($C$243 = $C$229) ∨ ($C$243 = $C$217) ∨ ($C$243 = $C$211)) THEN ($C$246 = $C$16) ELSE TRUE) ∧ (IF (($C$243 = $C$206) ∨ ($C$243 = $C$236) ∨ ($C$243 = $C$230) ∨ ($C$243 = $C$218) ∨ ($C$243 = $C$212) ∨ ($C$243 = $C$224)) THEN ($C$246 = $C$11) ELSE TRUE) ∧ (IF (($C$243 = $C$240) ∨ ($C$243 = $C$216) ∨ ($C$243 = $C$222) ∨ ($C$243 = $C$228) ∨ ($C$243 = $C$210) ∨ ($C$243 = $C$234)) THEN ($C$246 = $C$15) ELSE TRUE)
-(assert (and (ite (or (= $C$243 $C$215)
+;; assert (IF (($C$243 = $C$233) ∨ ($C$243 = $C$209) ∨ ($C$243 = $C$239) ∨ ($C$243 = $C$215) ∨ ($C$243 = $C$227) ∨ ($C$243 = $C$221)) THEN ($C$246 = $C$14) ELSE TRUE) ∧ (IF (($C$243 = $C$232) ∨ ($C$243 = $C$220) ∨ ($C$243 = $C$208) ∨ ($C$243 = $C$226) ∨ ($C$243 = $C$238) ∨ ($C$243 = $C$214)) THEN ($C$246 = $C$13) ELSE TRUE) ∧ (IF (($C$243 = $C$237) ∨ ($C$243 = $C$219) ∨ ($C$243 = $C$207) ∨ ($C$243 = $C$213) ∨ ($C$243 = $C$225) ∨ ($C$243 = $C$231)) THEN ($C$246 = $C$12) ELSE TRUE) ∧ (IF (($C$243 = $C$217) ∨ ($C$243 = $C$229) ∨ ($C$243 = $C$241) ∨ ($C$243 = $C$235) ∨ ($C$243 = $C$223) ∨ ($C$243 = $C$211)) THEN ($C$246 = $C$16) ELSE TRUE) ∧ (IF (($C$243 = $C$218) ∨ ($C$243 = $C$212) ∨ ($C$243 = $C$236) ∨ ($C$243 = $C$224) ∨ ($C$243 = $C$230) ∨ ($C$243 = $C$206)) THEN ($C$246 = $C$11) ELSE TRUE) ∧ (IF (($C$243 = $C$228) ∨ ($C$243 = $C$222) ∨ ($C$243 = $C$240) ∨ ($C$243 = $C$216) ∨ ($C$243 = $C$234) ∨ ($C$243 = $C$210)) THEN ($C$246 = $C$15) ELSE TRUE)
+(assert (and (ite (or (= $C$243 $C$233)
(= $C$243 $C$209)
- (= $C$243 $C$233)
(= $C$243 $C$239)
- (= $C$243 $C$221)
- (= $C$243 $C$227))
+ (= $C$243 $C$215)
+ (= $C$243 $C$227)
+ (= $C$243 $C$221))
(= $C$246 $C$14)
true)
- (ite (or (= $C$243 $C$220)
+ (ite (or (= $C$243 $C$232)
+ (= $C$243 $C$220)
(= $C$243 $C$208)
- (= $C$243 $C$214)
(= $C$243 $C$226)
(= $C$243 $C$238)
- (= $C$243 $C$232))
+ (= $C$243 $C$214))
(= $C$246 $C$13)
true)
So far I've only seen this happen with Caching equality constraints for a sequence: List(...)
.
Wow. Good find!
Also, latest z3 does not seem to have this issue! (at least on this particular spec) I've requested a turnkey version bump here: https://github.com/tudo-aqua/z3-turnkey/issues/14
$ time z3-4.8.17-arm64-osx-10.16/bin/z3 -st sat.random_seed=0 smt.random_seed=0 fp.spacer.random_seed=0 sls.random_seed=0 bench.smt
577.04s user 2.61s system 99% cpu 9:40.59 total
$ time z3-4.8.17-arm64-osx-10.16/bin/z3 -st sat.random_seed=0 smt.random_seed=0 fp.spacer.random_seed=0 sls.random_seed=0 _apalache-out/Termination.tla/2022-09-02T15-49-04_14311296710629792029/log0.smt
2516.66s user 8.20s system 99% cpu 42:08.02 total # <---- !!!!
$ time z3-4.11.0-arm64-osx-10.16/bin/z3 -st sat.random_seed=0 smt.random_seed=0 fp.spacer.random_seed=0 sls.random_seed=0 bench.smt
696.23s user 2.63s system 99% cpu 11:40.47 total
$ time z3-4.11.0-arm64-osx-10.16/bin/z3 -st sat.random_seed=0 smt.random_seed=0 fp.spacer.random_seed=0 sls.random_seed=0 _apalache-out/Termination.tla/2022-09-02T15-49-04_14311296710629792029/log0.smt
510.98s user 2.01s system 99% cpu 8:34.06 total
This is amazing! We should probably learn from this how to prioritize constraints to improve performance!
Do you have a complete log? I think it is something right after the equality constraints were cached.
10:46 11:45 13:03 15:16 15:43 16:00 16:25 16:43 17:44 18:05 20:43 21:26 31:47 32:39
I will test whether it's fixed by just bumping Z3 as indicated in https://github.com/informalsystems/apalache/issues/2120#issuecomment-1235771144.
Still present with #2149 and Z3 bumped to the lastest release. I have a few remaining ideas to test:
Solver.toString()
to ensure they are not jumbled internally. ensured they're equivalent--debug
, which is IO heavy? nopez3 -pd
. still all over the placeFixing ZipOracle
(#2149) made the offline solver deterministic.
So the remaining issue must be in the incremental solver.
There's a few Z3 parameters for this in the combined_solver
module:
combined_solver.ignore_solver1=true
in offline mode. still deterministicQF_IDL
didn't help.I'm out of ideas, so I will leave this be for the moment. Maybe it can be fixed by using assumptions (#2137).
I thought #2569 might've fixed this, but it's still present.
When running @nano-o's spec for 6 processes, I noticed considerable fluctuations (up to 4x slowdown) in Apalache runtime.
Command:
Spec diff for 6 processes: