apalache-mc / apalache

APALACHE: symbolic model checker for TLA+ and Quint
https://apalache-mc.org/
Apache License 2.0
441 stars 40 forks source link

Non-deterministic solver runtimes #2120

Open thpani opened 2 years ago

thpani commented 2 years ago

When running @nano-o's spec for 6 processes, I noticed considerable fluctuations (up to 4x slowdown) in Apalache runtime.

Command:

apalache-mc check --run-dir=out --debug --init=Correctness_ --inv=Correctness --length=1 ~/src/Distributed-termination-detection/Termination.tla

Spec diff for 6 processes:

diff --git a/Termination.tla b/Termination.tla
index 5ebcf77..28f6a20 100644
--- a/Termination.tla
+++ b/Termination.tla
@@ -48,8 +48,8 @@ EXTENDS Integers, FiniteSets, Apalache, Sequences

 \* @type: Set(P);
 \* P == {"P1_OF_P", "P2_OF_P", "P3_OF_P"}
-P == {"P1_OF_P", "P2_OF_P", "P3_OF_P", "P4_OF_P"}
-\* P == {"P1_OF_P", "P2_OF_P", "P3_OF_P", "P4_OF_P", "P5_OF_P"}
+\* P == {"P1_OF_P", "P2_OF_P", "P3_OF_P", "P4_OF_P"}
+P == {"P1_OF_P", "P2_OF_P", "P3_OF_P", "P4_OF_P", "P5_OF_P", "P6_OF_P"}
thpani commented 2 years ago

When calling Z3 through the API, it's internal statistics (z3solver.getStatistics) show non-deterministic behavior in both :time and :restarts. On 5 Apalache runs, Z3 :restarts range from 26 to 41. :time for the second-to-last SMT call ranges from 327 to 1811 seconds:

  apalache run 1 apalache run 2 apalache run 3 apalache run 4 apalache run 5

:time sat call 1 | 144.80 | 143.26 | 143.26 | 144.91 | 142.78 sat call 2 | 8.24 | 8.29 | 8.29 | 8.37 | 8.21 sat call 3 | 73.59 | 42.20 | 42.20 | 42.36 | 71.57 sat call 4 | 2.91 | 1.80 | 1.80 | 1.83 | 2.90 sat call 5 | 2.13 | 1.30 | 1.30 | 1.92 | 91.86 sat call 6 | 1811.14 | 417.34 | 417.34 | 327.39 | 784.24 sat call 7 | 12.74 | 10.71 | 10.71 | 7.41 | 8.93 :restarts sat call 1 | 10 | 10 | 10 | 10 | 10 sat call 2 | 10 | 10 | 10 | 10 | 10 sat call 3 | 14 | 13 | 13 | 13 | 14 sat call 4 | 15 | 13 | 13 | 13 | 15 sat call 5 | 16 | 13 | 13 | 13 | 17 sat call 6 | 29 | 25 | 25 | 25 | 40 sat call 7 | 29 | 26 | 26 | 26 | 41

thpani commented 2 years ago

OTOH, running Z3 from the shell on the dumped log0.smt looks to be completely deterministic:

$ for i ($(seq 1 14)) ; do
  z3 -st sat.random_seed=0 smt.random_seed=0 fp.spacer.random_seed=0 sls.random_seed=0 ~/log0.smt | egrep 'time|restarts'
done

 :restarts                    31
 :time                        11.35
 :total-time                  585.41)
 :restarts                    31
 :time                        11.16
 :total-time                  584.54)
 :restarts                    31
 :time                        11.10
 :total-time                  581.18)
 :restarts                    31
 :time                        11.08
 :total-time                  568.50)
 :restarts                    31
 :time                        11.10
 :total-time                  571.24)

z3 binary matches our z3-turnkey dependency version 4.8.17.

thpani commented 2 years ago

Even the offline solver context suffers from this issue.

Also, looks like the random seeds are set by default, so I doubt that is the issue:

$ python3 -c "import z3; print(z3.get_param('smt.random_seed'))"
0
thpani commented 2 years ago

I verified that the SMT constraints produced (using Z3Solver.toString()) are identical across different runs of Apalache.

konnov commented 2 years ago

Can it be that the memory layout of different runs affects some heuristics of z3? I guess, when you run it in CLI, you always get the same memory allocator. When z3 is run as a library inside JVM, malloc is probably less deterministic.

thpani commented 2 years ago

I think it's down to tactics or something related. AFAICT, the Z3 shell constructs the solver instance lazily only once it encounters a (sat), (push) or (reset):

https://github.com/Z3Prover/z3/blob/e2f4fc23076032935daa85e305d0453dc1470d73/src/cmd_context/cmd_context.cpp#L787-L788

The API invokes the solver factory much earlier, as early as the first Z3_solver_assert():

https://github.com/Z3Prover/z3/blob/78eaefe5a8be7b24761ad829682c603865458a66/src/api/api_solver.cpp#L473-L477

https://github.com/Z3Prover/z3/blob/78eaefe5a8be7b24761ad829682c603865458a66/src/api/api_solver.cpp#L142-L147

thpani commented 2 years ago

Still present if we use mkSimpleSolver instead of mkSolver.

thpani commented 2 years ago

Seems like it's our constraint output after all! We sometimes permute disjunctions:

--- bench.smt   2022-08-31 08:51:17.000000000 +0200
+++ _apalache-out/Termination.tla/2022-09-02T15-49-04_14311296710629792029/log0.smt 2022-09-02 19:12:59.000000000 +0200
 ; [START] Caching equality constraints for a sequence: List(($C$14,$C$246), ($C$13,$C$246), ($C$12,$C$246), ($C$16,$C$246), ($C$11,$C$246), ($C$15,$C$246))
 ; [DONE]  Caching equality constraints
-;; assert (IF (($C$243 = $C$215) ∨ ($C$243 = $C$209) ∨ ($C$243 = $C$233) ∨ ($C$243 = $C$239) ∨ ($C$243 = $C$221) ∨ ($C$243 = $C$227)) THEN ($C$246 = $C$14) ELSE TRUE) ∧ (IF (($C$243 = $C$220) ∨ ($C$243 = $C$208) ∨ ($C$243 = $C$214) ∨ ($C$243 = $C$226) ∨ ($C$243 = $C$238) ∨ ($C$243 = $C$232)) THEN ($C$246 = $C$13) ELSE TRUE) ∧ (IF (($C$243 = $C$225) ∨ ($C$243 = $C$219) ∨ ($C$243 = $C$231) ∨ ($C$243 = $C$207) ∨ ($C$243 = $C$213) ∨ ($C$243 = $C$237)) THEN ($C$246 = $C$12) ELSE TRUE) ∧ (IF (($C$243 = $C$235) ∨ ($C$243 = $C$241) ∨ ($C$243 = $C$223) ∨ ($C$243 = $C$229) ∨ ($C$243 = $C$217) ∨ ($C$243 = $C$211)) THEN ($C$246 = $C$16) ELSE TRUE) ∧ (IF (($C$243 = $C$206) ∨ ($C$243 = $C$236) ∨ ($C$243 = $C$230) ∨ ($C$243 = $C$218) ∨ ($C$243 = $C$212) ∨ ($C$243 = $C$224)) THEN ($C$246 = $C$11) ELSE TRUE) ∧ (IF (($C$243 = $C$240) ∨ ($C$243 = $C$216) ∨ ($C$243 = $C$222) ∨ ($C$243 = $C$228) ∨ ($C$243 = $C$210) ∨ ($C$243 = $C$234)) THEN ($C$246 = $C$15) ELSE TRUE)
-(assert (and (ite (or (= $C$243 $C$215)
+;; assert (IF (($C$243 = $C$233) ∨ ($C$243 = $C$209) ∨ ($C$243 = $C$239) ∨ ($C$243 = $C$215) ∨ ($C$243 = $C$227) ∨ ($C$243 = $C$221)) THEN ($C$246 = $C$14) ELSE TRUE) ∧ (IF (($C$243 = $C$232) ∨ ($C$243 = $C$220) ∨ ($C$243 = $C$208) ∨ ($C$243 = $C$226) ∨ ($C$243 = $C$238) ∨ ($C$243 = $C$214)) THEN ($C$246 = $C$13) ELSE TRUE) ∧ (IF (($C$243 = $C$237) ∨ ($C$243 = $C$219) ∨ ($C$243 = $C$207) ∨ ($C$243 = $C$213) ∨ ($C$243 = $C$225) ∨ ($C$243 = $C$231)) THEN ($C$246 = $C$12) ELSE TRUE) ∧ (IF (($C$243 = $C$217) ∨ ($C$243 = $C$229) ∨ ($C$243 = $C$241) ∨ ($C$243 = $C$235) ∨ ($C$243 = $C$223) ∨ ($C$243 = $C$211)) THEN ($C$246 = $C$16) ELSE TRUE) ∧ (IF (($C$243 = $C$218) ∨ ($C$243 = $C$212) ∨ ($C$243 = $C$236) ∨ ($C$243 = $C$224) ∨ ($C$243 = $C$230) ∨ ($C$243 = $C$206)) THEN ($C$246 = $C$11) ELSE TRUE) ∧ (IF (($C$243 = $C$228) ∨ ($C$243 = $C$222) ∨ ($C$243 = $C$240) ∨ ($C$243 = $C$216) ∨ ($C$243 = $C$234) ∨ ($C$243 = $C$210)) THEN ($C$246 = $C$15) ELSE TRUE)
+(assert (and (ite (or (= $C$243 $C$233)
               (= $C$243 $C$209)
-              (= $C$243 $C$233)
               (= $C$243 $C$239)
-              (= $C$243 $C$221)
-              (= $C$243 $C$227))
+              (= $C$243 $C$215)
+              (= $C$243 $C$227)
+              (= $C$243 $C$221))
           (= $C$246 $C$14)
           true)
-     (ite (or (= $C$243 $C$220)
+     (ite (or (= $C$243 $C$232)
+              (= $C$243 $C$220)
               (= $C$243 $C$208)
-              (= $C$243 $C$214)
               (= $C$243 $C$226)
               (= $C$243 $C$238)
-              (= $C$243 $C$232))
+              (= $C$243 $C$214))
           (= $C$246 $C$13)
           true)

So far I've only seen this happen with Caching equality constraints for a sequence: List(...).

shonfeder commented 2 years ago

Wow. Good find!

thpani commented 2 years ago

Also, latest z3 does not seem to have this issue! (at least on this particular spec) I've requested a turnkey version bump here: https://github.com/tudo-aqua/z3-turnkey/issues/14

$ time z3-4.8.17-arm64-osx-10.16/bin/z3 -st sat.random_seed=0 smt.random_seed=0 fp.spacer.random_seed=0 sls.random_seed=0 bench.smt
577.04s user 2.61s system 99% cpu 9:40.59 total

$ time z3-4.8.17-arm64-osx-10.16/bin/z3 -st sat.random_seed=0 smt.random_seed=0 fp.spacer.random_seed=0 sls.random_seed=0 _apalache-out/Termination.tla/2022-09-02T15-49-04_14311296710629792029/log0.smt
2516.66s user 8.20s system 99% cpu 42:08.02 total  # <---- !!!!

$ time z3-4.11.0-arm64-osx-10.16/bin/z3 -st sat.random_seed=0 smt.random_seed=0 fp.spacer.random_seed=0 sls.random_seed=0 bench.smt
696.23s user 2.63s system 99% cpu 11:40.47 total

$ time z3-4.11.0-arm64-osx-10.16/bin/z3 -st sat.random_seed=0 smt.random_seed=0 fp.spacer.random_seed=0 sls.random_seed=0 _apalache-out/Termination.tla/2022-09-02T15-49-04_14311296710629792029/log0.smt
510.98s user 2.01s system 99% cpu 8:34.06 total
konnov commented 2 years ago

This is amazing! We should probably learn from this how to prioritize constraints to improve performance!

konnov commented 2 years ago

Do you have a complete log? I think it is something right after the equality constraints were cached.

thpani commented 2 years ago

Likely caused by this line:

https://github.com/informalsystems/apalache/blob/a2e967e31684f958af0492a5387f21c6abac1b11/tla-bmcmt/src/main/scala/at/forsyte/apalache/tla/bmcmt/rules/aux/CherryPick.scala#L129

thpani commented 2 years ago

2149 resolves the non-deterministic SMT output, but there's still quite some spread in overall runtime:

10:46 11:45 13:03 15:16 15:43 16:00 16:25 16:43 17:44 18:05 20:43 21:26 31:47 32:39

I will test whether it's fixed by just bumping Z3 as indicated in https://github.com/informalsystems/apalache/issues/2120#issuecomment-1235771144.

thpani commented 2 years ago

Still present with #2149 and Z3 bumped to the lastest release. I have a few remaining ideas to test:

thpani commented 2 years ago

Fixing ZipOracle (#2149) made the offline solver deterministic. So the remaining issue must be in the incremental solver.

There's a few Z3 parameters for this in the combined_solver module:

thpani commented 2 years ago

I'm out of ideas, so I will leave this be for the moment. Maybe it can be fixed by using assumptions (#2137).

thpani commented 1 year ago

I thought #2569 might've fixed this, but it's still present.