z3py: unknown returned for a 16 bitwidth bitvector formula with unrolled loop

harishankarv commented 3 years ago

I'm trying to figure out why the z3 solver returns an unknown for a particular formula.

The formula is representative of a certain piece of code with a loop which does "abstract" multiplication (which is inspired by the long multiplication algorithm).

To check if my abstract algorithm works correctly (passes my verification condition) I encode the negation of the condition to get a counterexample from z3. To encode the formula in z3 understandably involves unrolling the loop. I set the timeout to 0 (i.e. no timeout), and use a SolverFor("QF_BV"). I am using 40 threads with parallel.enable.

While I do get unsat for a bitvector width of 2 through 14, when I choose a bitvector of width 16, z3 gives me an unknown in about 11 hours (this time varies).

Calling .reason_unknown() on the solver, I get: incomplete

Calling .statistics() on the solver gives the following output:

(:eliminated-vars                  1034
 :lh-add-binary                    140371855
 :lh-bca                           23814396
 :lh-bool-var                      2231207637
 :lh-cube-backtracks               105014
 :lh-cube-conflicts                35005
 :lh-cube-cutoffs                  70009
 :lh-double-lookahead-propagations 2528053
 :lh-double-lookahead-rounds       1241284
 :lh-propagations                  158526025
 :lh-windfalls                     26038702
 :max-memory                       44090.71
 :memory                           177.62
 :num-allocs                       27798866265032752.00
 :par-progress                     100.00
 :par-unsat                        34999
 :rlimit-count                     764946247
 :sat-ate                          222268
 :sat-backjumps                    3002301117
 :sat-backtracks                   147
 :sat-bce                          104728
 :sat-conflicts                    3002336240
 :sat-decisions                    3454552584
 :sat-del-clause                   3371028754
 :sat-elim-bool-vars-res           1361521
 :sat-elim-clauses                 1059679366
 :sat-elim-literals                1080799650
 :sat-gc-clause                    1918637627
 :sat-minimized-lits               858977833
 :sat-mk-clause-2ary               40013037
 :sat-mk-clause-3ary               97306887
 :sat-mk-clause-nary               3779658115
 :sat-mk-var                       556299461
 :sat-probing-assigned             122001
 :sat-propagations-2ary            3293681175
 :sat-propagations-3ary            2950499967
 :sat-propagations-nary            3439887739
 :sat-restarts                     371680015
 :sat-scc-elim-binary              3427928
 :sat-scc-elim-vars                1188920
 :sat-subs-resolution              196847601
 :sat-subs-resolution-dyn          1928526157
 :sat-subsumed                     252105065
 :sat-tr                           1131060
 :sat-units                        47282317
 :time                             42930.13)

What is this supposed to mean, and how am I supposed to interpret this? Does z3 return unknown because the solver thinks it will take too long to solve? Or is it because z3 can't even solve it, maybe because the theory isn't expressive enough? Or is it some other reason?

Do let me know if you need any specifics (about the code or the z3 encoding). I also have a run with high verbosity, and can link to a pastebin if needed.

NikolajBjorner commented 3 years ago

There is probably something amiss with how resource bounds are handled in parallel mode. It could be that some resource bounds are applied in some phases and the parallel tactic incorrectly handles an exhaustion somewhere. Given that it takes 11 hours to get to these places, it isn't a simple matter to set up and try a few pokes. The relevant code should be Line 290 in parallel_tactic.cpp. It looks for either sat.giveup or incomplete error returns from the SAT solver. The SAT solver provides sat.giveup only when it has external solvers, which doesn't apply to you. The inc_sat_solver.cpp in line 220 may also set this error state on an exception. It is more likely that your example is exercising this code-path for a case where the solver ran out of virtual memory. The parallel tactic masks the error message by default. I now pushed some small updates to some of the code paths mentioned above. Under the hypothesis that the SAT solver contains some useful information that gets filtered it now adds this information to verbose output. It seems some more improvements to diagnostics is called for, but I hope this would answer whether it is a virtual memory issue.

NikolajBjorner commented 3 years ago

Note also that "parallel" mode has many knobs that can be tuned. I have often observed that the default settings provide inferior experience to pure sequential mode.

harishankarv commented 3 years ago

@NikolajBjorner Thanks so much for getting back and also for the pushing those updates. Like you say it isn't easy to try parallel mode with all the different knobs in my case since this take 11 hours (at the least, a recent run returned unknown in 25 hours).

Also, in my case, the verbose output (verbose=10) produces a log that is really large (~3GB), and is difficult to peruse. I took a look at the source and see that possibly cube simplifications exceeded could be one of the possible reasons for an unknown, but apart from that I'm not sure what to look for in the large log file. Any pointers in that direction would be helpful.

I can try out sequential mode instead, but the solving time could be significantly larger due to loss of parallel mode speedup.

NikolajBjorner commented 3 years ago

Anyhow, the updated binaries have more informed error reporting. You should get a more descriptive reason for unknown next time. I hold it likely to be due to memory pressure.

Z3Prover / z3

z3py: unknown returned for a 16 bitwidth bitvector formula with unrolled loop #5009