Closed LeventErkok closed 1 year ago
Note: at this point tactic.default_tactic=sat sat.euf=true smt.ematching=false works best.
I can confirm that these settings make it go really fast; in fact, significantly faster than the May 29 2021 version of z3.
Feel free to close the ticket as I can use this workaround for this particular case. Thanks.
I don't have setup with me for bisecting where the regression entered. It could be a preprocessing issue, which degrades solvability. In particular, MBQI is required to solve this problem and it is extremely sensitive to syntax for the bit-vector case.
@NikolajBjorner
Unfortunately the tactic you suggested:
tactic.default_tactic=sat sat.euf=true smt.ematching=false
no longer works on this example, nor a direct call to z3 without any options can handle it. Looks like other performance regressions crept in. FYI.
It appears related to a rewrite for s6 * (s0 & s2). Removing that newer rewrite, setting ematching=false, and using the default solver (don't set the other options) it finds a solution within a second with only a few instantiations. Why one method finds the instances and the other doesn't remains to be investigated. It might be sensitive to the values that the bit-vector solver finds for s0 s6 when peforming MBQI.
Yes! I can confirm a new-build with the suggested setting works fine on this example.
I currently don't think this example is suitable for regression testing: it is really brittle what works and what doesn't work. I used the example to test the new core. It definitely helped finding bugs, but also the new core didn't work out of the box (quickly, at least). With some tweaks I get it to work (after disabling elimination of unconstrained variables as a temporary code change, and) by setting sat.phase=always_false. The legacy SMT solver uses a default phase of literals to false, which is optimal for this example. The difference is stark depending on the phase default: it either immediately finds a solution or cycles through a lot of instantiations. There are reasons why the SMT core uses default phase = false, which is related to minimizing congruences, etc. So, it could be good takeaway from this example to use this heuristic.
The example appears to be a synthesis benchmark. You are synthesizing values for s0, s1 that satisfy some global bit-vector property. The strategy that works best is by minimizing s0, s1, and then checking whether the current minimal values for these bit-vectors work for the predicate. Z3 has no guidance along this direction built-in and I am not sure how such guidance would be learned automatically. Maybe something to inspire phase selection heuristics for quantified formulas, but for a regression test I suspect this will toggle between immediate and looong based on butterflies.
for now, reopen this bug for the next few regressions as they will likely point to also real bugs as was the case this time.
Note
(set-option :sat.smt true)
(set-option :smt.ematching false)
(set-option :sat.phase always_false)
(set-option :smt.elim_unconstrained false)
(push)
(define-fun s4 () (_ BitVec 1) #b0)
(declare-fun s0 () (_ BitVec 64))
(declare-fun s1 () (_ BitVec 64))
(assert (forall ((s2 (_ BitVec 64)))
(let ((s3 ((_ extract 7 7) s2)))
(let ((s5 (distinct s3 s4)))
goes through. disable phase or elim-unconstrained and it goes for a hike.
Thanks Nikolaj..
For reference, this is indeed a synthesis problem. The idea is to take a 64-bit vector in the following shape:
1.......2.......3.......4.......5.......6.......7.......8.......
and turn it into another 64-bit value, that looks like this:
12345678........................................................
i.e., shift the numbered positions to the top part. (The dotted positions are ignored, and can result in anything.). The
problem is to do this with one mask and one multiplication; s0
is the mask and s1
is the multiplier.
At first look, this looks like a frivilous thing to do, but it has real applications in game playing (chess engines use this to shuffle positions quickly apparently), and fast implementations of scatter-gather operations in HPC, where you have a mask and you want to collect certain bits in the highest byte, so it's a mask transformer.
A good indicator for why you want to search for smallest bit-weights among s0, s1 is because they appear in positions where they contribute with scrambling the output of s1*(s2 & s0). If you have a family of such problems finding the preferred polarity of the bits of the inputs first and then rolling a home-made MBQI outside of z3 that uses local optimization of max-sat should be superior.
I used this as a sample for digging into details: https://z3prover.github.io/slides/z3internals.pptx and https://microsoft.github.io/z3guide/programming/Example%20Programs/MBQI
The benchmark below was solved in about 20 seconds, using z3 compiled on May 29 2021 from GitHub master, producing the correct result:
Unfortunately, running it with z3 compiled on Oct 25 2021 does not terminate in 10 minutes. Looks like a big performance hit for this particular benchmark.