Closed dbueno closed 5 years ago
possibly doable for UF and AUFBV like formulas: The model object may be fed a cache comprising of the current literals. Z3 uses "model transformers" to deal with preprocessing transformations and hiding internal variables. For some transformations the cache would become invalid, but not too likely in the EUF case. It still requires some evaluation to understand whether this approach is going to work.
The alternative is for the consumer of models to build a custom eval that understands how to do something better than the linear scan of function tables that happens for n-ary functions and predicates.
A significant limitation of just caching solver state is that bit-vectors and arithmetic terms are rewritten during pre-processing so the atomic formulas maintained by the solver are unrelated to the input.
To narrow in on what is required, could you upload a representative example in smtlib2 format using eval/evaluate?
Thanks for following up. I admit I'm not sure what action to take based on your first comment. I created an example C++ program and an SMT2 file that shows the behavior I'm trying to avoid in eval(). In the gist there are two files: z3eval_slow.cc and z3eval_slow.smt2 https://gist.github.com/dbueno/70d13750d25a1fbb71b687c65fedb5eb. After compiling, just run it on the smt2 file.
While I was working on this example, I figured out a temporary workaround (guarded by #ifndef FAST
in the file). However, even with that I don't know how to explain why it's so much faster, since it only avoids about 1/3 of the evals but the runtime is 7s vs originally 2m30s on my machine.
Let me know if there's anything else you need or I can explain what my test is doing.
A bottom-up labeling of the formula should be faster than your top-down evaluation. The built-in evaluator uses very few shortcuts and it has a bottleneck on SGT_32_32 and similar functions. I don't think I will have bandwidth to get over the compilation issues so I suggest some approaches:
I've updated the files for this issue: https://gist.github.com/dbueno/70d13750d25a1fbb71b687c65fedb5eb
My delay was due to finding an interesting workaround for the time being and doing further investigation. I updated the example so it should compile under the exact same compiler flags that z3 uses (I was using C++17 features last time and now I'm only using C++11).
bool_rewriter:mk_flat_and_core(unsigned, expr *, obj_ref<expr, ast_manager>&)
. You can look the profile screenshot here (https://gist.github.com/dbueno/70d13750d25a1fbb71b687c65fedb5eb#file-profile-png). I believe this method is flattening many 2-arg ANDs that are nested into a single AND with >10k args; in the process, it's creating many unnecessary, intermediate ANDs. I haven't tested this hypothesis, but I ran into a similar problem when I wrote my own AND-flattener.To see the behavior I saw, run the cc file on the smt2 file. To see the behavior go away, add a second argument "fast" to the cc file. This changes one line in my depth-first traversal of the formula and then the evaluation will terminate after a few seconds. The line does one thing: when it is traversing the args of a true AND (resp. false OR), it avoids calls eval() on the children of these nodes, since it’s known that all the children of a true AND must be true (resp. children of a false OR must be false).
I would still like to optimize this process so let me know what else I can provide.
Your analysis seems right: the evaluator asks the simplifier to flatten AC operators (and, or, +, *, bvadd, bvmul). This seems not well-motivated for how it is used from the API. Changing this behavior could break something. An experiment is to change https://github.com/Z3Prover/z3/blob/master/src/model/model_evaluator.cpp#L79 to assign flat to false instead of true.
The more systematic solution is to write custom bottom-up evaluator that does a linear pass on the expressions to label them with values instead of rewriting them to normal form.
I changed flat to false and the runtime is shorter. It takes a little over 2 minutes even though solving is practically instantaneous. Here is the new profile: https://gist.github.com/dbueno/70d13750d25a1fbb71b687c65fedb5eb#file-profile2-png
All the time is in rewriter_tpl::process_app
Does this mean the bottom up evaluator is the way to go?
yes
Problem
I'm writing a model checker. On very large formulas, my bottleneck is
eval()
ing formulas in Z3. Evaluating Boolean formulas takes too much time doing rewriting, even with cached rewrites. I'd like to access the Boolean assignment from the solver directly.The time
eval()
takes is critical because I query the model for every Boolean subformula of the input formula. I don't need to simplify the formula or to complete the model; I just want to know, is the assignment to a subformula true, false, or irrelevant.Context
I use Z3 4.8.5 as a backend solver. I'm using the C++ API. I configure each solver with auto-config=false, using either theory
QF_UF
orQF_AUFBV
. I'm usingpush()
andpop()
as well as solving under assumptions.Attempted solution
I tried modifying the formulas sent to Z3 by adding Tseitin-like variables. I introduced, for a given subformula
f
, a fresh variablet
and the constraintt = f
. The goal of this was to calleval(t)
whenevereval(f)
was demanded, obviating Z3's need to rewrite anything, sincet
is just a variable.Unfortunately, this solution perturbed the models found during the many internal checking queries; in all cases I tested, the model checker's performance was significantly worse. There was no obvious reason for the models to be different.
Solution
Is it possible to expose the Boolean assignment to the end user?
The
smt_context
has a method,get_assignment
. If I were to try, I'd expose this method throughsmt::kernel
,smt_solver
,solver
, all the way through z3's public c++ interface. Presumable, then, I'd be able to callget_assignment
on Booleanz3::expr
nodes and access their assignments quickly.My questions are: is this a good proposed solution? Is it going to work even after all the pre-processing that Z3 does on the input formula? Will it work for theories other than
QF_UF
andQF_AUFBV
? Is there another way I should consider?Thank you for reading.