sosy-lab / java-smt

JavaSMT - Unified Java API for SMT solvers.
Apache License 2.0
186 stars 46 forks source link

Closing Z3 context consumes a lot of time #236

Closed hernanponcedeleon closed 3 years ago

hernanponcedeleon commented 3 years ago

I'm trying to replace the use of z3 API with JavaSMT in the dartagnan tool. Most of the integration is looking good so far but we just noticed some performance problems when using Solvers.Z3 as the underlying solver in MacOS.

Calling SolverContext.close() is extremely slow for some benchmarks, e.g. for a formula where the solving time is around 9 secs, closing the context takes more than 20 secs.

This is surprisingly since the concrete implementation of close() in Z3SolverContext seems to call Native.delContext(context) which is the same the Java bindings for z3 does.

Any idea what the reason might be? Any further information I could provide to find the root of this?

kfriedberger commented 3 years ago

The Z3SolverContext does a little bit more than just closing the context:

Can you use a Java profiler and locate the exact section where the time is spent? Maybe even a simple command like jstack provides enough info here, when executed in the moment of closing, maybe even several times in the given 20s interval.

ThomasHaas commented 3 years ago

I'm also working on Dartagnan and I noticed that the call to Native.delContext(context) is the one that takes time. I just used the IntelliJ Debugger to walk through the code step-by-step and each call performed in close() is done almost instantly except the Native.delContext(context) which can take upwards of 2 minutes to complete.

hernanponcedeleon commented 3 years ago

In my case is also Native.delContext(context) the one consuming the time. About the log ... I'm using a default configuration

Configuration config = Configuration.defaultConfiguration();
SolverContext ctx = SolverContextFactory.createSolverContext(
    config, 
    BasicLogManager.create(config), 
    ShutdownManager.create().getNotifier(), 
    Solvers.Z3); 

but I assume Native.closeLog() is the method that takes care of writing/flushing the log and this method terminates instantly.

hernanponcedeleon commented 3 years ago

Small updated ... I manage to make the solvers work also in Linux. Closing the context for z3 also consumes a lot of time here (for certain benchmarks). The same happens for mathsat4. However closing the context for cvc4 is quite fast.

kfriedberger commented 3 years ago

The symptoms are very vague and hard to reproduce without further information. Could you describe your use-case:

ThomasHaas commented 3 years ago

One of our benchmarks has around 20 seconds solving time and takes at least 4 minutes to close. That example has:

This exactly mimics the way we used Z3 before. We replaced all calls to Z3 with the respective calls to JavaSMT.

As far as I can tell, this problem seems to appear in all our medium-sized and larger benchmarks.

hernanponcedeleon commented 3 years ago

Would it help if we generate the SMT2 file out of the encoding?

kfriedberger commented 3 years ago

Maybe it might be helpful. The problem might only be caused when using the binary API instead of SMTLIB2.

hernanponcedeleon commented 3 years ago

Btw ... When googling if people have experienced this directly when using Z3, I found this which mentions @PhilippWendler ... Maybe he has some idea

kfriedberger commented 3 years ago

I will also work on tests for #237. Within a few minutes, I found performance issues in Princess and SMTInterpol. :-)

PhilippWendler commented 3 years ago

Btw ... When googling if people have experienced this directly when using Z3, I found this which mentions @PhilippWendler ... Maybe he has some idea

@cheshire was JavaSMT maintainer back then. The reason why he mentioned me in this comment was that two days before I wrote the following to him via mail:

since recently, I noticed that when using Z3 CPAchecker very often does not shutdown nicely as it should (in at most a few seconds).

Instead, it hangs or does a lot of work in Z3SolverBasedProver.close, which calls Z3SolverBasedProver.pop.

I suspect this might be because of the recent changes you asked the Z3 people to make?

However, @cheshire replied that at the time I wrote this, the Z3 version in JavaSMT did not even include the two commits from https://github.com/Z3Prover/z3/issues/679 yet. (On 2016-07-12, 08c4f488a75acc7f1d71a6feb1c5f19d2db78948 upgraded to z3-4.4.1-1558-gf96cfea and only on 2016-07-22 f3f8d567ee1e5dabcdd933e1ecb5c41290b2f5c3 upgraded to z3-4.4.1-1624-g82d0310. The commits in question are z3-4.4.1-1569-gb080e3a21 and z3-4.4.1-1570-g4720d578a.) Furthermore, he indicated that https://github.com/Z3Prover/z3/issues/679 was about closing the context and my problems have been with closing the prover.

So my problems back then might not have been related to https://github.com/Z3Prover/z3/issues/679, but of course it could be that the current problems are.

Regarding the discussion https://github.com/Z3Prover/z3/issues/679 I don't know more than that.

There have been quite a few other changes related to closing and cleanup back then (I know https://github.com/Z3Prover/z3/issues/245, https://github.com/Z3Prover/z3/pull/648, https://github.com/Z3Prover/z3/pull/661), but these are more on the Java side and should not influence Native.delContext, I guess.

Some ideas:

hernanponcedeleon commented 3 years ago

Maybe it might be helpful. The problem might only be caused when using the binary API instead of SMTLIB2.

Here is the smt2 file of one of the problematic cases using Solvers.Z3:

At the java level, the code looks like

prover.addConstraint(A);
prover.isUnsatWithAssumptions(B);
prover.addConstraint(C);
prover.isUnsat();

As @ThomasHaas mentioned, the first check in this example is SAT. The smt2 file corresponds to A /\ B /\ C.

encoding.smt2.zip

hernanponcedeleon commented 3 years ago
  • Does Z3 still have this replay log? Then try dumping such a query and replay it, and check whether the replay also includes the costly close. If yes, this would be easy to report to the Z3 developers.

I did not get it ... are you talking about the Z3 API? what do you mean by "dumping such a query"?

In our version of the code where we use Z3 API instead of JavaSMT (the code is identical modulo method renaming, e.g. solver.add() to prover.addConstrain() etc) the call to ctx.close() was almost immediate.

PhilippWendler commented 3 years ago
  • Does Z3 still have this replay log? Then try dumping such a query and replay it, and check whether the replay also includes the costly close. If yes, this would be easy to report to the Z3 developers.

I did not get it ... are you talking about the Z3 API? what do you mean by "dumping such a query"?

I am talking about the replay log of Z3, and suggest to create such a log for a query with a costly close. With JavaSMT the option solver.z3.log can be used to create such a log (if nothing has changed since I last used it). Then one can use a Z3 binary on the command line to replay the query and check if it also takes so long. If yes, then one can report this to the developers, they should be able to reproduce it and explain what takes so long.

In our version of the code where we use Z3 API instead of JavaSMT (the code is identical modulo method renaming, e.g. solver.add() to prover.addConstrain() etc) the call to ctx.close() was almost immediate.

Then it seems trying to recreate the problem like I suggested as a C problem is unlikely to succeed.

Idea: use ltrace to dump all C API calls to Z3 from your program, once using the direct Z3 API and once using JavaSMT, and compare this. I guess there has to be a difference in the API usage earlier that leads to the later difference.

hernanponcedeleon commented 3 years ago

I am talking about the replay log of Z3, and suggest to create such a log for a query with a costly close. With JavaSMT the option solver.z3.log can be used to create such a log (if nothing has changed since I last used it). Then one can use a Z3 binary on the command line to replay the query and check if it also takes so long. If yes, then one can report this to the developers, they should be able to reproduce it and explain what takes so long.

If I understood correctly, this would create a file that I can feed as an input to the Z3 binary and hopefully reproduce the issue.

Is there a way a way to set this options directly at the java level (we are not yet using configuration file)? Maybe something like the code below?

Configuration config = Configuration.builder().setOption(name, value).build();
PhilippWendler commented 3 years ago

Exactly.

hernanponcedeleon commented 3 years ago

Are the entries as follow?

Do I need somehow to specify the log file name or where to create it?

PhilippWendler commented 3 years ago

The name would be "solver.z3.log" and value would be desired file name of the output file (e.g., "z3.log", can also be an absolute path etc.).

Btw.: All configuration options of JavaSMT are documented in a text file named org/sosy_lab/java_smt/ConfigurationOptions.txt inside JavaSMT's JAR. A current snapshot is always at https://sosy-lab.github.io/java-smt/ConfigurationOptions.txt (linked under "Quick links" in the readme).

hernanponcedeleon commented 3 years ago

Wow ... by taking a look to ConfigurationOptions.txt I noticed the solver.z3.usePhantomReferences (phantom references were mentioned in Z3Prover/z3#245).

I set this option to true in the configuration and this completely solved the performance problem. Is there any further consequence of using this option?

PhilippWendler commented 3 years ago

Interesting.

What this option does is to couple Java's garbage collection and Z3's reference counting: After a Z3Formula instance gets cleaned up by the garbage collector, JavaSMT will call Native.decRef() such that Z3 can also clean it up if there are no other references. Without phantom references, this does not happen and Z3 never gets a chance to clean up formula instances before the context is closed.

For your case, the main effect is probably that our Z3SolverContext.close() calls decRef() for all created formula instances before calling delContext() if phantom references are enabled. It seems that this manual calling of decRef() and the associated cleanup is more efficient than the cleanup that Z3 does in delContext() if the formula instances still exist. So this indicates an efficiency bug in delContext() and should be worth reporting to Z3's developers (the method should be able to just do the same as we do externally or potentially be even more efficient, not slower).

Apart from this, the effects of enabling phantom references are:

So if you have an application that fits the usage pattern of the last item, it should be worth it. If not, e.g., because you just have short-lived contexts or because you keep references to basically all formulas anyway forever, then probably not (but you could benchmark it). This is why it is not enabled by default.

PhilippWendler commented 3 years ago

Note that it seems to me that since https://github.com/Z3Prover/z3/pull/648, Z3's own Java API always uses phantom references and also iterates over all created objects and cleans them up manually before calling delContext() (in com.microsoft.z3.Context.close()). And even before that, it seems that it kept a list of all created references for cleanup. So this would explain why JavaSMT with phantom references and Z3's own API are similarly fast.

However, I would still say that delContext() should be improved for the case where not all references have been cleaned up. After all, that would also benefit all other users of Z3 in languages without GC, if they don't meticulously clean up all references manually before delContext().

hernanponcedeleon commented 3 years ago

Solved. Thanks a lot for the help!