Closed robdockins closed 4 years ago
This change was done back in 2017-07-29. From Changes.md:
Changed the way satWithAny and proveWithAny works. Previously, these two functions ran multiple solvers, and took the result of the first one to finish, killing all the others. In addition, they waited for the still-running solvers to finish cleaning-up, as sending a 'ThreadKilled' is usually not instantaneous. Furthermore, a solver might simply take its time! We now send the interrupt but do not wait for the process to actually terminate. In rare occasions this could create zombie processes if you use a solver that is not cooperating, but we have seen not insignificant speed-ups for regular usage due to ThreadKilled wait times being rather long.
The commit is here: https://github.com/LeventErkok/sbv/commit/b197f4639b19eec1310ad976f382f78d3f87ff21
So.. It's a double-edged sword. On one hand, most of the time the "fast" way of canceling works well and saves a lot of time for lots of calls. On the other hand you get occasional zombies.
Before deciding further, may I suggest an experiement: Undo the commit above (or create the old function), so it reads something like:
sbvWithAny :: [SMTConfig] -> (SMTConfig -> a -> IO b) -> a -> IO (Solver, b)
sbvWithAny [] _ _ = error "SBV.withAny: No solvers given!"
sbvWithAny solvers what a = snd `fmap` (mapM try solvers >>= waitAnyCancel)
where try s = async $ what s a >>= \r -> return (name (solver s), r)
(You might need a few more mods here to make it compile with latest SBV; but it should be easy.)
See if this old version solves the problem. If so; we can make it configurable: i.e., let the user pick if they want to pay the price of waiting for the other solvers to exit or not. (Alternatively, we can provide this version as an alternate too, with a suggestive name.)
Humm... my experience has been that you pretty much always get zombies rather than occasionally. As mentioned in the linked bug thread, it seems like a Ctrl^C during the proof kills all the provers pretty reliably, but a successful completion by one doesn't kill the others.
I'll play around with alternatives and see what happens.
Seems like there was some change in this behavior recently. When I build with HEAD SBV (instead of 8.6) all the solvers exit before returning a result. Also the winning solver goes from CVC4 at ~3 seconds to ABC at about ~6 seconds on the example in the linked thread.
I can't think of any difference between 8.6 and HEAD that would make such a difference. Very curious.
(I'm assuming you didn't download new versions of the solvers?)
Exact same solvers
Now I'm really curious what might cause this. Can't think of any reason why.
I'm going to try bisecting and see if I can figure out what's going on.
Ok, one mystery solved, to be replace with another. The generated SMT is the same between SBV 8.6 and HEAD: the reason CVC4 isn't winning is that SBV now doesn't find my CVC4. Maybe a solver version issue? It also doesn't find mathsat
. So, actually I'm not convinced anymore the behavior is different, I just may be seeing different timing and such because the portfolio is different.
This is my CVC4
This is CVC4 version 1.7-prerelease [git master 1955e4b5]
compiled with GCC version 4.2.1 Compatible Apple LLVM 10.0.0 (clang-1000.10.44.2)
on Oct 26 2018 15:21:43
Any my MathSAT
MathSAT5 version 5.5.4 (cb2aadea7102) (Feb 22 2019 08:54:41, gmp 5.1.3, clang/LLVM 6.0, 64-bit)
This is what I have on my box:
Prelude Data.SBV> map (name . solver) <$> sbvAvailableSolvers
[Z3,Yices,Boolector,CVC4,MathSAT,ABC]
If your list doesn't list some of these, then indeed SBV doesn't find them.
What happens when you run something like:
proveWith mathSAT $ \x -> x .== (x::SWord8)
Do you get some reasonable error message?
With HEAD, I get:
> map (name . solver) <$> sbvAvailableSolvers
[Z3,Yices,ABC]
> proveWith mathSAT $ \x -> x .== (x::SWord8)
*** Exception:
*** Data.SBV: Unexpected non-success response from MathSAT:
***
*** Sent : (set-option :global-declarations true)
*** Expected : success
*** Received : unsupported
***
*** Exit code : ExitFailure (-15)
*** Executable: /Users/rdockins/.local/bin/mathsat
*** Options : -input=smt2 -theory.fp.minmax_zero_mode=4
***
*** Reason : Backend solver reports it does not support this option.
*** Check the spelling, and if correct please report this as a
*** bug/feature request with the solver!
> proveWith cvc4{ transcript = Just "bad.smt2" } $ \x -> x .== (x::SWord8)
*** Exception:
*** Data.SBV: fd:15: hGetLine: end of file:
***
*** Sent : (set-option :print-success true)
***
*** Executable: /usr/local/bin/cvc4
*** Options : --lang smt --incremental --interactive --no-interactive-prompt --model-witness-value
***
*** Hint : Solver process prematurely ended communication.
*** It is likely it was terminated because of a seg-fault.
*** Run with 'transcript=Just "bad.smt2"' option, and feed
*** the generated "bad.smt2" file directly to the solver
*** outside of SBV for further information.
The generated CVC4 transcript:
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
;;; SBV: Starting at 2020-06-24 15:31:21.264794 PDT
;;;
;;; Solver : CVC4
;;; Executable: /usr/local/bin/cvc4
;;; Options : --lang smt --incremental --interactive --no-interactive-prompt --model-witness-value
;;;
;;; This file is an auto-generated loadable SMT-Lib file.
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
; [15:31:21.269] [Timeout: 5s] Sending:
(set-option :print-success true)
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
;;;
;;; SBV: Caught an exception at 2020-06-24 15:31:21.27774 PDT
;;;
;;; *** Data.SBV: fd:15: hGetLine: end of file:
;;; ***
;;; *** Sent : (set-option :print-success true)
;;; ***
;;; *** Executable: /usr/local/bin/cvc4
;;; *** Options : --lang smt --incremental --interactive --no-interactive-prompt --model-witness-value
;;; ***
;;; *** Hint : Solver process prematurely ended communication.
;;; *** It is likely it was terminated because of a seg-fault.
;;; *** Run with 'transcript=Just "bad.smt2"' option, and feed
;;; *** the generated "bad.smt2" file directly to the solver
;;; *** outside of SBV for further information.
;;;
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
;;;
;;; SBV: Finished at 2020-06-24 15:31:21.278007 PDT
;;;
;;; Exit code: ExitFailure 1
;;;
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
Yeah, those are too old.. I have:
$ mathsat -version
MathSAT5 version 5.6.3 (973d615f065f) (May 12 2020 08:47:18, gmp 5.1.3, clang/LLVM 6.0, 64-bit)
and:
$ cvc4 --version
This is CVC4 version 1.8-prerelease [git master 9712a20e]
compiled with GCC version 4.2.1 Compatible Apple LLVM 11.0.3 (clang-1103.0.32.59)
on May 19 2020 18:16:06
Should be possible to use older versions with SBV head too; but with tweaks to the options passed.
Any chance you can upgrade them?
With a lot of debug tracing, I think I've unraveled this. While waiting for solvers, the process is actually blocked on the hGetLine
here:
https://github.com/LeventErkok/sbv/blob/47e5b9a8eb9275a13381291b57dae1a2d5fd448c/Data/SBV/SMT/SMT.hs#L667
The nearest enclosing exception handler, I think, is here: https://github.com/LeventErkok/sbv/blob/47e5b9a8eb9275a13381291b57dae1a2d5fd448c/Data/SBV/SMT/SMT.hs#L690
Note! it uses the handleAsync
combinator which explicitly rethrows Async exceptions without handling them.
If it isn't an async exception, it is bundled into a SolverException
, which then flows here:
https://github.com/LeventErkok/sbv/blob/47e5b9a8eb9275a13381291b57dae1a2d5fd448c/Data/SBV/SMT/SMT.hs#L704
At this point we sent a kill signal to the solver and unwind with a more specific exception. This is what we want.
However! the sbvWithAny
combinator uses throwTo
with ThreadKilled
, which as an async exception.
https://github.com/LeventErkok/sbv/blob/47e5b9a8eb9275a13381291b57dae1a2d5fd448c/Data/SBV/Provers/Prover.hs#L739
As such, the handleAsync
combinator filters it out, and the exception bubbles out of the above referenced handlers, leaving zombie solver processes spinning away.
Probably should be throwing a non-async exception in these cases. I was using ExitSuccess
in testing, and that seems to fix the problem.
Impressive sleuthing! Glad you got to the bottom of it. Please file a PR.
@robdockins
SBV 8.7, which fixes this issue amongst other things, is now on https://hackage.haskell.org/package/sbv-8.7
Thanks Levent!
Thanks for the patch!
As best as I can tell, the following Cryptol issue is essentially due to SBV's
proveWithAny
.https://github.com/GaloisInc/cryptol/issues/693
The SBV documentation claims that other threads are killed. A glance at the implementation suggests this is in fact the case. However, that doesn't seem to actually terminate the associated external processes, which then keep attempting to solve until, presumably, the next time they try to communicate.