--multiheuristic option shows non-deterministic behaviors in return

soonhokong commented 8 years ago

@clhuang and @dzufferey, could you take a look at this problem when you have some time?

I have been testing --multiheuristic option with hard instances. I found that using the option changes the return value (delta-sat/unsat) from a run to another run. Here is an example:

$ dReal --multiheuristic dreal3/src/tests/nra/mass_spring.smt2
delta-sat with delta = 0.00100000000000000

$ dReal --multiheuristic dreal3/src/tests/nra/mass_spring.smt2
/Users/soonhok/work/dreal3/src/tests/nra/mass_spring.smt2
unsat

Another symptom is that sometimes dReal returns delta-sat with a witness box (B). However, the width of the box |B| is larger than the given precision. Here is an example:

$ dReal inv_pend.smt2 --multiheuristic --model --precision 0.00001
Solution:
V : [-20, 20] = [0, 0.6765565947500003]
Vdot : [-20, 20] = [-10, 0]
ball_diameter : [-10, 10] = [0.15, 0.22330625]
theta : [0.01, 0.2] = [0.01, 0.0575]
thetadot : [-0.3, 0.3] = [-0.3, 0.3]
x : [0.01, 0.2] = [0.01, 0.2]
xdot : [-0.3, 0.3] = [-0.3, 0.3]
delta-sat with delta = 0.00001000000000000

Here the variable ball_diameter has quite a large interval [0.15, 0.22330625] while 0.00001 was given as a precision. Here is inv_pend.smt2 file that I used:

(set-logic QF_NRA)
(declare-fun x () Real [0.01,0.2])
(declare-fun xdot () Real [-0.3,0.3])
(declare-fun theta () Real [0.01,0.2])
(declare-fun thetadot () Real [-0.3,0.3])
(declare-fun ball_diameter () Real [-10,10])
(declare-fun V () Real [-20,20])
(declare-fun Vdot () Real [-20,20])
(assert (not (and (or (<= 0 V) (not (<= 0.01 ball_diameter))) (or (<= 0 Vdot) (not (<= 0.01 ball_diameter))))))
(assert (= (+ (+ (+ (+ 0 (^ x 2)) (^ xdot 2)) (^ theta 2)) (^ thetadot 2)) ball_diameter))
(assert (= (+ (+ (+ (* xdot (+ (+ (+ (* 0.972632 x) (* 1.32741 xdot)) (* -1 (* 3.77814 theta))) (* -1 (* 0.697897 thetadot)))) (* x (+ (+ (+ (* 1.72917 x) (* 0.972632 xdot)) (* -1 (* 2.63237 theta))) (* -1 (* 0.499053 thetadot))))) (* -1 (* (* 1 theta) (+ (+ (+ (* 2.63237 x) (* 3.77814 xdot)) (* -1 (* 13.0924 theta))) (* -1 (* 2.12247 thetadot)))))) (* -1 (* (* 1 thetadot) (+ (+ (+ (* 0.499053 x) (* 0.697897 xdot)) (* -1 (* 2.12247 theta))) (* -1 (* 0.425955 thetadot)))))) V))
(assert (= (+ (+ (+ (+ 0 (* xdot (+ (+ (+ (+ (* xdot 0.972632) 0) (+ (* x 1.72917) (* (+ (+ (+ (* 1.72917 x) (* 0.972632 xdot)) (* -1 (* 2.63237 theta))) (* -1 (* 0.499053 thetadot))) 1))) (+ (* -1 (+ (* (* 1 theta) 2.63237) 0)) 0)) (+ (* -1 (+ (* (* 1 thetadot) 0.499053) 0)) 0)))) (* (/ (* -1 (+ (+ (+ (* (* -6 (sin theta)) (^ thetadot 2)) (* 100 (+ (+ (+ (* 0.0490286 x) (* 0.26262 xdot)) (* -1 (* 8.85081 theta))) (* -1 (* 1.14305 thetadot))))) (* -1 (* 10 xdot))) (* (* 147 (cos theta)) (sin theta)))) (* 5 (^ (* 3 (cos theta)) -12))) (+ (+ (+ (+ (* xdot 1.32741) (* (+ (+ (+ (* 0.972632 x) (* 1.32741 xdot)) (* -1 (* 3.77814 theta))) (* -1 (* 0.697897 thetadot))) 1)) (+ (* x 0.972632) 0)) (+ (* -1 (+ (* (* 1 theta) 3.77814) 0)) 0)) (+ (* -1 (+ (* (* 1 thetadot) 0.697897) 0)) 0)))) (* thetadot (+ (+ (+ (+ (* xdot -3.77814) 0) (+ (* x -2.63237) 0)) (+ (* -1 (+ (* (* 1 theta) -13.0924) (* (+ (+ (+ (* 2.63237 x) (* 3.77814 xdot)) (* -1 (* 13.0924 theta))) (* -1 (* 2.12247 thetadot))) 1))) 0)) (+ (* -1 (+ (* (* 1 thetadot) -2.12247) 0)) 0)))) (* (/ (* -1 (+ (+ (+ (* (* (* -3 (cos theta)) (sin theta)) (^ thetadot 2)) (* 343 (sin theta))) (* (* 50 (+ (+ (+ (* 0.0490286 x) (* 0.26262 xdot)) (* -1 (* 8.85081 theta))) (* -1 (* 1.14305 thetadot)))) (cos theta))) (* -1 (* (* 5 xdot) (cos theta))))) (+ (* 3 (^ (cos theta) 2)) -14)) (+ (+ (+ (+ (* xdot -0.697897) 0) (+ (* x -0.499053) 0)) (+ (* -1 (+ (* (* 1 theta) -2.12247) 0)) 0)) (+ (* -1 (+ (* (* 1 thetadot) -0.425955) (* (+ (+ (+ (* 0.499053 x) (* 0.697897 xdot)) (* -1 (* 2.12247 theta))) (* -1 (* 0.425955 thetadot))) 1))) 0)))) Vdot))
(assert (> ball_diameter 0.15))
(check-sat)
(exit)

My environment is:

dReal Version: latest master branch af960ab26731a35e6e077d4bb6c7975d168f880b
OS X 10.11.4 + clang-3.6.2

soonhokong commented 8 years ago

FYI, I added ctest label for --multiheuristic and --multiprune options. For example, try ctest -R nra_multiheuristic --output-on-failure in your build directory after building dReal. More information about ctest options is available at https://cmake.org/cmake/help/v3.0/manual/ctest.1.html.

clhuang commented 8 years ago

So I've tried a couple things, and I've been able to reproduce the issue, even when there's no data being shared between the threads (i.e. the common hull is not accessed or modified by either thread); essentially, the threads are just running independently (see clhuang/dreal3:bugfix), up until the point that one finishes.

Despite the apparent independence of the threads, it does seem like their behavior is linked somehow--if two different heuristics are used, the bug is reproducible; if the SizeBrancher is used for both, the bug does not appear. Or if there's only one heuristic used, the bug does not appear.

My only guess would be that there is some state in the contractor, which is shared between the heuristics, that cause some issues when doing subsequent contractions...Is there any way to completely copy a contractor, to see if not sharing a contractor makes a difference?

soonhokong commented 8 years ago

My only guess would be that there is some state in the contractor, which is shared between the heuristics, that cause some issues when doing subsequent contractions...Is there any way to completely copy a contractor, to see if not sharing a contractor makes a difference?

You're right, contractor class has a state which can be changed by calling prune method. Internally (in contractor_cell), I have a way to save/restore a state. I'll expose them as public API of contractor class so that you can use it. I'll ping you when it's ready.

clhuang commented 8 years ago

I'm pretty sure it's a contractor issue at this point. I've tried it on 13.smt, which has a solution at X=2.40, Y=0.675, and when printing out the X and Y ranges (BEFORE/AFTER refer to the state of the box before and after the contractor call, and the following number is the thread number):

BEFORE0 X: [2.20008, 2.51301] Y: [0.632766, 0.697291] AFTER0 X: [2.36998, 2.45647] Y: [0.66861, 0.686114] BEFORE1 X: [2.20008, 2.51301] Y: [0.632766, 0.697291] AFTER1 X: [2.36998, 2.45647] Y: [0.66861, 0.686114] BEFORE0 X: [2.36998, 2.41323] Y: [0.66861, 0.686114] AFTER0 X: [nan, nan] Y: [0.66861, 0.686114] BEFORE0X : [2.41323, 2.45647] Y: [0.66861, 0.686114] AFTER0 X: [nan, nan] Y: [0.66861, 0.686114] NO SOLUTION FOUND 0 BEFORE1 X: [2.36998, 2.41323] Y: [0.66861, 0.686114] AFTER1 X: [2.38546, 2.40926] Y: [0.671778, 0.676619] unsat

and in the bolded portion, the contractor eliminates the solution, leading to an erroneous unsat.

To avoid this issue by saving/restoring contractor state, that would mean we'd have to save/restore contractor state every time it's used, is that practical?

soonhokong commented 8 years ago

To avoid this issue by saving/restoring contractor state, that would mean we'd have to save/restore contractor state every time it's used, is that practical?

It's not ideal, but the size of this state is quite small. Two bit-vectors of size N and one set of pointers. Let's see how big the performance penalty this change can make.

soonhokong commented 8 years ago

@clhuang, I added another function to src/icp/icp.cpp in cf6c419:

// Prune a given box b using ctc, but keep the old state of ctc void test_prune(box & b, contractor & ctc, SMTConfig & config)

It's a derivation of the function that you used to call, void prune(box & b, contractor & ctc, SMTConfig & config, std::unordered_set<std::shared_ptr<constraint>> & used_constraints). The difference is that 1) test_prune doesn't take std::unordered_set<std::shared_ptr<constraint>> & anymore, and 2) it keeps the internal state of a given argument contractor & ctc.

Please try this one, and let me know if you find further problem or need something else. Thank you.

soonhokong commented 8 years ago

Also, please feel free to add more assert in the code, or testcases under src/tests.

clhuang commented 8 years ago

It still returns unsat I think there's some more state in contractor, beyond the output bit vector and the used constraints, because if you look at contractor_common.h it seems that the bitvector and the used_constraints get cleared no matter what. Is it possible there's more state involved, say in the sub-contractors of a contractor_seq for example?

soonhokong commented 8 years ago

@clhuang, you're right. I think I need to introduce test_prune to all the subclasses of contractor_cell. I'll do that soon.

dzufferey commented 8 years ago

@soonhokong Rather than having a state in the contractor, would it make more sense to make the contractor as pure as possible, i.e., modulo IBEX, and carry any state used by the contractor as part of a state monad (or something along those lines) that is explicitly passed as argument. The SMTConfig object is already carried around. Could we generalize that to into an object that carries any state needed by the contractor. It seems that we need such an object. The clause manager for the learning, or keeping track of the proof would also benefit from it.

soonhokong commented 8 years ago

@dzufferey, I see. I agree that this is better and fits better with what we've discussed before (clause manager, proof objects).

clhuang commented 8 years ago

@soonhokong, @dzufferey, are there any current efforts to refactor the contractor?

soonhokong commented 8 years ago

Not right now.. I'll do it after Thursday.

soonhokong commented 8 years ago

I'm on it now. For now, contractor_status will include b : Box, output : ibex:BitSet, and used_constraints : set of constraints. Later, I'll add clause_manager and proof_manager to integrate learning and proof generation.

soonhokong commented 8 years ago

@clhuang and @dzufferey:

I've just pushed refactored contractor code to the master branch. Here are some comments:

--multiprune is working but --multiheuristic is not working yet. (anyway, we need to solve the thread-safety issue first). For now, I've commented out the implementation of --multiheuristic.
In SizeGradAsinhBrancher::score_axes function in icp/brancher.cpp, I found that there are cases where gradout is set to be an empty interval. This case was not covered previously. For now, this case adds 0 to the score. If you have a better patch, please feel free to add that.
In multiprune_icp::solve function in icp/icp.cpp, the previous version didn't cover the case where an intermediate pruning result is an empty set. For now, I'm checking whether first or second is empty when we add them to the stack. But I think we can have a better version. If you have an idea, please share it (or fix it).

@scungao: I'll rebase dev branch tomorrow. It seems that there are more things to do for that.

dzufferey commented 8 years ago

Thanks @soonhokong

When I have some spare cycles, I'll start looking into how to adapt the proof/interpolation infrastructure to the new contractors.

About the multiprune, checking before pushing on the stack avoids putting empty boxes on the stack that should make it a little bit faster. For the loop doing the branches, it should be correct even if the boxes are empty. hull and intersect are defined on empty boxes.

Looking again at the code, I can think of two additional optimizations:

breaking out of the loop starting at line 161 when the score is 0.0, i.e., first and second are both empty. Right now, it continues to try the other branches even if it is already empty.
on line 162, instead of bisecting original, we can bisect the hull of first and second. The code would look something like:
```
original = first;
original.hull(second);
tuple<int, box, box> const splits = original.bisect_at(dim);
```
then we can remove the hull operation on line 176.

soonhokong commented 8 years ago

@dzufferey, thanks for the comment. I'll make a patch, open a pull-request. Please review the PR when it's ready.

how to adapt the proof/interpolation infrastructure to the new contractors.

Great. I'll make a proof-object class and re-introduce what we had in dReal2. We can start from there and improve things.

dreal / dreal3

--multiheuristic option shows non-deterministic behaviors in return #237