Possible Bug in ICP? - Githubissues

I got a benchmark from Stefan Kiefer at Oxford:

(set-logic QF_NRA)
(declare-fun x_0 () Real)
(declare-fun x_1 () Real)
(declare-fun x_2 () Real)
(declare-fun x_3 () Real)
(declare-fun x_4 () Real)
(declare-fun x_5 () Real)
(declare-fun x_6 () Real)
(declare-fun one () Real)
(assert (<= x_0 1.0))
(assert (<= x_1 1.0))
(assert (<= x_2 1.0))
(assert (<= x_3 1.0))
(assert (<= x_4 1.0))
(assert (<= x_5 1.0))
(assert (<= x_6 1.0))
(assert (<= 0.0 x_0))
(assert (<= 0.0 x_1))
(assert (<= 0.0 x_2))
(assert (<= 0.0 x_3))
(assert (<= 0.0 x_4))
(assert (<= 0.0 x_5))
(assert (<= 0.0 x_6))
(assert
(and
 (<= (/ 3.0 45.0) (+ (* x_0 x_1) (* x_2 x_3) (* x_0 x_3)))
 (<= (/ 3.0 45.0) (+ (* x_1 x_2) (* x_3 x_4) (* x_1 x_4)))
 (<= (/ 3.0 45.0) (+ (* x_2 x_3) (* x_4 x_5) (* x_2 x_5)))
 (<= (/ 3.0 45.0) (+ (* x_3 x_4) (* x_5 x_6) (* x_3 x_6)))
 (<= (/ 3.0 45.0) (+ (* x_4 x_5) (* x_6 x_0) (* x_4 x_0)))
 (<= (/ 3.0 45.0) (+ (* x_5 x_6) (* x_0 x_1) (* x_5 x_1)))
 (<= (/ 3.0 45.0) (+ (* x_6 x_0) (* x_1 x_2) (* x_6 x_2)))
 (= (+ x_0 x_1 x_2 x_3 x_4 x_5 x_6) one)
 (<= one 1.0)))
(check-sat)
(exit)

When I run dReal with the following options, it returns SAT:

$ ./solver tokens7.smt2 --precision=0.001 --delta --proof && cat tokens7.smt2.proof                                                                                                                    sat
SAT with the following box:
            x_6 : [0.02083333333333314 , 0.02083333333333426];
            x_4 : [0.08333333333333323 , 0.08333333333333437];
            x_5 : [0.03124999999999972 , 0.03125000000000089];
            x_2 : [0.25 , 0.2500000000000012];
            x_3 : [0.03124999999999972 , 0.03125000000000089];
            x_1 : [0.06249999999999999 , 0.06250000000000113];
            x_0 : [0.02083333333333314 , 0.02083333333333427];
            one : [0.4999999999999988 , 0.5]
0: (x_0+x_1+x_2+x_3+x_4+x_5+x_6)=one    : 9.436895709313831e-15; [delta = 0.001000000000000001]
1: 0.062500<=((x_6*x_0)+(x_1*x_2)+(x_6*x_2))    : 7.216449660063518e-16; [delta = 0.001000000000000001]
2: 0.062500<=((x_5*x_6)+(x_0*x_1)+(x_5*x_1))    : 2.636779683484747e-16; [delta = 0.001000000000000001]
3: 0.062500<=((x_4*x_5)+(x_6*x_0)+(x_4*x_0))    : 3.053113317719181e-16; [delta = 0.001000000000000001]
4: 0.062500<=((x_3*x_4)+(x_5*x_6)+(x_3*x_6))    : 2.636779683484747e-16; [delta = 0.001000000000000001]
5: 0.062500<=((x_2*x_3)+(x_4*x_5)+(x_2*x_5))    : 8.049116928532385e-16; [delta = 0.001000000000000001]
6: 0.062500<=((x_1*x_2)+(x_3*x_4)+(x_1*x_4))    : 6.800116025829084e-16; [delta = 0.001000000000000001]
7: 0.062500<=((x_0*x_1)+(x_2*x_3)+(x_0*x_3))    : 4.996003610813205e-16; [delta = 0.001000000000000001]

As we see from the bottom part of the output, the errors on the literals are very small (10^-16). However when I compute the values using the assignment in Python, the errors are actually pretty big:

#!/usr/bin/env python

x_0 = 0.02083333333333314
x_1 = 0.06249999999999999
x_2 = 0.25
x_3 = 0.03124999999999972
x_4 = 0.08333333333333323
x_5 = 0.03124999999999972
x_6 = 0.02083333333333314

f1 = x_0 * x_1 + x_2 * x_3 + x_0 * x_3
f2 = x_1 * x_2 + x_3 * x_4 + x_1 * x_4
f3 = x_2 * x_3 + x_4 * x_5 + x_2 * x_5
f4 = x_3 * x_4 + x_5 * x_6 + x_3 * x_6
f5 = x_4 * x_5 + x_6 * x_0 + x_4 * x_0
f6 = x_5 * x_6 + x_0 * x_1 + x_5 * x_1
f7 = x_6 * x_0 + x_1 * x_2 + x_6 * x_2
one = x_0 + x_1 + x_2 + x_3 + x_4 + x_5 + x_6

c = 3.0 / 48.0

print '{0:10s} = {1:20.20f}, {0} - {2} = {3:20.20f} >= 0 is {4}'.format("f1", f1, c, f1 - c, f1 >= c)
print '{0:10s} = {1:20.20f}, {0} - {2} = {3:20.20f} >= 0 is {4}'.format("f2", f2, c, f2 - c, f2 >= c)
print '{0:10s} = {1:20.20f}, {0} - {2} = {3:20.20f} >= 0 is {4}'.format("f3", f3, c, f3 - c, f3 >= c)
print '{0:10s} = {1:20.20f}, {0} - {2} = {3:20.20f} >= 0 is {4}'.format("f4", f4, c, f4 - c, f4 >= c)
print '{0:10s} = {1:20.20f}, {0} - {2} = {3:20.20f} >= 0 is {4}'.format("f5", f5, c, f5 - c, f5 >= c)
print '{0:10s} = {1:20.20f}, {0} - {2} = {3:20.20f} >= 0 is {4}'.format("f6", f6, c, f6 - c, f6 >= c)
print '{0:10s} = {1:20.20f}, {0} - {2} = {3:20.20f} >= 0 is {4}'.format("f7", f7, c, f7 - c, f7 >= c)
print '{0:10s} = {1:20.20f} {2}'.format("one", one, one <= 1.0)

This is the result from the python script:

$ ./check.py
f1         = 0.00976562499999990632, f1 - 0.0625 = -0.05273437500000009714 >= 0 is False
f2         = 0.02343749999999996184, f2 - 0.0625 = -0.03906250000000004163 >= 0 is False
f3         = 0.01822916666666650129, f3 - 0.0625 = -0.04427083333333349524 >= 0 is False
f4         = 0.00390624999999994969, f4 - 0.0625 = -0.05859375000000004857 >= 0 is False
f5         = 0.00477430555555550303, f5 - 0.0625 = -0.05772569444444449610 >= 0 is False
f6         = 0.00390624999999995837, f6 - 0.0625 = -0.05859375000000004163 >= 0 is False
f7         = 0.02126736111111105637, f7 - 0.0625 = -0.04123263888888894363 >= 0 is False
one        = 0.49999999999999894529 True

The errors are around 10^-2, which is much bigger than what dReal claims (10^-16).

@danbryce, do you have any idea what's going on here?

$ ../../bin/solver stefan.smt2 --proof --delta --precision=0.001 && cat stefan.smt2.proof
sat
SAT with the following box:
           one : [0.9999614900119359 , 1];
           x_6 : [0.1929218830719585 , 0.1929603930600227];
           x_4 : [0.1339424194862116 , 0.1339809294742757];
           x_5 : [0.1321056610073776 , 0.1321441709954418];
           x_2 : [0.1286661151957076 , 0.1287046251837718];
           x_0 : [0.117727297963041 , 0.1177658079511051];
           x_3 : [0.1629731132876394 , 0.1630116232757036];
           x_1 : [0.131625 , 0.1316635099880642]
0: 0.066667<=((x_0*x_1)+(x_2*x_3)+(x_0*x_3))    : 3.164779577065347e-05; [delta = 0.001000000000000001]
1: 0.066667<=((x_1*x_2)+(x_3*x_4)+(x_1*x_4))    : 3.168946857472211e-05; [delta = 0.001000000000000001]
2: 0.066667<=((x_2*x_3)+(x_4*x_5)+(x_2*x_5))    : 3.152329865879767e-05; [delta = 0.001000000000000001]
3: 0.066667<=((x_3*x_4)+(x_5*x_6)+(x_3*x_6))    : 3.766098158561327e-05; [delta = 0.001000000000000001]
4: 0.066667<=((x_4*x_5)+(x_6*x_0)+(x_4*x_0))    : 3.190485153067857e-05; [delta = 0.001000000000000001]
5: 0.066667<=((x_5*x_6)+(x_0*x_1)+(x_5*x_1))    : 3.228007452643622e-05; [delta = 0.001000000000000001]
6: 0.066667<=((x_6*x_0)+(x_1*x_2)+(x_6*x_2))    : 3.437570302558535e-05; [delta = 0.001000000000000001]
7: (x_0+x_1+x_2+x_3+x_4+x_5+x_6)=one    : 0.0003080799045129812; [delta = 0.001000000000000001]
8: one<=1.0 : 3.850998806420592e-05; [delta = 0.001000000000000001]

$ ../../bin/solver stefan.smt2 --proof --precision=0.001 && cat stefan.smt2.proof
sat
SAT with the following box:
           one : [0.6245885997586125 , 0.625];
           x_6 : [0.07053545039892909 , 0.07094685064031648];
           x_4 : [0.07053545039892922 , 0.07094685064031659];
           x_5 : [0.006176467699647607 , 0.006587867941034942];
           x_2 : [0.1201056846593743 , 0.1205170849007617];
           x_0 : [0.1329269676996476 , 0.1333383679410351];
           x_3 : [0.1329269676996476 , 0.1333383679410351];
           x_1 : [0.09138161120243715 , 0.09179301144382452]

I checked the constraint

0: 0.066667<=((x_0*x_1)+(x_2*x_3)+(x_0*x_3))

and found that it is unsat whether I run with or without the —delta flag. This suggests a problem in realpaver because it is not showing the constraint as unsat, regardless of which termination test we are using. ICP shouldn’t even get to the termination check b/c realpaver should prove unsat with these boxes.

dreal / dreal2

Possible Bug in ICP? #15