dreal / dreal3

There is a new version of dReal, available at https://github.com/dreal/dreal4
GNU General Public License v3.0
48 stars 36 forks source link

Benchmark 710 of Flychecker reported SAT on your Website, but UNSAT following Mathsat and Z3 #209

Closed zell08v closed 8 years ago

zell08v commented 8 years ago

Hello, I have a question about your benchmark result published at http://dreal.cs.cmu.edu/kepler/

The benchmark 710 (of Flychecker) is reported 'sat', but I have tried Z3 and MathSAT on the same benchmark. They both report that 710.smt2 is in fact 'unsat'. Maybe I misunderstand something? Thanks.

soonhokong commented 8 years ago

@zell08v, try to run the instance with --polytope option, it should give UNSAT within 0.01 sec.

soonhokong commented 8 years ago

For instances only including higher-order polynomials, it's better to run with --polytope option. If you do not use --polytope option, you need to use --precision 0.000001 to get UNSAT but it will take quite a lot of time since it's not using polytope contractor.

zell08v commented 8 years ago

Thanks. Then, why on your webpage it is said 'sat'? Below is the screenshot from your Kepler site: http://dreal.cs.cmu.edu/kepler/:

b6e1c65e-6b42-49d2-ad21-bc3b1b99dcc5
zell08v commented 8 years ago

I meant: if 710.smt2 is actually unsat, why it is marked 'sat' on your web? Did I misunderstand something?

soonhokong commented 8 years ago

The webpage is outdated, we will update it soon.

Also, please understand that sat in dReal really means delta-sat, that is, a numeric perturbation of the formula is satisfiable. In many cases, It's possible to have a delta-sat by using a big enough precision value.

zell08v commented 8 years ago

Thanks for your explanation. I will then check out your webpage later.

zell08v commented 8 years ago

Thanks. I have a naive question. What are typical things dReal can do but Z3 or MathSAT cannot? For the benchmark I mentioned above at least, they both can prove UNSAT within 0.01 seconds on my laptop. I guess it is useful that you use them to cross-check your benchmark results.

scungao commented 8 years ago

Try benchmarks such as https://github.com/dreal/dreal3/tree/master/benchmarks/smt2/microfluidics https://github.com/dreal/dreal3/tree/master/benchmarks/smt2/with_odes

zell08v commented 8 years ago

Thanks a lot. I have just tried some of your provided benchmarks but unfortunately I do not yet find one that Z3 or Mathsat parses. It seems the operator '^' (power) or QF_NRA_ODE logic is not supported by them. (They should not be in the SMT2 standard, anyway.)

So, the comparison based on these benchmarks can be unfair. To reformulate my question: are there benchmarks that both dReal and Z3/Mathsat support and can parse, yet dReal performs significantly better? Thanks for your clarification.

scungao commented 8 years ago

We focus on highly nonlinear problems which most solvers don't handle right now. The majority of them are non-polynomial, but if you have big polynomial expressions dReal may have an edge, for instance https://github.com/dreal/dreal3/blob/master/benchmarks/smt2/nikos_01.smt2 and the ones similar to it. In general we pay more attention to benchmarks from realistic applications (robotics, biology, AI), and less to those generated merely for benchmarking purposes. We rarely work on optimizing performance on simpler theories. You could just try all solvers and pick what's suitable for your application.