dreal / dreal4

Automated Reasoning in Nonlinear Theories of Reals
https://dreal.github.io
Apache License 2.0
153 stars 33 forks source link

Standardize output for check-sat #217

Closed LeventErkok closed 4 years ago

LeventErkok commented 4 years ago

To be able to integrate dreal to upstream tools, it's really important that it responds in a standard way. For this benchmark:

(set-option :global-declarations true)
(set-option :produce-models true)
(declare-fun s1 () Int); (assert (= s1 0))
(declare-fun s0 () Int)
(declare-fun s2 () Bool); (assert (= s2 (= s0 s1)))
(assert (not s2))
(check-sat)

dreal produces:

delta-sat with delta = 0.001
s1 : [0, 0]
s0 : [0, 0]
s2 : False

Unfortunately, this is difficult to deal with for an upstream tool: In general all responses should either be atoms (such as delta-sat, or fully parenthesized S-Expressions.

For the above, I suggest responding with:

(delta-sat  0.001)

and forgoing the model printing completely. (Tools should be retrieving these later with a get-value command.)

I can see that perhaps you wouldn't want to change the default behavior since you may worry about breaking other tools out there that rely on what you're printing currently. If that's the case, it'd be great if you can tie this new behavior to a command line switch; something like:

dreal --check-sat-short-response

which would then simply print:

(delta-sat 0.001)

without the model. Of course, feel free to pick any command-line switch name that you like.

soonho-tri commented 4 years ago

FYI, #219 introduces a new option, --smt2-compliant. You can also specify it in an smt2 file as follows:

(set-option :smt2-compliant true)
soonho-tri commented 4 years ago

@LeventErkok , I'll release a new version later. You can use the latest commit on a Mac by running:

brew rm dreal; brew install dreal --HEAD
LeventErkok commented 4 years ago

@soonho-tri Thanks! But this isn't quite working as expected. For the above example, I'm getting:

$ dreal a.smt2 --smt2-compliant
delta-sat
s1 : [0, 0]
s0 : [0, 0]
s2 : False

The point is that the lines for s1/ s0 / s2 should not be printed.

Also, I think printing the threshold is a good idea with delta-sat to make sure the upstream solvers know about it explicitly. An output of the form:

(delta-sat 0.001)

would be most useful. (Or something that's an s-expression and has all the info you want to include that relates to delta-sat parameters.)

LeventErkok commented 4 years ago

@soonho-tri

Looking at your code, I think the confusion is due to the presence of the command:

(set-option :produce-models true)

In general, this SMTLib command simply tells the solver to produce models and be ready to be queried for them, but not necessarily print them. (At least that's how almost all solvers interpret it.) See pg. 56 of http://smtlib.cs.uiowa.edu/papers/smt-lib-reference-v2.6-r2017-07-18.pdf

My recommendation would be to tie model printing to a separate command-line argument if necessary, and the above command should be decoupled: i.e., the presence of (set-option :produce-models true) should not cause the solver to print a model, but allow further (get-value) and (get-model) commands to be issued. (In other words, if the produce-models is not given, you can reject the commands get-value and get-model afterwards. Though not all solvers enforce this.)

LeventErkok commented 4 years ago

@soonho-tri Would love to test this out but looks like the merge is blocked?

soonho-tri commented 4 years ago

It's just merged.

LeventErkok commented 4 years ago

@soonho-tri

Thanks! Unfortunately this isn't quite working yet, as you need to flush the buffer after printing delta-sat. I think you can fix it by changing:

http://github.com/dreal/dreal4/blob/master/dreal/smt2/driver.cc#L103

to read:

cout << "delta-sat" << endl;

which makes sure the buffer is flushed.

In general when used at the end-of-a pipe you want to flush the output everytime you print a full response. Otherwise the upstream tools that listen to dReal at the end of a text-pipe simply hang.

soonho-tri commented 4 years ago

Got it. I've just pushed 1475ef3 to fix it. Can you test that?

LeventErkok commented 4 years ago

Looks good! Thanks..