Z3Prover / z3

The Z3 Theorem Prover
Other
10.29k stars 1.48k forks source link

Definition of all_smt from the "Programming Z3" book #5765

Closed LeventErkok closed 2 years ago

LeventErkok commented 2 years ago

I'm looking at the definition of all_smt as given in https://theory.stanford.edu/%7Enikolaj/programmingz3.html#sec-blocking-evaluations; the version that implements the clever disjoint-model approach. Here's an example use:

from z3 import *

x, y = Bools('x y')

s = Solver()
s.add(Or(x, y))

print(list(all_smt(s, [x, y])))

Alas, this prints:

[[y = False, x = True], [y = True, x = False]]

what's missing is the assignment [y = True, x = True]. (Though it's sort of redundant, but it is a valid model nonetheless.)

I thought perhaps the problem is with model completion, so I modified the calls to m.eval to account for that:

def all_smt(s, initial_terms):
    def block_term(s, m, t):
        s.add(t != m.evaluate(t, model_completion=True))
    def fix_term(s, m, t):
        s.add(t == m.evaluate(t, model_completion=True))
    def all_smt_rec(terms):
        if sat == s.check():
           m = s.model()
           yield m
           for i in range(len(terms)):
               s.push()
               block_term(s, m, terms[i])
               for j in range(i):
                   fix_term(s, m, terms[j])
               for m in all_smt_rec(terms[i:]):
                   yield m
               s.pop()
    for m in all_smt_rec(list(initial_terms)):
        yield m

Alas, this made no difference; I still don't get the [y = True, x = True] model as an output.

I'm curious as to why we don't get this third model. Not only this makes the utility of all_smt a bit unintuitive to use, it isn't clear to me why the third model isn't being produced.

aytey commented 2 years ago

Don't you want to be blocking the negation of the conjunction of the current assignment? Currently, you're blocking each constant individually (which explains why you don't get True/True).

LeventErkok commented 2 years ago

@andrewvaughanj No. This "trick" achieves that precisely by branching on one variable at a time; thus avoiding the general "block the last model" idea (which involves negation of the whole assignment). As described in the paper, this causes the solver to be "inundated with new lemmas," thus the need for a more clever solution. See the discussion in https://theory.stanford.edu/%7Enikolaj/programmingz3.html#sec-blocking-evaluations

aytey commented 2 years ago

Oh, maybe I should have read before I wrote then 🤦 ... apologies for the noise 🙊

NikolajBjorner commented 2 years ago

err,

def all_smt(s, initial_terms):
    def block_term(s, m, t):
        s.add(t != m.evaluate(t, model_completion=True))
    def fix_term(s, m, t):
        s.add(t == m.evaluate(t, model_completion=True))
    def all_smt_rec(terms):
        if sat == s.check():
           m = s.model()
           yield m
           for i in range(len(terms)):
               s.push()
               block_term(s, m, terms[i])
               for j in range(i):
                   fix_term(s, m, terms[j])
               for m1 in all_smt_rec(terms[i+1:]):
                   yield m1
               s.pop()
    for m in all_smt_rec(list(initial_terms)):
        yield m

Unless I got the logic all wrong here (very possible), the issue is in lexical scope / dynamic scope: replace m by m1 in the for loop. Seems lexical scope is out the window. It is also possible to use all_smt_rec over terms[i+1:].

LeventErkok commented 2 years ago

Ah, that's it! Thanks so much.

NikolajBjorner commented 2 years ago

Interesting with the "book" reference. It is just a chapter in a book dedicated to a spring school in Chongqing. When walking back from lunch during the school, Jonathan Bowen, asked that I contribute with a chapter to supplement lectures. A modest "40-50" pages would be good was what I heard. No escape with that half-page abstract or four page quota to make it into main contents.

LeventErkok commented 2 years ago

From a programmers perspective, it's one of the best writings you can get your hands on regarding z3, but I don't have to tell you that! :-)

It gets referred to quite often in stack-overflow and other contexts, and I definitely benefitted from it quite a bit personally. Thanks for putting in the time to write it up. And one day, who knows, it might turn into a book too; with worked out exercises and what not. The community would definitely appreciate that. (Of course, the resources/time needed to do something like this will be significant.)

NikolajBjorner commented 2 years ago

Yes, I ran into Jonathan in Oxford during FLoC and he reminded me to finish the chapter. It cost me some weeks.

It is definitely dense and there are areas the text is not aimed at: examples of how to specify, algorithmic internals of the solvers. It is potentially for two different audiences and example material is covered in other texts (such as Dennis'). The current issue seems to be that it is taken as a main guide for everything so students who are new to SMT and z3 will get lost.

LeventErkok commented 2 years ago

It's the second google hit for z3 tutorial (https://www.google.com/search?client=safari&rls=en&q=z3+tutorial&ie=UTF-8&oe=UTF-8)

And there is a dearth of reliable/good/readable introductions to z3. The old web-site with try-me buttons were useful, though I understand it's decommissioned now to never come back. Something that merges the material in there with the "Programming Z3" contents would be nice indeed, though again the cost of doing so doesn't sound like something you can just do in your free time here and there.. Would be best if there was a corporate sponsor to let you do this, though I know it's probably a long shot.

NikolajBjorner commented 2 years ago

he old web-site with try-me buttons were useful, though I understand it's decommissioned now to never come back.

No, the plan is to corpse revive by summer to a shinier state. There is an internship project dedicated around this subject. The in-your-browser experience may be shifted to wasm for SMTLIB text at least. There are some weekend stabs at this. They have to be upgraded to a state where the tutorial material builds as part of CI. Online Jupyter notebooks are tougher because they require some levels of indirection and hosted solutions have changed and been decommissioned over time. Notebooks can still be shared and run after local installation and maybe used with some variant of github.dev (which is bound to change over time). The wasm front end opens up a potentially different avenue, which is in-browser client-side programmatic use cases (but has to be weighed with dealing with different moving parts). A nice goal would also be to enable users to share content without hitting privacy issues.

LeventErkok commented 2 years ago

@NikolajBjorner

I've been digging into some Python idioms, and apparently the "correct" way to write:

for m1 in all_smt_rec(terms[i+1:]):
      yield m1

is

yield from all_smt_rec(terms[i+1:])

This not only avoids the variable completely, it is apparently safer too. (Details: https://stackoverflow.com/a/26109157/936310)

For the next edition of the book! :-)

NikolajBjorner commented 2 years ago

Thanks, online version updated.

LeventErkok commented 2 years ago

I believe the correct form is yield from (note the keyword from.)

And as long as you're fixing the online version, there's a semicolon at the end of def_block line, (second definition in that section), which should be a colon.

deut-erium commented 2 years ago

Hi, I think you need to fix it to yield from all_smt_rec(terms[i:])

e.g.

v = BitVec('v',3)
s = Solver()
s.add(v!=0)
print(list(all_smt(s,[v]))
# [[v = 1], [v = 5]]

Which misses the point

def all_smt(s, initial_terms):
    def block_term(s, m, t):
        s.add(t != m.eval(t, model_completion=True))
    def fix_term(s, m, t):
        s.add(t == m.eval(t, model_completion=True))
    def all_smt_rec(terms):
        if sat == s.check():
           m = s.model()
           yield m
           for i in range(len(terms)):
               s.push()
               block_term(s, m, terms[i])
               # term[i] should not be the same as that in m
               for j in range(i):
                   fix_term(s, m, terms[j])
               yield from all_smt_rec(terms[i:])
               # we are yet to discover all the assignments for term[i]. Using term[i+1:] means we are skipping
               # past the first satisfying assignment of term[i]
               # Note that term[i] might be multivalued and not binary
               s.pop()   
    yield from all_smt_rec(list(initial_terms))
LeventErkok commented 2 years ago

It appears terms[i+1:] is indeed a bad optimization. I'm reopening the ticket so Nikolaj can take a look and close when he deems appropriate.

NikolajBjorner commented 2 years ago

you beat me to it. Yes.

NikolajBjorner commented 2 years ago

There are some other ways to do all_smt:

The latter could have some advantages such as avoiding many distinct atoms that the approach from this thread has. Clemens Eisenhofer added an example where the propagator is used to block N-queens solutions. It is in C++ in the examples directory. There is also a python API for the propagator.