LeventErkok / sbv

SMT Based Verification in Haskell. Express properties about Haskell programs and automatically prove them using SMT solvers.
https://github.com/LeventErkok/sbv
Other
243 stars 34 forks source link

Non-model variables #208

Closed rwbarton closed 8 years ago

rwbarton commented 8 years ago

This is a feature request, though it's possible I've missed some existing functionality which provides this.

I want to solve problems of the form:

"Find all (v1, ..., vn) such that there exist (e1, ..., em) such that P(v1, ..., vn, e1, ..., em)."

I don't care about the values of the variables ej in a satisfying assignment, and for a particular solution (v1, ..., vn), there may be (even infinitely) many values of (e1, ..., em) that satisfy the predicate P. For example, the (v1, ..., vn) may determine the structure of a graph, and the (e1, ..., em) may represent a witness of some property of the graph, such as connectedness, which is most easily encoded by taking advantage of nondeterminism. I want to generate each connected graph just once.

I would like to use the very convenient allSat, but in order to do so I need to tell it, when generating clauses that reject previously found solutions, to include only the variables v1, ..., vn and not the variables e1, ..., em. I hoped that this might be the difference between "free" variables and "existential" variables, but that seems not to be the case.

rwbarton commented 8 years ago

It might be more flexible to just give more control when calling allSat. For example I might build a single predicate of two variables x and y and be interested in all the possible values of x, and also (separately) all the possible values of y. I'm not sure how to talk about variables in the interface of allSat though.

LeventErkok commented 8 years ago

Interesting idea. So, you want allSat; but when we construct the refuting model to send back to the solver for subsequent calls; you only want certain variables to be refuted, while leaving the others to be free. Some sort of a notion of "interesting" variables. For instance, if you run this:

    allSat $ \x y -> x .> 0 &&& y .>= 0 &&& (x + y .>= (5::SInteger))

You only want to see one model per unique value of x; but don't care about duplicated ys. (i.e., you want to see (5, 1) as a model, but not (5, 2).

Is that what you are after?

rwbarton commented 8 years ago

Yes, that's exactly right. For now I've reimplemented the logic of allSat myself, but my reimplementation seems to be slower when there are a lot of previously found models to reject. (I haven't yet tried a direct comparison on a problem without any of these uninteresting variables.)

TomMD commented 8 years ago

I haven't read the above too carefully, but it does sound to me like Yices's exists/forall solver might be interesting (http://smt2015.csl.sri.com/wp-content/uploads/2015/06/2015-Dutertre-Solving-Exists-Forall-Problems-With-Yices.pdf). Perhaps SBV could add support for... not fully mixed quantification but special cases such as what Yices now supports?

LeventErkok commented 8 years ago

@rwbarton We can indeed add support for such uninteresting variables; but I suspect whatever we consider "uninteresting" now will not be sufficient for the next person who needs something like this. And it does sound like you already have a way of "iterating" over each call and adding enough constraints to discard what you consider uninteresting for that particular domain. I'm also reminded of a similar example that already exists in the repo: https://hackage.haskell.org/package/sbv-5.8/docs/Data-SBV-Examples-Misc-ModelExtract.html It only relates to one variable there, where we consider interesting to be differing by at least 5, an arbitrary measure. I'm guessing you're doing something similar.

Regarding performance: It's not going to be any better or worse than what allSat gives you; assuming the construction of the model refutation part is relatively simple. All allSat does is to re-run the problem, with the additional refutation of the previous models; which amounts to what you're doing, where "uninteresting" is defined to mean "at least different in one variable."

Unless SMT solvers adopt this sort off "all-models" command, I don't think we can do any better. The only solver that I know has some support for all-models is alt-ergo; though SBV doesn't talk to it directly yet. You might want to look into that if the performance becomes an issue and we can add a bridge for it.

Let me know if the above example is sufficient to address your needs.

LeventErkok commented 8 years ago

@TomMD Bruno's paper is indeed interesting, and something I'd like SBV to take advantage of if possible; though it's not directly related to the problem @rwbarton wants to address. In the latter case, all the variables are existential, just some of them are not really interesting. In Bruno's case, there's a strict alternation of universals coming right after existentials.

The so called existential-universal problem arises all the time in optimization as well; and is actually precisely how the Quantified version of SBV's optimize routines work: https://hackage.haskell.org/package/sbv-5.8/docs/Data-SBV.html#g:46 Of course, SBV simply invokes check-sat and I suspect the solver itself might recognize internally that what it received is an instance of the EF-problem and use the appropriate strategy. If that's not already happening, then we can probably use SBV's satCmd option (see: https://hackage.haskell.org/package/sbv-5.8/docs/Data-SBV.html#g:51) to trigger that solver. So, this feature might already be automatically available via SBV. If you give it a shot and find that it's not working, then patches are always welcome!

rwbarton commented 8 years ago

where we consider interesting to be differing by at least 5, an arbitrary measure. I'm guessing you're doing something similar.

Well, no not really. I'm doing exactly what I said in my original post. Instead of a problem of the form

Find all tuples (v1, ..., vn) such that:                       P(v1, ..., vn)

I am interested in problems of the form:

Find all tuples (v1, ..., vn) such that: exists (e1, ..., em). P(v1, ..., vn, e1, ..., em)

This doesn't require any more power from the underlying solver, and I found it a little surprising that "existential" variables don't work like this already. It seems like a particularly natural thing to want.

On the subject of performance: it's not a real problem for me; but my problems typically have thousands of variables with the number of constraints being a smallish constant factor longer, so the total time to find n solutions quickly becomes quadratic; and the constant factors are much worse when pushing the constraints rejecting old solutions through the whole system--I think allSat is smart about only building the SMTLib string to reject each solution once, so the only quadratic parts are in printing the string and inside the solver itself, which is typically a lot more efficient than sbv :)

The real performance gains would come from an interface to interactive mode, so the solver doesn't have to start from scratch for each solution... any thoughts on that?

LeventErkok commented 8 years ago

I'm not sure what you mean by existential variables don't work like this already. Can you give a concrete example?

Re performance: Quite right. All-sat is smart about not reconstructing the problem, but just adding the refuting model. However, the solver is started from scratch and typically it's the solver time that dominates the computation, not problem construction time. But I can see that could be an issue.

I'm not opposed to adding an interesting modifier for variables so allSat can simply leave those out when creating the refuting models. We can add it as a solver-option so it wouldn't break any existing code. Would that work for you?

Re: incremental interface: That's a whole another beast and I'm hesitant to tackle; should've been built from day-one I think. It's hard to graft it on. One reason is that to use incrementality in its full power, SBV needs to talk to the solvers and ask queries/get-answers etc.; and every solver seems to behave slightly differently. As the SMT-Lib standard matures, however, one can imagine adding a standardized interface. Perhaps that should be a separate library built on top of SBV to keep it clean.

rwbarton commented 8 years ago

I'm not sure what you mean by existential variables don't work like this already. Can you give a concrete example?

I'm talking about the difference between free and exists. I expected free to create a free variable like my vi, and exists to create an existentially-quantified variable like my ej. I understand that that would not be backwards compatible at this point though.

LeventErkok commented 8 years ago

No; free creates a universally quantified variable in a prove context; and an existentially quantified variable in a sat context. So things like this work well:

Prelude Data.SBV> let f x y = x + y .== (0::SInteger)
Prelude Data.SBV> sat f
Satisfiable. Model:
  s0 = 0 :: Integer
  s1 = 0 :: Integer
Prelude Data.SBV> prove f
Falsifiable. Counter-example:
  s0 = 1 :: Integer
  s1 = 0 :: Integer

That's the whole motivation behind free. You can of course always use forall and exists directly too; though then you lose succinct descriptions like the above.

So "free" doesn't really mean "auxiliary" in this particular sense; but rather "pick the quantifier accordingly." Whether this was the right design choice is up for discussion of course, but it's sort of baked-in by now.

LeventErkok commented 8 years ago

@rwbarton Just committed a simple change to accommodate for such variables. Here's an example:

t = allSat $ do
       x <- free "x"
       y <- free "y"
       constrain $ x .>= 0
       constrain $ x .<= 2
       return $ x - abs y .== (0 :: SInteger)

This produces:

*Main> t
Solution #1:
  x = 0 :: Integer
  y = 0 :: Integer
Solution #2:
  x =  1 :: Integer
  y = -1 :: Integer
Solution #3:
  x =  2 :: Integer
  y = -2 :: Integer
Solution #4:
  x = 2 :: Integer
  y = 2 :: Integer
Solution #5:
  x = 1 :: Integer
  y = 1 :: Integer
Found 5 different solutions.

But, if we say:

t = allSatWith z3{nonModelVars = ["y"]} $ do
       x <- free "x"
       y <- free "y"
       constrain $ x .>= 0
       constrain $ x .<= 2
       return $ x - abs y .== (0 :: SInteger)

Then we get:

*Main> t
Solution #1:
  x = 0 :: Integer
Solution #2:
  x = 1 :: Integer
Solution #3:
  x = 2 :: Integer
Found 3 different solutions.

Is this a satisfactory solution for your needs? I'm open to other design options to make sure it handles your case well. Note that we don't even print the value for y n the output here, but we still keep it in the bindings when you extract the dictionary and use it programmatically. So, if you're extracting them, you'll still get the witness values as usual. I think that could still come in handy, so I don't see any point in dropping them from the models completely.

LeventErkok commented 8 years ago

Example: http://github.com/LeventErkok/sbv/blob/master/Data/SBV/Examples/Misc/Auxiliary.hs

I'll make a new release once @rwbarton OKs this design/solution.

rwbarton commented 8 years ago

This works for me, thanks! I might be using it in a slightly unintended way, by giving all my non-model variables the same name (say "nonmodel"), then specifying that single name in nonModelVars. Easier and more efficient than threading thousands of distinct variables names out of my problem generation and into the allSatWith call.

rwbarton commented 8 years ago

Oh, and one of my tests (generating all of the 1072 Hamiltonian cycles in a 6x6 grid) now runs in ~12 minutes, rather than ~30 minutes. Still painfully quadratic, but certainly a large improvement!

LeventErkok commented 8 years ago

Nice..

Using duplicate names is sneaky; the fact that it works is because I didn't think about checking for duplicates :-) I don't think anything really relies on names being unique, but I can imagine it being at least confusing when models are printed. And if you ever need those witness variables, they'd be hard to extract from the dictionary.

Perhaps a better design is to make nonModelVars a String -> Bool function; so you can still name them distinctly, but simply pass a isPrefixOf test. That sounds more robust. What do you think?

rwbarton commented 8 years ago

Yes, I think that's probably a better way to go.

LeventErkok commented 8 years ago

Great; I'll make a release after that change later tonight.

@rwbarton If you could tell me your full name (your github profile doesn't have it so far as I can see), I'd like to put it into the ack list..