LeventErkok / sbv

SMT Based Verification in Haskell. Express properties about Haskell programs and automatically prove them using SMT solvers.
https://github.com/LeventErkok/sbv
Other
239 stars 33 forks source link

Soft Question: Best approach for fixed size symbolic arrays #714

Closed patrickaldis closed 1 month ago

patrickaldis commented 1 month ago

I'm writing a package for expressing and solving grid puzzles like sudoku. The code requires creating a fixed $m \times n$ size grid. Then applying constraints according to the DSL. These could be things like "for all cells x, y values are distinct".

I currently have the grid data stored as a nested list, analogous to [[SInteger]]. To me having a single SArray a b to represent the grid seems more intuitive. It would mean I could easily query a cell based on the index, and write a quantified bool \ Forall index -> statement

However:

LeventErkok commented 1 month ago

You shouldn't use an SArray. Instead, use a regular Haskell array which stores symbolic values. (When the container is concrete, go with the concrete representation.)

So, you'd use something like: Array (Int, Int) SInteger.

Only use an SArray if the array itself is completely symbolic. Also, SMTLib arrays are indexed by types, not by bounds. (i.e., they're indexed by the entire domain.) This is an SMT restriction, not an SBV one.

patrickaldis commented 1 month ago

Ah okay perfect, that makes sense. How would I retrieve a value based on a symbolic index in this case? If i want to write \Forall ix -> arr ! ix .== literal 1

LeventErkok commented 1 month ago

You shouldn't be using symbolic indexes. (In that case you'd need an SArray.) Once you commit to Array, you do just your regular haskell programming. (i.e., you wouldn't do Forall, you'd do a recursive loop and put a constraint on each element)

patrickaldis commented 1 month ago

Ah ok thanks, I follow!

patrickaldis commented 1 month ago

The underlying issue I really want to solve is deciding whether two array 'cells' lie in the same connected component. This would be very nicely expressed with a symbolic index:

image

If there isn't a way to do this with Array is there a way to restrict the indices of SArray?

LeventErkok commented 1 month ago

You can use an SArray if you like. You're just making the problem a bit harder for the solver, but for these sorts of puzzles It should work just fine I suppose.

You don't really need to "restrict" the indices. (You can't do it anyhow, look at how the array logic in SMT is defined: https://smt-lib.org/theories-ArraysEx.shtml). Simply program with the indices you care about and extract the values you care about. The remaining entries will go unconstrained by the solver.

I'd still suggest trying to make it work with a regular Haskell Array, but if you really need symbolic indices then you have to use an SArray. (But the latter is costly and harder to program with.)

LeventErkok commented 1 month ago

There's a tower-puzzle solver that uses arrays filled with symbolic values: https://github.com/LeventErkok/sbv/blob/master/Documentation/SBV/Examples/Puzzles/Tower.hs

You might want to study that to see if it helps.

There's also a Sudoku solver (https://github.com/LeventErkok/sbv/blob/master/Documentation/SBV/Examples/Puzzles/Sudoku.hs) though that uses a list-of-lists..

patrickaldis commented 1 month ago

Ah thanks very much. The tower puzzle looks very useful. If I were to use SArray instead how would I go about extracting values from a SatResult? - I'd only be concerned with a certain range of indices

LeventErkok commented 1 month ago

There are several different ways to do this, but the easiest would be to use the query mode:

{-# OPTIONS_GHC -Wall -Werror #-}

module T where

import Data.SBV
import Data.SBV.Control
import Data.SBV.Tuple

example1 :: IO (Integer, Integer)
example1 = runSMT $ do
  a :: SArray (Integer, Integer) Integer <- newArray_ Nothing
  let a14 = a `readArray` tuple (1, 4)
      a25 = a `readArray` tuple (2, 5)

  constrain $ a14 + 1 .== a25

  query $ do ensureSat
             (,) <$> getValue a14 <*> getValue a25

This produces:

*T> example1
(-1,0)

Note that mixing arrays with quantified constraints (which is something you probably will end up doing if you use SArray) can lead to the solver not terminating. Consider:

example2 :: IO Integer
example2 = runSMTWith z3{verbose=True} $ do
  a :: SArray (Integer, Integer) Integer <- newArray_ Nothing
  constrain $ \(Forall i) (Forall j) -> readArray a (tuple (i, j)) .== i+j

  query $ do ensureSat
             getValue (readArray a (tuple (1, 2)))

If you run this, you'll see that z3 fails to terminate. Typically, adding quantifiers to the problems will make the logic semi-decidable, which will typically lead to non-termination or unknown answers from the solver.

This is why I'm recommending using a regular Haskell array and store symbolic values instead of SArray. Encoding your problem that way (assuming it's possible) will lead to much easier problems for the backend solver to deal with.

LeventErkok commented 1 month ago

If you want to stick to SArray, and if you have quantifiers involved, it's best to avoid SBV's Forall/Exists quantifiers, and instead rely on sAll, sAny for a limited range. For instance, let's say we want to model an array a, such that a[i] equals the i'th fibonacci number. The natural way to write this would be:

fib10 :: IO Integer
fib10 = runSMT $ do
  a :: SArray Integer Integer <- newArray_ Nothing

  constrain $ a `readArray` 0 .== 0
  constrain $ a `readArray` 1 .== 1

  constrain $ \(Forall i) ->
       let aim2 = a `readArray` (i - 2)
           aim1 = a `readArray` (i - 1)
           ai   = a `readArray` i
       in ai .== aim1 + aim2

  query $ do ensureSat
             getValue (a `readArray` 10)

But if you run it, you'll see that the solver fails to terminate. But if you only care for a limited subset, let's say the first 50 entries only, then you can write:

fib10' :: IO Integer
fib10' = runSMT $ do
  a :: SArray Integer Integer <- newArray_ Nothing

  constrain $ a `readArray` 0 .== 0
  constrain $ a `readArray` 1 .== 1

  constrain $ flip sAll (map literal [0..49]) $ \i ->
       let aim2 = a `readArray` (i - 2)
           aim1 = a `readArray` (i - 1)
           ai   = a `readArray` i
       in ai .== aim1 + aim2

  query $ do ensureSat
             getValue (a `readArray` 10)

And this will terminate with the correct answer 55 as the 10th fibonacci number. Obviously this'll produce (correct) but arbitrary results for values outside the range [0 .. 49], but sounds like bounded ranges are all you care about to start with.

To summarize: Use concrete Haskell data-structures so long as you can, storing symbolic values as elements. If you can't do that, make sure your quantification is bounded, i.e., only applies to a finite range and concrete range of the data-structure. If you can't do that either, you can still express the problem in SBV, but an SMT solver is just not the right tool for the kind of problem you're modeling. (Use a proper theorem prover instead.)

patrickaldis commented 1 month ago

Ah I see, yeah concrete data structures would make more sense. Your first example I managed to make work using .=> :

example2 :: IO Integer
example2 = runSMTWith z3 {verbose = True} $ do
  a :: SArray (Integer, Integer) Integer <- newArray_ Nothing
  constrain $ \(Forall i) (Forall j) -> onBoard i j .=> readArray a (tuple (i, j)) .== i + j

  query $ do
    ensureSat
    getValue (readArray a (tuple (1, 2)))

onBoard :: SInteger -> SInteger -> SBool
onBoard x y =
  sAnd
    [ x .> 0,
      x .< 10,
      y .> 0,
      y .< 10
    ]

However the fibonacci example doesn't:

fib10 :: IO Integer
fib10 = runSMT $ do
  a :: SArray Integer Integer <- newArray_ Nothing

  constrain $ a `readArray` 0 .== 0
  constrain $ a `readArray` 1 .== 1

  constrain $ \(Forall i) ->
    let aim2 = a `readArray` (i - 2)
        aim1 = a `readArray` (i - 1)
        ai = a `readArray` i
     in sAnd [i .>= 2, i .< 50] .=> (ai .== aim1 + aim2)

  query $ do
    ensureSat
    getValue (a `readArray` 10)

Concrete data structure seems simpler.

LeventErkok commented 1 month ago

Quantifiers are handled via e-matching and a slew of heuristics. When they'll kick in and when they won't is very hard to predict. Even different versions of z3 (or any other solver) might exhibit differing behavior. Try avoiding quantifiers as much as you can, unless absolutely necessary.

LeventErkok commented 1 month ago

Guess we can close this ticket now. Feel free to reopen if you have further issues.