LeventErkok / sbv

SMT Based Verification in Haskell. Express properties about Haskell programs and automatically prove them using SMT solvers.
https://github.com/LeventErkok/sbv
Other
243 stars 34 forks source link

Is it possible to reuse previously defined variables? #613

Closed toonn closed 2 years ago

toonn commented 2 years ago

I'm looking to use an SMT solver for an exercise of last year's Advent of Code. SBV interests me because in principle it'd allow testing multiple solver backends with trivial code changes. (Currently only the Z3 backend supports optimization so I won't benefit from the abstraction but maybe for future tasks.)

In the original problem I have a list of instructions and I want to fold over these instructions to generate constraints and then tack on a maximization at the end. The trouble I have is I need to refer to previously introduced variables in subsequent constraints and in the formula to maximize.

My first approach was to just use sInteger with the same string as argument, assuming that'd introduce the same symbolic variable but this just introduces new variables suffixed with _N.

The next approach was to keep track of the variables introduced in a list and reuse the values from that list. But this is just a different way of writing the sInteger calls multiple times so it runs into the same problem.

I'm including a Minimal Reproducing Example here that shows both these approaches to avoid getting bogged down in the particulars of the AoC problem.

import Data.SBV

optimizationProblem :: Goal
optimizationProblem = do
  x <- sInteger "x"
  y <- sInteger "y"
  constrain $ inRange x (1,10)
  constrain $ inRange y (1,10)
  constrain $ x + y .== 10
  maximize "Product" $ x * y

constraints1 :: Goal
constraints1 = do
  x <- sInteger "x"
  y <- sInteger "y"
  constrain $ inRange x (1,10)
  constrain $ inRange y (1,10)
  constrain $ x + y .== 10

goal1 :: Goal -> Goal
goal1 cs = cs *> do x <- sInteger "x"
                    y <- sInteger "y"
                    maximize "Product" $ x * y

goal1' :: Goal -> Goal
goal1' cs = cs *> do x <- sInteger "x"
                     y <- sInteger "y"
                     pure ()

type Variables = Symbolic (SInteger, SInteger)

variables :: Variables
variables = do
  x <- sInteger "x"
  y <- sInteger "y"
  pure (x,y)

constraints2 :: Variables -> Goal
constraints2 vars = do
  (x,y) <- vars
  constrain $ inRange x (1,10)
  constrain $ inRange y (1,10)
  constrain $ x + y .== 10

goal2 :: Variables -> (Variables -> Goal) -> Goal
goal2 vars cs = cs vars *> do (x,y) <- vars
                              maximize "Product" $ x * y

goal2' :: Variables -> (Variables -> Goal) -> Goal
goal2' vars cs = cs vars *> do (x,y) <- vars
                               pure ()

main :: IO ()
main = do
  -- The result I get is x = 4, y = 6 which seems wrong?
  -- I did not constraint x and y to be distinct so the answer I expect is
  -- x = y = 5, x * y = 25, which is greater than 24
  -- Can't say I understand what's going on in what's supposed to be a trivial
  -- example even.
  answer <- optimize Lexicographic optimizationProblem

  -- These seem to never finish, expected both x_0 and y_0 to be oo
  -- answer1 <- optimize Lexicographic (goal1 constraints1)
  -- answer2 <- optimize Lexicographic (goal2 variables constraints2)
  --
  -- Using sat we see that new variables are created in the goalN, x_0 and y_0
  answer1 <- sat (goal1' constraints1)
  answer2 <- sat (goal2' variables constraints2)
  putStrLn $ concat [ "Desired outcome (kind of):\n"
                    , show $ answer
                    , "\n\nAnswer 1:\n"
                    , show $ answer1
                    , "\n\nAnswer 2:\n"
                    , show $ answer2
                    ]

As a side note I've also mentioned in one of the comments, the optimization gives me x = 4 and y = 6 as a result but it seems to me the proper result would be x = y = 5. Are there implicit distinctness constraints on variables?

toonn commented 2 years ago

Since looking at the output isn't always possible, as for answer1 and answer2 in the example, is there a way to show the input. That would've made it easier for me to find out I'm creating new variables rather than reusing the ones I intended.

toonn commented 2 years ago

Currently looking at #576, I believe that answers my first question. I thought that was what I was doing with goal2 though.

So basically everything has to be in one big do-block always?

LeventErkok commented 2 years ago

@toonn

Just to make sure we're on the same page, the following is what I see when I run your program:

Desired outcome (kind of):
Optimal model:
  x       =  5 :: Integer
  y       =  5 :: Integer
  Product = 25 :: Integer

Answer 1:
Satisfiable. Model:
  x   = 1 :: Integer
  y   = 9 :: Integer
  x_0 = 0 :: Integer
  y_0 = 0 :: Integer

Answer 2:
Satisfiable. Model:
  x   = 1 :: Integer
  y   = 9 :: Integer
  x_0 = 0 :: Integer
  y_0 = 0 :: Integer

In particular, I don't see x = 4, y = 6 anywhere. Is this not what you see?

toonn commented 2 years ago

@LeventErkok, indeed I'm getting the following:

Desired outcome (kind of):                                                      
Optimal model:                                                                  
  x       =  4 :: Integer                                                       
  y       =  6 :: Integer                                                       
  Product = 24 :: Integer

Answer 1:
Satisfiable. Model:                                                             
  x   = 9 :: Integer                                                            
  y   = 1 :: Integer                                                            
  x_0 = 0 :: Integer
  y_0 = 0 :: Integer                                                            

Answer 2:                                                                       
Satisfiable. Model:
  x   = 9 :: Integer                                                            
  y   = 1 :: Integer                                                            
  x_0 = 0 :: Integer                                                            
  y_0 = 0 :: Integer

Though I just realized this is with sbv 8.9. Let me try the current version.

LeventErkok commented 2 years ago

It's most likely your z3 version that matters. I'm using:

$ z3 --version
Z3 version 4.8.15 - 64 bit

Latest official release is 4.8.14 (https://github.com/Z3Prover/z3/releases), I'm using z3 directly compiled from their GitHub master. See if you might have a version of z3 that's rather old.

toonn commented 2 years ago

That might be it Z3 version 4.8.10 - 64 bit.

LeventErkok commented 2 years ago

yeah, definitely upgrade your z3. let me know how it goes.

toonn commented 2 years ago

Yes, seems like that was a bug in Z3.

So to recap my questions:

  1. Is there an elegant way to build up a Goal from another structure? Specifically I think what I need is composition (*>) and reusing variables defined earlier.
  2. Getting an incorrect answer to the optimization problem. (Addressed, turns out Z3 was buggy.)
  3. Is there a way to inspect the concrete formulation of a Goal, to inspect whether the correct variables are being used?
LeventErkok commented 2 years ago
  1. When you're building goals all you're doing is monadic programming in the Symbolic monad. There's really no SBV specific advice here, just follow your regular Haskell programming skills. Note that every call to sat/prove/optimize happens in its own context, and thus they cannot share variables. If you have multiple calls to make, you probably want to use the query mode, see here: https://hackage.haskell.org/package/sbv-8.17/docs/Data-SBV-Control.html (Aside: Query mode doesn't work with optimization; only sat calls are available. There are technical reasons for this, but most importantly z3's optimizer isn't incremental to start with.)

  2. I think this is resolved.

  3. I'm not sure if I follow. You can always turn on the verbose mode to see what SBV generates, but I gather this isn't what you're looking for. Again, each call to sInteger/sBool etc. will create a brand-new variable, distinct from all others. As I noted above, if you want to use the same variable in multiple calls to checkSat (https://hackage.haskell.org/package/sbv-8.17/docs/Data-SBV-Control.html#v:checkSat).

I hope this addresses your questions; though it's always best to be very specific and ask about concrete cases. Feel free to do so if anything isn't clear.

toonn commented 2 years ago

I'm still having trouble with 1, why are new variables still created for goal2? And does this mean I have to parameterize every separate Goal over the input variables and then only do the composition in a singular do-block?

Re 3, I was hoping for something in between the Haskell and the verbose mode output. If I could render goal1 as something like the following it'd be a lot clearer that I'm not using the correct variables:

x in [1..10] -- (Exact representation of inRange isn't important.)
y in [1..10]
x + y == 10
maximize (x_0 * y_0)
LeventErkok commented 2 years ago

Look at the type of variables:

type Variables = Symbolic (SInteger, SInteger)

variables :: Variables

That is, variables is a computation that will produce two symbolic integers whenever it is called. Since you call it twice (once directly in goals2, and once indirectly via call to cs vars), you get two copies of each of the variables.

Note that this isn't specific to SBV; any Haskell monadic program would behave this way; you call the action twice, it gets executed twice. You probably want to restructure your program so variables is called only once (i.e., vars <- variables) and then used by different functions instead.

Re 3: There's no intermediate representation to show, unfortunately. This is typical of EDSL's (embedded domain-specific language), where the library simply uses the host-language data-structures. In theory, one can add such a pretty-printing routine, but SBV doesn't have one of those. The only output you'll see is what you get in the verbose mode. (satWith z3{verbose=True}). If this is something you're interested in, you can implement it and submit a PR, though I've to say it'll be a non-insignificant amount of work and will require you to go into the guts of how SBV represents goals.

toonn commented 2 years ago

The worst part is I kinda know this. If variables were a print or something it'd be intuitive. I guess I'm trying to "escape the monad," and I shouldn't.

Thank you for the answers, I'll close the issue because all the questions have been addressed.

LeventErkok commented 2 years ago

No worries. Note that there's a stack-overflow tag for SBV that you can also use for these sorts of questions: https://stackoverflow.com/questions/tagged/sbv

toonn commented 2 years ago

Figured I'd leave an example of how it could be done for future confused users.

import Data.SBV

optimizationProblem :: Goal
optimizationProblem = do
  x <- sInteger "x"
  y <- sInteger "y"
  constrain $ inRange x (1,10)
  constrain $ inRange y (1,10)
  constrain $ x + y .== 10
  maximize "Product" $ x * y

constraints :: Symbolic (SInteger, SInteger)
constraints = do
  x <- sInteger "x"
  y <- sInteger "y"
  constrain $ inRange x (1,10)
  constrain $ inRange y (1,10)
  constrain $ x + y .== 10
  pure (x,y)

goal :: (SInteger, SInteger) -> Goal
goal (x,y) = do
  maximize "Product" $ x * y

main :: IO ()
main = do
  answer <- optimize Lexicographic optimizationProblem
  answer' <- optimize Lexicographic (constraints >>= goal)
  putStrLn $ concat [ "Desired outcome (kind of):\n"
                    , show $ answer
                    , "\n\nAnswer:\n"
                    , show $ answer'
                    ]

I figured the State monad might make this even nicer, just geting the variables whenever they're needed but I keep running into the same problem of introducing new variables.

LeventErkok commented 2 years ago

@toonn

Different styles are always possible; and you can indeed use whatever monad to carry around variables too. Just as a side note, I'd consider the following to be the idiomatic way of writing this in SBV:

import Data.SBV

constraints :: (SInteger, SInteger) -> Symbolic ()
constraints (x, y) = do
  constrain $ inRange x (1, 10)
  constrain $ inRange y (1, 10)
  constrain $ x + y .== 10

goal :: (SInteger, SInteger) -> Goal
goal (x, y) = maximize "Product" $ x * y

main :: IO ()
main = print =<< optimize Lexicographic problem
 where problem = do xy <- (,) <$> sInteger "x" <*> sInteger "y"
                    constraints xy
                    goal        xy

But again, this is just my style and depending on other parts of the program, you might want to structure it in different ways.

toonn commented 2 years ago

My problem is that I can't create the variables ahead of time. The number of variables depends on the precise instructions in the list.

LeventErkok commented 2 years ago

Cool. I haven't solved these myself, but I did see a discussion on reddit where some people did using SBV; and I did comment on some of the questions cropped up there. Note that you can create variables as you go along, intermixed with constraints and how you process the abstract syntax; I see no problems with that. Take a look at that discussion and perhaps it'll be helpful to you: https://www.reddit.com/r/haskell/comments/rnek6d/advent_of_code_2021_day_24/

toonn commented 2 years ago

I will, once I solve the problem, don't want spoilers : )

LeventErkok commented 2 years ago

@toon Spoiler alert.. Don't click unless you want to see one way of solving this puzzle in SBV: https://gist.github.com/LeventErkok/d8d6855a92783df115abd52d702d9496

I haven't put any comments of any kind, but feel free to ask questions about it if you choose to look at it.

Best of luck!

LeventErkok commented 2 years ago

@toonn Final spolier..

I've added this example into the SBV release, so it'll be distributed on Hackage when the next release of SBV happens: https://github.com/LeventErkok/sbv/blob/58f13c937cb8d6847e84c2b1ed3be35f39897526/Documentation/SBV/Examples/Puzzles/AOC_2021_24.hs

This is a modification from the version I put in the previous gist, and I hope is more readable (there are enough comments now I hope). It also made me realize why you were asking "how to reuse previously defined variables." The trick is essentially to wrap a monad transformer over Symbolic, and carry around the extra state you need in there. That way you can keep track of which variables were allocated, and later on use them as you wish. Hope this solution is helpful to you in structuring your programs.

toonn commented 2 years ago

Thank you, I managed to get through it today. My solution can be found here in case anyone's interested.

You definitely used a cleaner style in some places but I feel like the constraints ended up fairly similar. I did switch from optimizing the model number as a whole to optimizing the digits and then composing it after because that cut the running time down to a third.

One thing, if I switch to using SInt64 instead of SInteger I get wrong answers and it takes 20 s rather than 5 s per part. Any idea why bounded integers might make it slower and incorrect? I was thinking maybe overflow?

Largest model number accepted by MONAD: 102481911520607595319691612737687
Smallest model number accepted by MONAD: 102481911520607595281684999514504
LeventErkok commented 2 years ago

SInt64 should be enough since the largest 14 digit number is 99999999999999, which is representable by Int64; so there should be no problem with overflow.

There must be some other bug (either in your code, or in SBV, or in z3) that makes it not work for SInt64. If you figure out what the issue is, do let me know.