GaloisInc / saw-script

The SAW scripting language.
BSD 3-Clause "New" or "Revised" License
433 stars 63 forks source link

Uninterpreted higher-order functions #906

Open msaaltink opened 3 years ago

msaaltink commented 3 years ago

SAW could be more informative in its error message here. With these definitions

let {{
  twice: ([8] -> [8]) -> [8] -> [8]
  twice f x = f (f x)

  add1: [8] -> [8]
  add1 x = 1+x
  }};

prove (w4_unint_z3 ["twice"]) {{ \x -> twice add1 (1+x) == twice add1 (x+1) }};

we get the error message

Cannot create uninterpreted higher-order function

In this case that function is obviously twice, but in a more complicated scenario with many uninterpreted names it can take a while to find the culprit. Surely SAW could print the name of the offending function here.

Even better, though, would be to support uninterpreted higher-order functions, similarly to the way uninterpreted polymorphic functions were recently supported.

The work around I use for now involves a cover function, e.g.,

let {{ twice_add1 = twice add1 }};

with a rewrite rule

twice_add1_thm <- prove_print z3 {{ \x -> twice add1 x == twice_add1 x}};

then simplifying with this rule before generating the goal

prove do {simplify (addsimp twice_add1_thm empty_ss);  w4_unint_z3 ["twice", "twice_add1"]; }
    {{ \x -> twice add1 (x+1) == twice add1 (1+x) }};

While this is bearable for a small proof like this, it is a bit tricky in a more complex proof, and really distracts from the more interesting steps needed to get a proof through. For example, if twice add1 appears in the body of some other functions, we need to be sure to unfold all those definitions before doing the rewrite.

brianhuffman commented 3 years ago

Adding the function name to the error message will be very easy. I can make a PR for this right away.

I've thought about supporting uninterpreted higher-order functions. There are a couple of reasons why this is trickier to implement than supporting uninterpreted polymorphic functions. To support polymorphic functions, we only need a way to map from a pair of (function name, list of type arguments) to an uninterpreted function symbol. This is pretty easy to do because we can fully inspect the type argument values and compare them for equality. For higher-order functions like map or twice, we would need a way to go from (function name, list of function arguments) to an uninterpreted function symbol, but we don't have a good way to inspect the function arguments or compare them for equality. With the way the saw-core backend framework is designed, we don't currently have access to the syntax of the function arguments; we only get to see their semantic values (monadic haskell functions from input values to output values). So one approach would be to redesign the saw-core backends to preserve the syntax of higher-order function arguments, and then inspect the syntax when constructing uninterpreted function symbols.

Another approach would be to ignore the higher-order function arguments completely, and just generate a fresh uninterpreted function symbol whenever we encounter a sufficiently-applied higher-order function. We could rely on the observable subterm sharing to ensure that multiple occurrences of the same higher-order function applied to the same function argument would map to the same uninterpreted function symbol in the backend. This might be a bit too fragile, though: Beta-reduction would probably affect whether or not two uses of the same higher-order function were unified or not. For example, if a larger term contained both (\f -> twice f x) add1 and twice add1 x as subterms, then each application of twice would probably end as a different uninterpreted function symbol, which is not what we want.

msaaltink commented 3 years ago

You write "we don't have a good way to inspect the function arguments or compare them for equality", and yet, when the rewrite rule applies, some version of this must be happening. But this is perhaps very weak. In the cases I'm encountering, the higher-order argument is always a named function but I can appreciate that this would not always be the case.

The example of (\f -> twice f x) add1 is interesting; I'm not sure what I'd expect to happen with twice uninterpreted. Certainly the rewriting work-around would not apply unless beta-reduction was done first.

brianhuffman commented 3 years ago

Right, the rewriter is operating purely on syntax, and it does syntactic matching. But the uninterpreted functions are being created in the context of an evaluation backend, where all function arguments are converted from Term to Value before the function gets to see them. The design that we chose for function values is the constructor VFun, which contains something like Value -> M Value (where M is the evaluation monad). Inspecting function values would be slightly easier if we had represented function values as explicit closures, which would contain the syntax of the body of the function. I hope that makes sense.

msaaltink commented 3 years ago

That does make sense. I can certainly live with things as they are now, but it makes some of the proofs uglier than I'd like.

brianhuffman commented 3 years ago

This is definitely a feature that I'd like to support eventually; it just might take a while before we get there.

brianhuffman commented 3 years ago

I have a renewed interest in getting this implemented now, because uninterpreted higher-order functions would be important for doing faithful round-trip translations back and forth between saw-core and what4. This kind of round-tripping is useful not just for goal_eval, but also for improving the connection with crucible during symbolic simulation: There are a bunch of performance bottlenecks and other issues in both llvm_verify and jvm_verify that could be significantly improved if we could more faithfully translate terms between the two systems.

I'm starting to think that switching the saw-core Value type to use explicit closures, which capture the syntax of function bodies, would be a good idea.