crucible_llvm_verify, Proof failed stops script

weaversa commented 6 years ago

I'm using crucible_llvm_verify to find satisfying instances of some property, but I only seem to be able to find one at a time. That is, once crucible_llvm_verify returns a counter example, it stops the rest of the proof script from running. Is there a way to have it return a counter example but continue running the rest of the script?

brianhuffman commented 6 years ago

Currently there's not a way to keep running when verification fails with a counterexample. Here are some ideas for how we might implement what you want:

Add a new command (maybe crucible_llvm_sat) that is supposed to find and return a counterexample/witness.
Modify crucible_llvm_verify to return a sum type (either override or counterexample) which you could then case on.
Add a generic exception mechanism to saw-script. Then crucible_llvm_verify could throw a "counterexample" exception which could be caught by a user-provided handler.

I really don't like 2, because sum types and case combinators are really clunky to use. Option 1 would probably be the easiest to implement. But I think I like option 3 the best because we could probably get a lot of mileage out of exceptions: It would let us add new functionality to existing saw-script primitives without having to add new ones.

What do you think?

brianhuffman commented 6 years ago

I suppose there's also another option:

When a top-level saw-script statement fails, just charge ahead without adding a binding for its result value to the interpreter environment.

However, I think 4 would be a bad idea, at least to have such a feature enabled by default. If you don't bind the result to a variable, it would make it too easy not to notice when something fails. And if you do bind the result, you would get some low-level failures later when you try to use it.

robdockins commented 6 years ago

Actually, there is a fairly new fails combinator that succeeds exactly when the contained saw-script action fails. It was introduced for unit-testing purposes, but could also be used for this.

sawscript> :h fails
Description
-----------

    fails : {a} TopLevel a -> TopLevel ()

Run the given inner action and convert failure into success.  Fail
if the inner action does NOT raise an exception. This is primarily used
for unit testing purposes, to ensure that we can elicit expected
failing behaviors.

weaversa commented 6 years ago

prepending r <- fails to (crucible_llvm_verify ...) works for me.

I like option 2 the best because it works with the existing paradigm. Could you explain how option 3 (exceptions) could be used to provide overrides and allow decision making on proof results/counter examples (similar to caseSatResult and caseProofResult).

TomMD commented 6 years ago

I very much like option 2. Currently SAW fails in a noisy way and I parse the stdout to get a counter example (for consumption by an automated tool). Worse, the failure doesn't explain which of potentially many verification failed.

With option 2 we could do things like combine the currently-separate quickcheck and llvm_verify saw scripts to save on symbolic execution and still easily distinguish at which point the system failed (either using a custom exit code or searching stdout for the failure string I'd embed in the .saw).

brianhuffman commented 6 years ago

I like option 3 (exceptions) because it is the most general: You could use exception handlers to define fails, or to define a function like isSuccessful : TopLevel a -> TopLevel Bool that tells you whether or not an exception occurred:

let isSuccessful action =
  handle (do {action; return true;}) (\e -> return false);

let fails action =
  do {
    b <- isSuccessful action;
    if b then fail "Expected failure" else return ();
  };

If we added generic Maybe or Either types, then we could also implement option 2 in terms of option 3:

let try action =
  handle (do {x <- action; return (Just x);}) (\e -> return Nothing);

let try' action =
  handle (do {x <- action; return (Right x);}) (\e -> return (Left e));

The reason I don't like option 2 is that it makes the common case (where everything is expected to succeed normally) very noisy: In a long series of crucible_llvm_verify commands, you would have to case on the result of each one. In Haskell you can get around this problem by defining your own monad, but we don't have that ability in saw-script.

TomMD commented 6 years ago

The reasoning from @brianhuffman makes good sense to me. If we fail to define helpers as shown then things will get messy due to each script including its own boilerplate. In other words, I think a SAW prelude and option 3 would complement each other nicely.

weaversa commented 6 years ago

O.K. Option 3 with some nice helper functions in a prelude would work.

brianhuffman commented 6 years ago

Regarding a saw-script prelude, I just created #253.

atomb commented 6 years ago

Another approach to the same problem would be to use compositional extraction, something we've been discussing a lot and have partly implemented but haven't had a chance to finish yet. The idea is that given a spec of a similar form to that used by crucible_llvm_verify we could have a separate function that would instead of performing any verification simply return the term it has constructed to model the function you're analyzing (taking into account any overrides for called functions). Then, with this term in hand, you could do any sort of analysis you wanted, including iterating over as many solutions as you want.

Would that get you to what you're interested in more directly?

atomb commented 4 years ago

The option of allowing a proof to continue when one function fails is now available in the Python interface, and would probably be difficult to add to SAWScript.

GaloisInc / saw-script

crucible_llvm_verify, Proof failed stops script #252