dafny-lang / dafny

Dafny is a verification-aware programming language
https://dafny.org
Other
2.91k stars 261 forks source link

CVC5 support? #4100

Open hmijail opened 1 year ago

hmijail commented 1 year ago

(Not sure whether to focus this as a documentation request, feature request or bug report)

My team’s Dafny code is related to cryptography and hardware simulation so we're having problems seemingly caused by non-linear arithmetic. We saw that CVC5 supports a theory finite fields, so we hope it would work better than Z3 for what we do.

We saw that Boogie has some partial support for CVC5, so I tried running Dafny's integration tests with CVC5 to see how much of what Dafny does falls on the subset of Boogie/CVC5 that works.

I tried using the existing integration tests infrastructure but I had to resort to hack a couple of files to hardwire CVC5 options. The result is that some tests pass but some fail, or run forever, or hang, and the test suite soon stops.

So I thought I'd ask directly.

  1. How to know when a different solver might be helpful?
  2. Can it help to just drop in a new solver? Or would changes be required in Dafny/Boogie to actually use any different functionality like CVC5’s finite fields theory?
  3. Is there an aspiration to support CVC5 from Dafny? The documentation mentions that the solver can be replaced, but no examples are given, and the Z3 integration seems deep.
  4. If yes, would you be happy for us to try to work on CVC5 integration and contribute it?
  5. Could you give some estimate of the difficulty?

Some detail on the hardwiring for CVC5, in case it helps to improve support:

The LitTests.cs file implies that one can use DAFNY_EXTRA_TEST_ARGUMENTS to add general arguments, but this hardly works because it seems to only accept 1 argument and assumes the Dafny v3 CLI (which probably compounds with issue #3837)

I had to add CVC5 and Boogie options in DefaultArgumentsForTesting in DafnyDriver.cs.

    "/proverOpt:SOLVER=CVC5",
    "/proverOpt:PROVER_PATH=/Users/mija/Downloads/cvc5-macOS",
    "/proverOpt:TIME_LIMIT=60000",
    "/proverOpt:C:--tlimit=60000",

However, even after adding the tlimit and TIME_LIMIT options the tests that before ran forever now seem to just hang. So there is no way to progress past the few initial tests.

atomb commented 1 year ago

At the moment, Dafny is known not to work well with CVC5. However, it's definitely an aspiration of ours to make it work better, and there are some ongoing efforts to make it work better. The particular application domain you're considering is one of the ones I have in mind that I'd like to be able to support better in Dafny, and supporting a variety of other solvers is probably part of the path there.

Some of the things that I know we need to do to better support CVC5:

I'm currently working on some things in this direction, especially the first bullet. Contributions would be welcome! We should probably coordinate first, to make sure we're doing compatible things and not duplicating work, though. I'd also love to see examples you'd like to have work well but that don't currently.

DavePearce commented 1 year ago

Hey @atomb,

So, it does sound like a fair bit of work is required then --- but at least it is something you are already thinking about! I'm interested in this comment:

The particular application domain you're considering is one of the ones I have in mind that I'd like to be able to support better in Dafny, and supporting a variety of other solvers is probably part of the path there.

So, do you have any reason to think CVC5 might perform better than Z3 here? Or, is it just a case of: more solvers == more flexibility?

We should probably coordinate first, to make sure we're doing compatible things and not duplicating work, though. I'd also love to see examples you'd like to have work well but that don't currently.

Well, definitely happy to talk more about this. Examples we can definitely come up with! We are currently investigating whether we can reduce verifier effort required by extracting key properties into e.g. datatypes, etc.

atomb commented 1 year ago

So, do you have any reason to think CVC5 might perform better than Z3 here? Or, is it just a case of: more solvers == more flexibility?

There are some places where CVC5 has an edge. There's more thorough support for externally-checkable proofs, and new work on int blasting and the seq type. It's hard to say in advance whether there'd be performance benefit -- there isn't at the moment, with so many quantified axioms in the Dafny prelude -- but I'd like to find out.

We should probably coordinate first, to make sure we're doing compatible things and not duplicating work, though. I'd also love to see examples you'd like to have work well but that don't currently.

Well, definitely happy to talk more about this. Examples we can definitely come up with! We are currently investigating whether we can reduce verifier effort required by extracting key properties into e.g. datatypes, etc.

On the topic of datatypes, I've been considering re-stating some of the axioms in the Dafny prelude, which currently emulate datatypes, using datatypes in Boogie and therefore the SMT datatype theory.

DavePearce commented 1 year ago

Hey @atomb,

Just following up on this, as we are definitely interested in pursuing this. Hopefully will have some capacity in a few weeks. I'm wondering what the minimal change required is to get something (anything) written in Dafny to verify with CVC5?

Looking at the above list you wrote, I'm guessing that updating the prelude is key here. i.e. because it is included in every file which would be passed to CVC5. With changes made to that, we would then presumably be able to begin experimenting with CVC5 for small Dafny examples (e.g. which didn't lead to multi-dimensional arrays being generated, etc). Is that right?

atomb commented 1 year ago

Very simple Dafny programs can be verified by CVC5 right now, sort of...

I just discovered an issue (#4196) that leads to Dafny failing if you select a solver other than Z3 (using /proverOpt:SOLVER=... or --solver-option SOLVER=.... However, if I can verify the following Dafny program with CVC5:

method M(x: int) {
  assert x + x == x * 2;
}

To do so, with the current version of Dafny, I need to use /proverLog:true.smt2 and then run cvc5 --incremental true.smt2 after removing any :rlimit references (which would happen automatically if SOLVER=cvc5 worked). Once #4196 is fixed, verifying that program should be completely automatic. Dafny does pruning of the generated SMT to remove unused definitions and axioms, so programs that don't use complex features (such as the heavily-axiomatized collection types) should be okay.

However, you'll quickly get into cases that CVC5 can't solve, and I think you're exactly right that making modifications to the prelude is what we'll need to do to fix that. Dafny currently relies heavily on Z3's ability to quickly instantiate an absurd number of quantified axioms, and I think other solvers just can't keep up with that.

I'm currently working on a number of such simplifications with the primary goal of making verification more predictable (with Z3). It should also have the side effect of making things work better with CVC5 (and other solvers), too. Some of the things I have in mind to simplify right now:

I've made some progress on the first two, but they're not quite ready to include yet. I'd be very interested to hear about any experiments you do, too!

atomb commented 1 year ago

Oh, one other thing: Boogie supports several encodings for polymorphism, which Dafny currently uses heavily. Sometimes the non-default encodings can be faster with Z3, and might also make things work better with CVC5. Try /typeEncoding:a or /typeEncoding:m to see if either work better.

DavePearce commented 1 year ago

Ok, I will try that and at least get to the point where I can see CVC5 actually doing something :)

DavePearce commented 1 year ago

So, FYI, this does seem to work for me already (in Dafny 4.1):

dafny verify --solver-option SOLVER=CVC5 --solver-option C:"--incremental" --solver-path $CVC5_HOME/cvc5 somefile.dfy

I am noticing situations where it doesn't verify things though.

atomb commented 1 year ago

Yeah, not being able to verify things that Z3 can is probably to be expected for now, but I expect the situation to get better.

I think the explicit --solver-path may be what allows you to get around #4196.

DavePearce commented 1 year ago

Some (not super scientific) results on /typeEncoding:a (left) versus /typeEncoding:p (right). The number shows how many verification conditions were returned with "unknown" for the given option. So, smaller is better :)

benchmarks/AlexExample.dfy:     0       1
benchmarks/ArrayMin.dfy:        0       1
benchmarks/BugTest.dfy: 2       4
benchmarks/bullCow.dfy: 1       3
benchmarks/ExtendedGCD.dfy:     1       1
benchmarks/fuel.dfy:    3       3
benchmarks/gadget.dfy:  0       0
benchmarks/induction.dfy:       0       0
benchmarks/math.dfy:    0       0
benchmarks/quadcost.dfy:        0       0
benchmarks/stackoverflow.dfy:   1       2
benchmarks/Traversal.dfy:       3       4

This is basically over a bunch of random (smallish) Dafny programs I had lying around using the options --incremental --rlimit=100000 for cvc5. Seems pretty conclusive that /typeEncoding:a is always the best option.

atomb commented 1 year ago

I've done some similar experiments and found that /typeEncoding:a is almost always the best option over a few hundred large source files, including some pretty difficult proofs. It causes a few integration tests to fail, but I think those are resolvable without too much effort.

DavePearce commented 1 year ago

Yeah, not being able to verify things that Z3 can is probably to be expected for now

Can you give us some insight on this? Is it CVC5 or is it more related to the translation issues from Dafny?

atomb commented 1 year ago

Yeah, not being able to verify things that Z3 can is probably to be expected for now

Can you give us some insight on this? Is it CVC5 or is it more related to the translation issues from Dafny?

My general sense so far is that Dafny is currently set up in such a way that it depends heavily on efficient, large-scale quantifier instantiation. Z3 does that really well, whereas CVC5 is pickier about which quantifiers it instantiates. Whether that's a Dafny or CVC5 problem is pretty subjective. :) But I'm hoping some of the stuff I'm working on will make it possible to, at least optionally, not depend so heavily on so much quantifier instantiation.

atomb commented 1 year ago

Some recent developments:

DavePearce commented 1 year ago

Fantastic!!