HOL-Theorem-Prover / HOL

Canonical sources for HOL4 theorem-proving system. Branch develop is where “mainline development” occurs; when develop passes our regression tests, master is merged forward to catch up.
https://hol-theorem-prover.org

HolSmt: add support for `num` type, fix proof replay, build smtheap #1206

Closed someplaceguy closed 6 months ago

someplaceguy commented 6 months ago

Hi! This PR is a bit larger than my previous ones. No code outside HolSmt was changed. As for HolSmt, this PR contains the following changes:

Add (some) support for the num type to HolSmt

The current approach is very simple. It consists in doing the following:

This is really all that's needed to add some initial support for nums, although I'm sure more theorems could be added and/or optimizations could be performed later if necessary. It allows SMT solvers to solve all of the existing num self-tests, except for the DIV and MOD-related ones, because integer div and mod are not supported yet (this will be fixed in the next PR).

Note that there is a significant regression: by adding the num-related theorems as assumptions, SMT solvers cannot come up with sat results anymore. I've narrowed this down to a single theorem, integerTheory.INT, which is needed for SMT solvers to reason about SUC. My guess is that they are simply not able to come up with models for the & and SUC functions which satisfy the restrictions imposed by the theorem.

Since the self-tests rely on sat results to detect unprovable goals, and since a user might want to prevent theorems from being added (for performance or debugging reasons, perhaps), I've added a tunable (HolSmtLib.include_theorems := false) which acts as an escape hatch and stops theorems from being added automatically. However, I don't expect this to be something that one would normally use.

Z3 proof parser fixes

This was by far the most challenging part of the PR. The num support added previously led to Z3 creating more complicated proof certificates than we had observed before. Specifically, Z3 now embeds proof terms within regular SMT-LIB terms and also nests proof-specific let terms inside the bindings of regular SMT-LIB terms. Furthermore, Z3 proof certificates are a long chain of (many thousands of) let terms, so special care is required when parsing them to avoid stack overflows.

I considered about five different ways of fixing this, and tried and abandoned two of them, before settling on the current approach, which I think is by far the simplest one because it avoids duplicating code. It consists in parametrizing the SMT-LIB term parser by a couple of let handler functions, which allows the parser to behave differently depending on whether we're parsing regular SMT-LIB terms or SMT-LIB terms inside Z3 proof certificates.
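To illustrate the idea (this is not the actual SML code in HolSmt), here is a minimal Python sketch of a term reader that walks a chain of let terms with a loop instead of recursion, and takes the let handler as a parameter. Everything in it is an assumption made for illustration: the names `read_sexp`, `resolve`, `regular_let` and `make_proof_let`, the toy rule set `PROOF_RULES`, and the simplified tokenizer, which ignores quoted symbols and string literals.

```python
# Hypothetical rule names, used only to tell "proof" bindings apart in this sketch.
PROOF_RULES = {"asserted", "mp"}

def tokenize(text):
    # Simplified: no support for |quoted symbols| or string literals.
    return text.replace("(", " ( ").replace(")", " ) ").split()

def read_sexp(text):
    """Build nested lists from the input using an explicit stack (no recursion)."""
    stack = [[]]
    for tok in tokenize(text):
        if tok == "(":
            stack.append([])
        elif tok == ")":
            if len(stack) == 1:
                raise ValueError("unbalanced ')'")
            done = stack.pop()
            stack[-1].append(done)
        else:
            stack[-1].append(tok)
    if len(stack) != 1 or len(stack[0]) != 1:
        raise ValueError("expected exactly one complete term")
    return stack[0][0]

def is_let(term):
    return isinstance(term, list) and len(term) == 3 and term[0] == "let"

def resolve(term, env, bind):
    """Expand let-bound names in `term`.

    The chain of let *bodies* (which can be thousands deep) is followed with
    a loop; recursion only happens inside binding values and ordinary
    subterms, which are assumed to be shallow. `bind(name, value)` is the let
    handler: it decides what gets stored for each binding.
    """
    if is_let(term):
        env = dict(env)  # local scope for this chain
        while is_let(term):
            _, bindings, body = term
            # SMT-LIB let is parallel: every value sees only the outer env.
            new = {name: bind(name, resolve(value, env, bind))
                   for name, value in bindings}
            env.update(new)
            term = body
    if isinstance(term, str):
        return env.get(term, term)
    return [resolve(t, env, bind) for t in term]

def regular_let(name, value):
    # Regular SMT-LIB terms: inline the bound value.
    return value

def make_proof_let(steps):
    # Proof certificates: record proof steps and keep referring to them by name.
    def bind(name, value):
        if isinstance(value, list) and value and value[0] in PROOF_RULES:
            steps.append((name, value[0]))
            return name
        return value
    return bind
```

With `regular_let`, `resolve(read_sexp("(let ((a (+ x 1))) (f a))"), {}, regular_let)` inlines the binding; with a handler from `make_proof_let`, bindings whose value is headed by a proof rule are collected in order and left as names. The same reader serves both modes, which is the point of passing the handler in.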

Another issue is that the indices in indexed identifiers could only be numerals, but from SMT-LIB 2.5 onward they can also be symbols. This was fixed in a commit prior to this PR (8a2e9bc97c2955bdbda0bceb90948e87a92f5c22). However, Z3 proof certificates can actually contain full SMT-LIB terms as indices (as part of quant-inst inference rules), so we now parse the indices as a list of Term.term instead of as a list of strings.
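As a rough Python illustration of the three index shapes (again, not the HolSmt code, which turns every index into a Term.term), the sketch below takes an already-parsed `(_ name index ...)` form as nested lists and tags each index. The function names and the tagging scheme are hypothetical.

```python
def classify_index(index):
    """Tag one index of an indexed identifier by its shape."""
    if isinstance(index, list):
        return ("term", index)          # full term, as seen with quant-inst
    if index.isascii() and index.isdigit():
        return ("numeral", int(index))  # the classic case, e.g. (_ extract 7 0)
    return ("symbol", index)            # allowed since SMT-LIB 2.5

def parse_indexed(sexp):
    """Split a parsed (_ name idx1 idx2 ...) form into its name and tagged indices."""
    if not (isinstance(sexp, list) and len(sexp) >= 3
            and sexp[0] == "_" and isinstance(sexp[1], str)):
        raise ValueError("not an indexed identifier")
    return sexp[1], [classify_index(i) for i in sexp[2:]]
```

A parser that stores indices as strings can represent only the first two shapes; the list-valued case is what forces a term-valued representation.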

Z3 proof replay fixes

This consisted in the following:

These fixes allow us to replay the Z3 proofs of the quantifier tests, the double implication tests (exercising one of the Z3 bugs mentioned earlier) and all the arithmetic test cases in the HolSmt test suite, except for the word ones and the div and mod ones (which we don't support yet). I've actually implemented support for div and mod already, but proof replay will require more fixes and this PR was already getting too long, so I will leave that for the next one.

Note that proof replay is still quite brittle, for a few reasons:

  1. Unimplemented proof rules: there are still a few Z3 proof rules that aren't implemented, although I don't expect this to be a significant issue going forward, as they are very few and I don't foresee them being hard to implement.
  2. Existing proof rule handlers that don't cover all cases: this is mostly because 1) when HolSmt was originally developed, Z3 was closed source, so it wasn't easy to tell what the proof handlers should do, and 2) Z3 has evolved significantly over the past decade, which means that the scope of some proof rules has expanded. The fixes for these issues are usually pretty straightforward, though.
  3. Rewrite rules: these are a lot more challenging since they are very ad hoc and undocumented. Currently HolSmt does its best to handle them, but it's hard or impossible to tell which rewrite rules actually need to be handled. But again, I do have a plan to tackle this issue (mentioned further below).
  4. Missing or hard-to-implement features: e.g. word-related proofs seem to need additional special handling which we don't have support for yet. Polymorphic functions are also causing issues which we can't handle yet. Furthermore, it's hard to tell whether proofs of nonlinear arithmetic will ever be possible to handle in a reasonable way -- this would probably require a lot of research and development to fix. And I suspect there will also be other hidden issues where Z3 will not give us enough information in the proof certificate to replay the proof, but I don't have solid examples of this yet.

Added an smtheap

This is a significant usability improvement (especially while developing), as it reduces the time loading "HolSmtLib" from ~42s to ~5s (on my laptop).

Plans for the immediate future

Thanks!

cc @tjark

mn200 commented 6 months ago

Thanks for all this ongoing work! It will probably cut you off in mid-flow, but unless you tell me you have something particularly urgent and pressing that needs to go in, I may well release Trindemossen-1 with just this code in place (i.e., and nothing further). The integer d.p. bugs (for example) are annoying, I admit, but there's always more to fix (and I love the idea of inducing changes in Z3 to make our lives easier as well).