Closed someplaceguy closed 6 months ago
Thanks for all this ongoing work! It will probably cut you off in mid-flow, but unless you tell me you have something particularly urgent and pressing that needs to go in, I may well release Trindemossen-1 with just this code in place (i.e., and nothing further). The integer d.p. bugs (for example) are annoying, I admit, but there's always more to fix (and I love the idea of inducing changes in Z3 to make our lives easier as well).
Hi! This PR is a bit larger than my previous ones. No code outside HolSmt was changed. As for HolSmt, this PR contains the following changes:
Add (some) support for the
num
type to HolSmtThe current approach is very simple. It consists in doing the following:
num
literals into integer literals, so that these can be translated by the already-implemented integer support. In other words, literals such as0n
and2n
are converted intoNum 0i
andNum 2i
. Since0i
and2i
are integer literals, these will be translated into actual numbers instead of uninterpreted constants. This conversion avoids having to do anything special for parsing, printing or for proof reconstruction, as integer literals are already supported andNum
will just be a function like any other from the SMT prover's perspective (that maps integers into natural numbers).num
operations into integer operations which they natively support and can reason about easily (see commit a49c84e09a1b45aca5acf9984fcecda75b127f22 for details).This is really all that's needed to add some initial support for nums, although I'm sure more theorems could be added and/or optimizations could be performed later if necessary. It allows SMT solvers to solve all of the existing
num
self-tests, except for the DIV and MOD-related ones, because integer div and mod are not supported yet (this will be fixed in the next PR).Note that there is a significant regression: by adding the
num
-related theorems as assumptions, SMT solvers cannot come up withsat
results anymore. I've narrowed this down to a single theorem,integerTheory.INT
, which is needed for SMT solvers to reason aboutSUC
. My guess is that they are simply not able to come up with models for the&
andSUC
functions which satisfy the restrictions imposed by the theorem.Since the self-tests rely on
sat
results to detect unprovable goals, and since a user might want to avoid theorems from being added (for performance or debugging reasons, perhaps), I've added a tunable (HolSmtLib.include_theorems := false
) which can be used as an escape hatch to prevent theorems from being automatically added. However, I don't expect this to be something that one would normally use.Z3 proof parser fixes
This was by far the most challenging part of the PR. The
num
support added previously led to Z3 creating more complicated proof certificates that we hadn't observed before. Specifically, Z3 is now mixing and adding proofterms within regular SMT-LIB terms and is nesting proof-specificlet
terms inside the bindings of regular SMT-LIB terms as well. Furthermore, Z3 proof certificates are a long chain of (many thousands of)let
terms, so special care is required to parse these to avoid causing stack overflows.I considered about 5 different ways of fixing this, tried and abandoned implementing 2 of them until I finally reached the current approach, which I think is by far the simplest one, since it avoids duplicating code and is relatively simple. The current approach consists in parametrizing the SMT-LIB term parser by a couple of
let
handler functions, which allows the parser to behave differently depending on whether we're parsing regular SMT-LIB terms or SMT-LIB terms inside Z3 proof certificates.Another issue is that the indices in indexed identifiers could only be numerals but from SMT-LIB 2.5 forward they can also be symbols. This was fixed in a commit prior to this PR (8a2e9bc97c2955bdbda0bceb90948e87a92f5c22). However, Z3 proof certificates can actually contain full SMT-LIB terms as indices (as part of
quant-inst
inference rules), so we now parse them as a list ofTerm.term
instead of as a list of strings.Z3 proof replay fixes
This consisted in the following:
nnf-neg
,nnf-pos
,mp~
,quant-inst
,proof-bind
,refl
andsk
proof rules and thelambda
binder, which we didn't have support for until now. Thennf-neg
andnnf-pos
handlers just invokeMETIS_TAC
for now, andproof-bind
seems to be a no-op, while the other ones were implemented with specialized handlers.quant-intro
andelim-unused
proof rule handlers to also handleexists
quantifiers (besidesforall
quantifiers which they already handled).COOPER_TAC
instead ofARITH_TAC
until the issue is fixed.METIS_TAC
to bridge the gap in the mismatched terms when the theorem doesn't come out as expected.simplify_implies
feature), which would cause these terms in the proof output not to match the terms in the input file that was provided to Z3. This issue had been causing proof replay failures in HolSmt for more than 13 years now (see the comment added at the bottom of commit 522a851b6671b54a4d93d6c91b22490802a2d6d7). The workaround consists in usingMETIS_TAC
andDrule.PROVE_HYP
at the end of the proof replay procedure to remove each hypothesis that doesn't exactly match one of the assumptions. When using a fixed version of Z3, the workaround shouldn't do anything (unless there's another bug) since the hypotheses will match the assumptions.These fixes above allow us to replay the Z3 proofs of the quantifier tests, the double implication tests (exercising one of the Z3 bugs mentioned earlier) and all the arithmetic test cases in the HolSmt test suite, except for the word ones and the div and mod ones (which we don't support yet). I've actually implemented support for
div
andmod
already, but proof replay will require more fixes and this PR was already getting too long so I will leave that for the next one.Note that proof replay is still quite brittle, for a few reasons:
Added an
smtheap
This is a significant usability improvement (especially while developing), as it reduces the time loading "HolSmtLib" from ~42s to ~5s (on my laptop).
Plans for the immediate future
num
theorems and do more testing). Enabling proof replay for these tests will probably require some significant additional fixes (from what I could tell at a glance), which will require some time. I'll probably send a PR when this is done.mk_rewrite
is used in the Z3 source code (currently ~55 instances) and assign a unique integer to each one, which would be printed out in the proof certificate as an index in therewrite
proofterm (e.g.((_ rewrite 23) (= P Q))
instead of just(rewrite (= P Q))
.mk_rewrite
Z3 used in a given proof certificate, which would allow us to quickly and easily identify the place in the Z3 source code where these rewrite terms are being created, which in turn would allow us to more easily identify the shape of rewrite rules that need to be handled by that instance ofmk_rewrite
. It would make it a lot easier to verify that we're handling all the necessary cases.rewrite
rule replay (for the dynamically-created rewrites, not the static ones which a term net can handle easily) since we could now jump directly to a specialized handler for a given rewrite rule (e.g. rewrite rule 23, rewrite rule 15, etc) instead of throwing everything at the wall until something finally works, which is mostly what we do currently in therewrite
handler.Thanks!
cc @tjark