HOL-Theorem-Prover / HOL

Canonical sources for HOL4 theorem-proving system. Branch develop is where “mainline development” occurs; when develop passes our regression tests, master is merged forward to catch up.
https://hol-theorem-prover.org
Other
606 stars 132 forks source link

HolSmt: fix Z3 proof replay in real arithmetic with rational coefficients #1188

Closed someplaceguy closed 5 months ago

someplaceguy commented 5 months ago

Sorry, while working on the newer Z3 proof reconstruction, I unexpectedly ran into another easy fix, hence this PR.

Namely, due to a comment in the self-tests I was under the impression that Z3 2.19 wasn't printing real constants correctly, but it turns out that it does (I suspect the issue may be with Z3 2.19.1 instead). Hence, it is possible to fix the proof reconstruction for the remaining real-related test cases.

After a quick investigation, it was easy to figure out the problem.

I will reproduce the initial part of the commit message here (see the commit for more details):

It turns out that the existing Z3 proof reconstruction code was using RealArith.REAL_ARITH to replay linear arithmetic steps in proofs.

However, RealArith.REAL_ARITH only handles integral coefficients and therefore proof steps such as (x: real) / 1r = x were failing.

To fix this, we simply switch to RealField.REAL_ARITH, which supports rational coefficients.

This allows us to enable proof reconstruction for 3 more self-tests, all related to the reals. This also means that all real-related tests in the self-tests now work with proof reconstruction.

Another commit was also added that simply eliminates a few self-tests that were duplicated.

Edit: instead of removing the test cases, they were fixed to do what was almost surely intended, which is to test the reals in the same way the nums and ints are being tested.

Thanks!

someplaceguy commented 5 months ago

Oh, I just noticed that realLib.REAL_ARITH also exists:

  fun REAL_ARITH tm      = RealArith.OLD_REAL_ARITH tm
                           handle HOL_ERR _ => RealField.REAL_ARITH tm;

I'm sure you're aware of this, but for anyone who might not be aware, RealArith.OLD_REAL_ARITH is described in src/real/selftest.sml as:

The original version and old port by Hurd

I wonder why realLib's version first tries the old port and only if it fails, it tries the new one. Is it because the old port is faster (or solves different goals)?

If the realLib version is preferred, I can switch the code to use realLib.REAL_ARITH rather than RealField.REAL_ARITH directly.

binghe commented 5 months ago

The reason that realLib's REAL_ARITH first tries the old port and then the new one (after #1043), is because the old port can solve some (less interesting) inputs that the new port (same as HOL-Light's current version) cannot. To strictly align with HOL-Light's REAL_ARITH (for porting proofs from HOL-Light to HOL4), I chose to not (this is not actually true, as further explained in later comments, and the old port is 2x-3x faster) further enhance the new port to cover those (known) extra inputs.

When HolSmt was written, HOL4's REAL_ARITH was the old port, and the original author of HolSmt has only REAL_ARITH with integral coefficients. I'm glad to see that the new port of REAL_ARITH can also enrich the set of solvable cases of HolSmt.

binghe commented 5 months ago

Your current choice of using RealField.REAL_ARITH is good, I think. It's slightly slower but given the HolSmt code has no ability to actively choose different versions (by checking the coefficients), it has no choice but to always use the full version.

The interface provided by realLib is basically for users who doesn't care which version to use.

someplaceguy commented 5 months ago

Thank you for the background info, that clarifies things for me!

So it sounds like it'd be better to use realLib rather than RealField, since that would allow HolSmt to reconstruct more proofs.

I will fix the PR accordingly. Thanks!

Edit: I just saw your latest comment. Now I am unsure again!

binghe commented 5 months ago

Thank you for the background info, that clarifies things for me!

So it sounds like it'd be better to use realLib rather than RealField, since that would allow HolSmt to reconstruct more proofs.

I will fix the PR accordingly. Thanks!

Edit: I just saw your latest comment. Now I am unsure again!

There's no difference (on the number of solvable cases) for your two choices at this moment, because I don't think the extra problems that OLD_REAL_ARITH can solve, may appear inside HolSmt. Those "extra" problems solvable by OLD_REAL_ARITH are something like |- f (x + x) = f (2 * x), where f is an arbitrary uninterpreted function irrelevant to reals. Basically, by doing some extra rewritings, the new REAL_ARITH can also solve them.

someplaceguy commented 5 months ago

Ok, sorry, I was really confused for a moment.

Those "extra" problems solvable by OLD_REAL_ARITH are something like |- f (x + x) = f (2 * x), where f is an arbitrary uninterpreted function irrelevant to reals.

Yes, I think those kind of problems (such as the one you mentioned) are already being handled by other proof rules (like monotonicity, in this case), so I think RealField should be enough, as you say. And of course, if it turns out not to be the case, it can always be fixed later if necessary.

So I will leave the PR as it is then.

Thank you very much and sorry for the noise!

binghe commented 5 months ago

Ok, sorry, I was really confused for a moment.

Those "extra" problems solvable by OLD_REAL_ARITH are something like |- f (x + x) = f (2 * x), where f is an arbitrary uninterpreted function irrelevant to reals.

Yes, I think those kind of problems (such as the one you mentioned) are already being handled by other proof rules (like monotonicity, in this case), so I think RealField should be enough, as you say. And of course, if it turns out not to be the case, it can always be fixed later if necessary.

So I will leave the PR as it is then.

Thank you very much and sorry for the noise!

OK, I check the code, the known cases that HOL-Light's latest version cannot solve were something like |- closed {x | 1 * x = a} <=> closed {x | x = a}, found when building core library with the new port. Then I put some extra code into RealArith.sml (at line 2361) to make sure such cases can also be solved. But there's no way to know what else problems that the old port can do but the new one cannot, because their implementations are completely separate, no shared code at all.

Latest HOL-Light:

# REAL_ARITH `{x | 1 * x = a} = {x | x = a}`;;
Exception: Failure "linear_ineqs: no contradiction".

HOL4:

> RealArith.REAL_ARITH ``{x | 1 * x = a} = {x | x = a}``;
<<HOL message: more than one resolution of overloading was possible>>
val it = |- {x | 1 * x = a} = {x | x = a}: thm

HOL4 can do it only because I added one line (2361) in RealArith.sml with some historical notes:

    (* NOTE: this step of POLY_CONV helps by cutting off real arith terms
       hidding in propositional terms, e.g. ‘closed {x | 1 * x = a}’ will
       be simplified to ‘closed {x | x = a}’ before going to NNF_CONV.
       Some HOL4 proofs rely on this. *)
    TOP_DEPTH_CONV POLY_CONV THENC

This is mostly to make sure that unknown user proofs using the old REAL_ARITH continue to work, although the proof is easily fixable.

someplaceguy commented 5 months ago

Wait, I made a mistake with those test cases after all. The test cases are duplicated indeed, but now I think the correct solution is not to remove the duplicates, but rather to convert them to use reals instead of nums, so that the same pattern of tests is followed for both nums and reals.

I will fix this PR soon to do that.

Thank you for bringing this to my attention @binghe, I'd almost surely have missed this issue otherwise. It's easy to get lost in the sea of tests :sweat_smile:

someplaceguy commented 5 months ago

Update: instead of eliminating the test cases, I've changed them to cover the reals now, which is almost surely what was originally intended: to test these inequalities in the reals in the same way that they are being tested in the nums and the ints.

Thanks!

mn200 commented 5 months ago

Thanks both!