metamath / metamath-knife

Metamath-knife can rapidly verify Metamath proofs, providing strong confidence that the proofs are correct.
Apache License 2.0
25 stars 9 forks source link

Verify definitions #116

Open tirix opened 1 year ago

tirix commented 1 year ago

This is an in progress version of a pull request adding the definition soundness checks. All checks are not implemented yet, and errors are not handled (the function panics if anything goes wrong).

The intend is to follow the algorithm provided by @digama0 in #103 to identify definitions.

Based on the identification of definitions, this also adds a function to export the dependencies of definition into a GraphML file format. Redundant dependencies are not checked (for example, df-ipf includes two conjunctions, so the edge between df-ipf and df-an is doubled).

Currently this algorithm fails for these reasons:

tirix commented 1 year ago

@david-a-wheeler Sorry this is kind of overrules your #103 , but there seemed to be some kind of urgency in introducing these changes now.

david-a-wheeler commented 1 year ago

@tirix - no problem! I started writing definitional soundness code because I wanted the functionality, but I always seem to "run out of time" to complete it. Thanks for picking up the effort, I'm delighted to see it.

tirix commented 1 year ago

@digama0 Could you please confirm that this corresponds to what you were describing?

digama0 commented 1 year ago

I pushed some more modifications, the current error reports are basically all true positives now, and it might help to start working on them in parallel with the checker itself. Summary of issues:

tirix commented 1 year ago

Thanks for all the additions! I think we only have those items to check now:

david-a-wheeler commented 1 year ago

FYI, I've closed my PR #103, I'm expecting this PR to be the solution instead. Thank you @tirix for putting in this effort!

tirix commented 1 year ago

As @digama0 pointed out, there is still a lot to be done, and he's probably contributed more than I did even if I did the initial sketch. This is already quite a big PR, so I would prefer to merge it now, and then flesh out the remaining parts about which variables have to be distinct and which not - which Mario expects to be complex.

david-a-wheeler commented 1 year ago

@tirix -

+1 to merging "functionality so far".

It's more than we had before, and I'd rather merge capabilities by pieces.

digama0 commented 1 year ago

This is just half of an implementation though, it is still under construction. Certainly I wouldn't want it in a release. And I don't see the value of merging a half-completed feature branch.

tirix commented 1 year ago

This is just half of an implementation though, it is still under construction. Certainly I wouldn't want it in a release.

I wanted to make sure there is a clear baseline before launching a big refactoring. We have several applications relying on metamath-knife, and it's always better if they can point to something stable interface-wise. But you're right it's better to have complete functionalities before releasing, even if those functionalities are not used by any customer in any way yet. So we will in order:

And I don't see the value of merging a half-completed feature branch.

Even if we don't have the complete MMJ2 functionality in place yet, this is already "completely" functional in the sense that we can already launch the definition checker, detect a certain number of actual/real definition problems, and there is no crash / known bug in the current code.

Also, I generally prefer small PR, they are easier to review, and we can see the incremental changes made. We have the CI in place to ensure we are not breaking anything. I did not see this as a feature branch, just a single PR, and it has passed my threshold for small incremental change. Maybe the PR name showed too much ambition? :)

Anyway, if you prefer we can also continue to work in this branch, within this PR, until we have the complete functionality, that's fine for me too!

david-a-wheeler commented 1 year ago

If it implements only some of the rules, but you can invoke the code to check those rules, I suggest merging it.

That will give us a more-stable base to build on.

For clarity perhaps we should document "incomplete implementation" in the help info, to make it clear that it only implements some of the checks. That way no one will accidentally depend on it being the full implementation.

digama0 commented 1 year ago

We have several applications relying on metamath-knife, and it's always better if they can point to something stable interface-wise.

Note that the API for definition checking changes in every commit, I wouldn't commit to anything until it's feature-complete. And it has some effects on the Formula API too, since this is the first kind of real theorem-prover-ish thing we are trying to do with metamath-knife inside the main repo.

digama0 commented 1 year ago

If it implements only some of the rules, but you can invoke the code to check those rules, I suggest merging it.

That will give us a more-stable base to build on.

For clarity perhaps we should document "incomplete implementation" in the help info, to make it clear that it only implements some of the checks. That way no one will accidentally depend on it being the full implementation.

When it comes to writing a verifier, I don't like releasing it half baked when only some of the checks required for soundness are implemented. There is a very concrete end point for the implementation. If we want to release something incomplete, then it should err on the side of more false positives, but we already know that this will cause thousands of errors in set.mm so it won't be so helpful.

digama0 commented 1 year ago

Things are almost done now. All the DV checks are implemented, as well as the axiom / def naming checker, which revealed some additional issues that have been pushed to metamath/set.mm#3247 (in particular I put some things as primitive that were actually defined but where the definition was far away from the syntax and other disallowed stuff was in between). There are now three (expected) errors:

warning: Axioms should start with 'ax-'
     --> ../mm/set.mm:24406:3
      |
24406 |   df-clab $a |- ( x e. { y | ph } <-> [ x / y ] ph ) $.
      |   ------- This was identified as an axiom, but it doesn't start with 'ax-'
      |
warning: Axioms should start with 'ax-'
     --> ../mm/set.mm:24478:5
      |
24478 |     df-cleq $a |- ( A = B <-> A. x ( x e. A <-> x e. B ) ) $.
      |     ------- This was identified as an axiom, but it doesn't start with 'ax-'
      |
warning: Axioms should start with 'ax-'
     --> ../mm/set.mm:24547:5
      |
24547 |     df-clel $a |- ( A e. B <-> E. x ( x = A /\ x e. B ) ) $.
      |     ------- This was identified as an axiom, but it doesn't start with 'ax-'
      |

which is basically what I've been saying all along. Options now are to add some kind of override, or change the names.

tirix commented 1 year ago

Options now are to add some kind of override, or change the names.

This deserves a wider discussion. I think Norm wanted to keep the ax- prefix for actual axioms of set theory, so I prefer the former.

At least df-cleq can clearly be linked with the wceq syntax, and df-clel with wcel, why do they need to be declared primitives? For df-clab, obviously it's a special case.

Interestingly a definition command appears in set.mm, even though the metamath-knife def checker does not use it:

  $( Register '<->' as an equality for its type (wff). $)
  $( $j
    equality 'wb' from 'biid' 'bicomi' 'bitri';
    definition 'dfbi1' for 'wb';
  $)

I did not find its trace in MMJ2. Did I miss something?

tirix commented 1 year ago

Interestingly a definition command appears in set.mm

Actually this is defined in j_syntax.html:

$j definition 'DEFTHM' for 'SYNTAX';

Declare a theorem to be a definition. DEFTHM is a $p statement which represents an alternative definition for the syntax represented by the $a statement SYNTAX. The definition should have a top level equality declared by the equality command with the definition on the right hand side. (This command is only needed when the definition does not immediately follow the syntax itself, which in set.mm only occurs for df-bi. For most definitions we can automatically infer the requisite structure.)

digama0 commented 1 year ago

At least df-cleq can clearly be linked with the wceq syntax, and df-clel with wcel, why do they need to be declared primitives?

The answer to this is basically given by the same tool that is producing the output above. Being 'linked' to an axiom that does not match the requirements for a definition is not sufficient. In fact, the ax/df check is done early, before we have even done most of the fancy work on definitions: df-cleq fails immediately for failing the test "it introduces exactly one new symbol and is the first axiom to refer to that symbol". wceq has been used for hundreds of prior theorems before df-cleq comes along so it is obviously ineligible. The story is the same for df-clel.

Interestingly a definition command appears in set.mm, even though the metamath-knife def checker does not use it:

This was part of an alternative design for the definition justification checker, which would parse the definition theorem to supply the definition body, and then ensure that the justification matches. In the design implemented here, the justification is unified against the definitional axiom to obtain the LHS -> RHS mapping, so the definition is not needed. We can validate it if you think it is worth it though.

jkingdon commented 1 year ago

Options now are to add some kind of override, or change the names.

I would be a bit disappointed if we get stuck on which of these two to choose and end up not getting the definition verifier merged.

Based on my limited understanding, I guess an override is what we have been doing (maybe expressed in words, maybe in the mmj2 definition checker), and a name change is seemingly more intellectually honest in light of things like https://us.metamath.org/mpeuni/bj-ax8.html .

Disclaimer: although I have looked at things like https://math.stackexchange.com/questions/231087/what-can-i-do-with-proper-classes and https://en.wikipedia.org/wiki/Zermelo%E2%80%93Fraenkel_set_theory#Virtual_classes I'm not sure I understand this fully. I do gather that df-clab , df-clel , and df-cleq do not give us the ability to (say) quantify over classes, so in that sense the classes are still virtual. And I suppose maybe it would be true that one way out would be separate notations for = and e. for sets versus classes (like we already have for [ versus [.). I don't know how much of the problem this would solve, but I do suspect it would be inconvenient, or at least require changing a lot of proofs.

I'm not aware of any quick fix other than "some kind of override, or change the names" so we probably are left with those two choices for now regardless of what we do longer term.

digama0 commented 1 year ago

Quick comment, this PR isn't blocked on that issue (alone), I have some local work which continues the main implementation which got pushed off of my "to do immediately" list by something else, but which I plan to return to soon. I did prepare a version which just stubs out the missing part of the checker so that it could be merged, but as expected this leaves about 1300 errors, so it won't be immediately usable.

jkingdon commented 1 year ago

Quick comment, this PR isn't blocked on that issue (alone)

Thanks for the clarification. I'm mostly thinking of trying to coordinate:

  1. The definition work (however much of it we are able/interested in merging soon)
  2. Splitting into two or more crates, and
  3. Adding the checker for $j usage

Not that we have a detailed plan for exactly who does all of this and how but those seem like three of the most wanted enhancements that I've noticed.

digama0 commented 1 year ago

I'd prefer to wait on (2) until most of the in-flight additions have been merged. For (3) I think it can be worked on independently of this issue, they are not really conflicting.

jkingdon commented 1 year ago

I'd prefer to wait on (2) until most of the in-flight additions have been merged. For (3) I think it can be worked on independently of this issue, they are not really conflicting.

Makes sense. Looks like (2) is being discussed at #117 and (3) has a pull request at #118 .

tirix commented 7 months ago

I'm in favor of merging this first, and handling the remaining questions over df-clab, df-clel and df-cleq separately.

If this is merged, we could already implement a metamath-knife based definition checker in set.mm`s CI by just ignoring the 3 exceptions, until they are resolved.

jkingdon commented 7 months ago

I'm in favor of merging this first, and handling the remaining questions over df-clab, df-clel and df-cleq separately.

Just to confirm something I think I know the answer to: is the definition checker complete (assuming that it isn't supposed to handle those three)? That is, does it implement the rules at "Additional rules for definitions" in https://us.metamath.org/mpeuni/conventions.html ?

If this is merged, we could already implement a metamath-knife based definition checker in set.mm`s CI by just ignoring the 3 exceptions, until they are resolved.

I mean, this is what we do for mmj2, right? If so, it is hard to see merging this, and adding to CI, as a step backward.

digama0 commented 7 months ago

Since I have had a WIP commit sitting in my local copy for too long, I've pushed it to https://github.com/tirix/metamath-knife/compare/verify_definitions...metamath:metamath-knife:verify_definitions_2 .

tirix commented 7 months ago

So would @digama0 you like me to create a pull request with those changes, review them, and merge them to this branch?