VERIFY PROOF bug - Githubissues

GinoGiotto commented 1 year ago

Hello, this is the bug report I presented on the Metamath group: https://groups.google.com/g/metamath/c/P-QEef6cz5k/m/-MoVqgAuAgAJ

I made a simple file called "example.mm" that I attached here ( I changed the format to .txt because github doesn't allow me to upload .mm files).

This file contains only one theorem, called "th" which has an incorrect proof.

The issue is that the command "verify proof *" doesn't see the mistake, here is the log sequence:

Metamath - Version 0.198 7-Aug-2021           Type HELP for help, EXIT to exit.
MM> read example.mm
Reading source file "example.mm"... 122 bytes
122 bytes were read into the source buffer.
The source has 7 statements; 2 are $a and 1 are $p.
SET EMPTY_SUBSTITUTION was turned ON (allowed) for this database.
No errors were found.  However, proofs were not checked.  Type VERIFY PROOF *
if you want to check them.
MM> VERIFY PROOF *
0 10%  20%  30%  40%  50%  60%  70%  80%  90% 100%
..................................................
All proofs in the database were verified in 0.01 s.
MM> show proof th /lemmon /all
1 ax             $a axiom
2 1 mp           $a thesis
MM>

But if I enter the proof assistant, it detects that something is wrong:

MM> prove th
Entering the Proof Assistant.  HELP PROOF_ASSISTANT for help, EXIT to exit.
?Error in step 2:  Could not simultaneously unify the hypotheses of "mp":
    $|$ hypothesis $|$
with the following statement list:
    $|$ axiom $|$
(The $|$ tokens are internal statement separation markers)
Zapping targets so we can proceed (but you should exit the Proof Assistant and
fix this problem)
(This may take a while; please wait...)
Step 1 cannot be unified.  THERE IS AN ERROR IN THE PROOF.
You will be working on statement (from "SHOW STATEMENT th"):
7 th $p thesis $= ... $.
Note:  The proof you are starting with is already complete.
MM-PA> show new_proof /lemmon /all
1 ax             $a hypothesis
                  = axiom
2 1 mp           $a thesis
MM-PA>

I think the sentence "Note: The proof you are starting with is already complete." shouldn't appear in this case since the proof is not really complete.

I'm using the latest Metamath version downloaded from https://us.metamath.org/#downloads

Regards

Gino

benjub commented 1 year ago

For what it's worth, as mentioned at https://groups.google.com/g/metamath/c/P-QEef6cz5k/m/zKb59h00AgAJ, the Python verifier mmverify.py correctly rejects the proof: $ python3 mmverify.py example.mm --verbosity 20 gives __main__.MMError: Proof stack entry ['axiom'] does not match essential hypothesis ['hypothesis'].

Also, setting EMPTY_SUBSTITUTION to OFF in metamath.c does not change anything, so the problem likely does not come from that.

benjub commented 1 year ago

For the sake of self-containment, the content of the mm file is:

$c axiom hypothesis thesis $.

ax $a axiom $.

${
   hyp $e hypothesis $.
   mp $a thesis $.
$}

th $p thesis $= ax mp $.

I changed the file a bit on my computer, and in particular changed the names of all labels, constants and variables, and the file is still incorrectly verified, so the bug does not come from a clash with a "magic word" hard-coded in the source. I also changed indentations, spaces and line breaks, and added some noise like extra comments, declarations, axioms, (in)correct (in)complete proofs, with always the same result.

I also recompiled metamath with gcc without optimizations ($ gcc m*.c -o metamath -Wall), though I did not try other compilers like clang or compcert.

If you delete the proof and try to prove it with the proof assistant MM-PA, it doesn't let you write that incorrect proof.

If I could put labels to issues in this repo, then for this issue I would use the label CRITICAL.

digama0 commented 1 year ago

@benjub I've given you write access to the repo since you have been helpful triaging issues and making PRs, but you should still put your PRs through review.

tirix commented 1 year ago

My quick thoughts, just looking at the theorem source file: The keywords axiom, hypothesis, and thesis are all type codes, similar to |-, wff, setvar or class in set.mm. In Metamath, those typecodes are usually just prefixes informing about the type of the rest of the statement, and substitutions only start after those typecode symbol. So metamath.exe is maybe ignoring them and only looking at the rest of the statements, which is, for all those statements, an empty formula. Since all formulas are empty, metamath.exe maybe considers them equal and validates the theorem, without looking further.

So to fix this, I would look for missing "typecode checks" when checking substitutions.

digama0 commented 1 year ago

Nope, that's what I thought as well but the issue replicates even without any typecode funny business:

$c |- A B C $.

ax $a |- A $.

${
   hyp $e |- B $.
   mp $a |- C $.
$}

th $p |- C $= ax mp $.

benjub commented 1 year ago

It looks like the problem comes from the complete absence of variables. If you add a variable, say fx $f E x $. and you add it either in the hypothesis or the conclusion of mp, that is, either mp $a |- C x or hyp |- B x with correct adjustments (e.g. ax $a|- A x, and always write a correct proof to serve as a "control theorem" (e.g., th2 $p |- C $= fx fx h mp $. with h the $a-statement identical to hyp), then metamath always detects the error.

So it looks like there is a check that gets triggered whenever a part of the program sees a variable, but which should be triggered even if no variables are present.

david-a-wheeler commented 1 year ago

For what it's worth, as mentioned at https://groups.google.com/g/metamath/c/P-QEef6cz5k/m/zKb59h00AgAJ, the Python verifier mmverify.py correctly rejects the proof:

Obviously this is a bad bug in metamath-exe, and hopefully #127 will completely fix it.

That said, this is a good argument for why we run multiple provers. The verification has to pass all provers used, not just one. That's not impossible, but hopefully that reduces the likelihood of it, and now we have an example of this situation.

benjub commented 1 year ago

@digama0 : Thanks for the fix ! I'm happy that was the good diagnosis. Does this justify a v0.199 ?

digama0 commented 1 year ago

I was also thinking of releasing a new version. I'm going to skip to v0.200, both because it's a round number to signify new maintenance, and also because there were some rumors of last minute work by Norm which will be released as v0.199 if it is unearthed in the future.

Moving forward, I would like to move to a more continuous development model though, setting up the build script to put the git commit and commit date itself into the program so that we don't need to have versions as such. That will take some work to figure out though, I suppose.

david-a-wheeler commented 1 year ago

I think have version #s for end-users is helpful.

We already use a continuous development model for the website. The commit verification process, and the website daily regeneration process always downloads the current version of the primary (master) branch.

digama0 commented 1 year ago

If metamath prints Metamath v0.200 2023-01-24 (commit f00fbaba) then I think the version number becomes fairly cosmetic, the latter information is actually what we want to use for debugging. Currently this is an issue since right now there are many versions of metamath being used by the website and downloaded etc which all have the same header text, so when you are looking at a bug report you can't tell which version is involved.

At least, I have a poor track record for determining when to cut a release, so I like it when the development model doesn't have that as a requirement.

GinoGiotto commented 1 year ago

Just noticed this issue I presented has been closed. Thanks for fixing this bug!

metamath / metamath-exe

VERIFY PROOF bug #107