david-a-wheeler / mmverify.py

Metamath verifier in Python
MIT License
35 stars 10 forks source link

Issues of the hidden state #22

Open jp-guo opened 11 months ago

jp-guo commented 11 months ago

Hi, David. This repo is very helpful for my research. Previously I worked on Lean, which has corrsponding "state" after each command input. These states can be used as inputs to machine learning models to train and predict the next step. I am wondering if metamath has the same property, and whether there exists methods to represent these hidden states explicitly. I sincerely appreciate any insight you can provide.

tirix commented 11 months ago

Hi, I'll jump in an give an answer, but others (David included) might too. First you might prefer to send such questions on our mailing list, where you'd reach more people. The more active GitHub repository is set.mm, which is probably also a good channel.

A good first step would be to check the existing literature. There were already some attempts and publications done on automatic theorem proving using metamath, notably with GPT-3. Our Google Group also has discussions about how to facilitate machine learning from Metamath proofs.

As far as hidden states in metamath are concerned, I believe a good starting point could be the chapter 4.3 of the Metamath book, "the Anatomy of a Proof", which explains how the inner state of a metamath verifier is a stack of statements.

Another idea would be to work on the "MMP" textual representation of proofs as discussed in the mailing list. For example, you could use the -e option of metamath-knife to export proofs in the MMP format. The command

metamath-knife set.mm -e nnaddcl

Creates a file named nnaddcl.mmp which looks like this:

$( <MM> <PROOF_ASST> THEOREM=nnaddcl  LOC_AFTER=?

* Closure of addition of positive integers, proved by induction on the
  second addend.  (Contributed by NM, 12-Jan-1997.)

1::oveq2                   |- ( x = 1 -> ( A + x ) = ( A + 1 ) )
2:1:eleq1d                |- ( x = 1 -> ( ( A + x ) e. NN <-> ( A + 1 ) e. NN ) )
3:2:imbi2d               |- ( x = 1 -> ( ( A e. NN -> ( A + x ) e. NN ) <-> ( A e. NN -> ( A + 1 ) e. NN ) ) )
4::oveq2                   |- ( x = y -> ( A + x ) = ( A + y ) )
5:4:eleq1d                |- ( x = y -> ( ( A + x ) e. NN <-> ( A + y ) e. NN ) )
6:5:imbi2d               |- ( x = y -> ( ( A e. NN -> ( A + x ) e. NN ) <-> ( A e. NN -> ( A + y ) e. NN ) ) )
7::oveq2                   |- ( x = ( y + 1 ) -> ( A + x ) = ( A + ( y + 1 ) ) )
8:7:eleq1d                |- ( x = ( y + 1 ) -> ( ( A + x ) e. NN <-> ( A + ( y + 1 ) ) e. NN ) )
9:8:imbi2d               |- ( x = ( y + 1 ) -> ( ( A e. NN -> ( A + x ) e. NN ) <-> ( A e. NN -> ( A + ( y + 1 ) ) e. NN ) ) )
10::oveq2                  |- ( x = B -> ( A + x ) = ( A + B ) )
11:10:eleq1d              |- ( x = B -> ( ( A + x ) e. NN <-> ( A + B ) e. NN ) )
12:11:imbi2d             |- ( x = B -> ( ( A e. NN -> ( A + x ) e. NN ) <-> ( A e. NN -> ( A + B ) e. NN ) ) )
13::peano2nn             |- ( A e. NN -> ( A + 1 ) e. NN )
14::peano2nn                |- ( ( A + y ) e. NN -> ( ( A + y ) + 1 ) e. NN )
15::nncn                      |- ( A e. NN -> A e. CC )
16::nncn                      |- ( y e. NN -> y e. CC )
17::ax-1cn                     |- 1 e. CC
18::addass                     |- ( ( A e. CC /\ y e. CC /\ 1 e. CC ) -> ( ( A + y ) + 1 ) = ( A + ( y + 1 ) ) )
19:17,18:mp3an3               |- ( ( A e. CC /\ y e. CC ) -> ( ( A + y ) + 1 ) = ( A + ( y + 1 ) ) )
20:15,16,19:syl2an           |- ( ( A e. NN /\ y e. NN ) -> ( ( A + y ) + 1 ) = ( A + ( y + 1 ) ) )
21:20:eleq1d                |- ( ( A e. NN /\ y e. NN ) -> ( ( ( A + y ) + 1 ) e. NN <-> ( A + ( y + 1 ) ) e. NN ) )
22:14,21:syl5ib            |- ( ( A e. NN /\ y e. NN ) -> ( ( A + y ) e. NN -> ( A + ( y + 1 ) ) e. NN ) )
23:22:expcom              |- ( y e. NN -> ( A e. NN -> ( ( A + y ) e. NN -> ( A + ( y + 1 ) ) e. NN ) ) )
24:23:a2d                |- ( y e. NN -> ( ( A e. NN -> ( A + y ) e. NN ) -> ( A e. NN -> ( A + ( y + 1 ) ) e. NN ) ) )
25:3,6,9,12,13,24:nnind |- ( B e. NN -> ( A e. NN -> ( A + B ) e. NN ) )
qed:25:impcom          |- ( ( A e. NN /\ B e. NN ) -> ( A + B ) e. NN )

In this representation, each line represents one atomic step in the proof. The final step qed is the final statement of the theorem. This representation is rather a graph or tree where is node holds pointers to its predecessors (like e.g. 20 requires 15,16 and 19 to be proven). Any partial extract of this file, like for example only lines 1-20, could be an internal/intermediate step for an automatic prover. This format is also the one used for some proof assistants like Yamma and MMJ2.

I hope this helps!

jp-guo commented 11 months ago

Great! The community and methods you provide are very helpful. Thanks for your reply!