metamath / set.mm

Metamath source file for logic and set theory
Other
251 stars 88 forks source link

Extract common proof patterns into lemmata #2710

Closed savask closed 2 years ago

savask commented 2 years ago

Some statements are often reproved in set.mm over and over. The following list was identified with computer search:

https://gist.github.com/savask/dce5b48b7317e8144635fda09e983fae

The first column is 'score' which correlates with the potential amount of saved bytes, second is the number of essential hypotheses, and the third is the list of essential steps of the proof, see https://groups.google.com/g/metamath/c/aXvD9rs36Ps for details.

Per suggestion of @benjub this issue will be used to record which theorems were incorporated into set.mm and which were deemed unworthy. By line number in the gist:

After a first batch of additions, the computer search will be re-run to update the list and the scores.

benjub commented 2 years ago

I looked at Line 107 on that list (as promised in https://github.com/metamath/set.mm/pull/2705#issuecomment-1193286776) and it gives:

  ${
    $d u x $.  $d u y $.  $d v x $.  $d v y $.  $d w x $.  $d w y $.
    $( Lemma for ~ axrep2 and ~ axrep3 . (Contributed by BJ, 6-Aug-2022.) $)
    axreplem $p |- ( x = y ->
                  ( E. u ( ph -> A. v ( ps <-> E. w ( z e. x /\ ch ) ) ) <->
                    E. u ( ph -> A. v ( ps <-> E. w ( z e. y /\ ch ) ) ) ) ) $=
      ( weq wel wb wa wex wal wi elequ2 anbi1 exbidv bibi2d albidv imbi2d syl )
      DEJFDKZFEKZLZABUDCMZGNZLZHOZPZINABUECMZGNZLZHOZPZINLDEFQUFUKUPIUFUJUOAUFU
      IUNHUFUHUMBUFUGULGUDUECRSTUAUBSUC $.
  $}

It shortens axrep2 and axrep3 by a total of 160 bytes, which is more than the proof length (total number of proof steps) of axreplem, so it "pays for itself" (barely). The proof of this kind of "building lemma" is basically automatic, but still better to prove it once than twice. So I slightly tend towards adding it to set.mm (especially since there are no other examples of that sort in the list). What do @savask and @avekens think ?

avekens commented 2 years ago

I looked at Line 107 on that list (as promised in #2705 (comment)) and it gives:

  ${
    $d u x $.  $d u y $.  $d v x $.  $d v y $.  $d w x $.  $d w y $.
    $( Lemma for ~ axrep2 and ~ axrep3 . (Contributed by BJ, 6-Aug-2022.) $)
    axreplem $p |- ( x = y ->
                  ( E. u ( ph -> A. v ( ps <-> E. w ( z e. x /\ ch ) ) ) <->
                    E. u ( ph -> A. v ( ps <-> E. w ( z e. y /\ ch ) ) ) ) ) $=
      ( weq wel wb wa wex wal wi elequ2 anbi1 exbidv bibi2d albidv imbi2d syl )
      DEJFDKZFEKZLZABUDCMZGNZLZHOZPZINABUECMZGNZLZHOZPZINLDEFQUFUKUPIUFUJUOAUFU
      IUNHUFUHUMBUFUGULGUDUECRSTUAUBSUC $.
  $}

It shortens axrep2 and axrep3 by a total of 160 bytes, which is more than the proof length (total number of proof steps) of axreplem, so it "pays for itself" (barely). The proof of this kind of "building lemma" is basically automatic, but still better to prove it once than twice. So I slightly tend towards adding it to set.mm (especially since there are no other examples of that sort in the list). What do @savask and @avekens think ?

I think for a lemma it is OK. If it was a standalone theorem (if it can be used in other proofs, which I do not see at the momen), I would suggest to replace the setvar variables x,y,z by class variables X,Y,Z, but this is not necessary for a lemma.

savask commented 2 years ago

It shortens axrep2 and axrep3 by a total of 160 bytes, which is more than the proof length (total number of proof steps) of axreplem, so it "pays for itself" (barely).

I thought a theorem "pays for itself" if the total amount of bytes in set.mm decreases (see "Acceptable shorter proofs" in https://us.metamath.org/mpeuni/conventions.html). In this case axreplem takes about 500 bytes, so the size of set.mm increases considerably. In my opinion, this lemma doesn't have a great mathematical meaning by itself, so I suggest to scrap it.

benjub commented 2 years ago

Too late, @savask ! I just PR'd it, after @avekens's approval. It's true that it's borderline, and with the statement and comment, the total size increases... I plan to make the most of this (single of its kind) addition by using it as an illustrative example and putting a comment in albidv or exbidv like "For an example of building various equality theorems from albi, exbi, anbi, imbi, bibi, orbi theorems, see ~ axreplem."

@avekens : I stick with setvar variables instead of class variables in order to use only elequ2 instead of eleq2, reducing axiom usage and size of the proofs using it (those proofs have to use three extra ~cv in the latter case)

benjub commented 2 years ago

@avekens : since you did fvoveq1 (line 23), can you look, when you have time, at all the lines in the list which use a *veq* theorem and have at most one essential hypothesis ? These are lines 55, 58, 63, 69, 90, 91.

avekens commented 2 years ago

I reconstructed the following theorem from line 55 (["oveq2","oveq1d","fveq2d","oveq1d","oveq12d"]):

  ov2fvov3eq24 $p |- (  B = X
           -> ( ( A .o. B ) O ( ( F ` ( ( C .x. B ) .X. D ) ) .+ E ) )
            = ( ( A .o. X ) O ( ( F ` ( ( C .x. X ) .X. D ) ) .+ E ) ) ) $=
    ( wceq co cfv oveq2 oveq1d fveq2d oveq12d ) BKMZABLNAKLNCBFNZDGNZ
    IOZHENCKFNZDGNZIOZHENJBKALPTUCUFHETUBUEITUAUDDGBKCFPQRQS $.

This is not only looking awkward (especially because it contains 5 operations, and it was difficult to find appropriate symbols to represent them, and also to find an adequate name for this theorem, meaning "2 ov, 1 fv, 3 ov with the 2nd and 4th operators replaced by equal classes"), but I cannot find any proof which can be shortened by it (I ran the minimize script for all proofs containing oveq, except for the fourier theorems).

Therefore, I wonder why this theorem found a way into @savask's list. Or is my reconstruction wrong?

benjub commented 2 years ago

I'll let @savask elaborate, but naively searching for "oveq2 oveq1d fveq2d oveq12d" in set.mm, I get five results, only fourierdlem (of course there could be more minimizable proofs since I did not account for line breaks, and labels do not even have to appear in that order, but that gives an idea of the kind of expressions dealt with). See for instance Step 49 in http://us2.metamath.org/mpeuni/fourierdlem96.html. So lemmas like that could be added by @glacode in his mathbox as a new fourierdlem (no need to say, an automatic minimization would take time...)

savask commented 2 years ago

Therefore, I wonder why this theorem found a way into @savask's list.

@benjub is right, this proof sequence appears in fourierdlem* theorems, namely in

fourierdlem26, fourierdlem51, fourierdlem71, fourierdlem89, fourierdlem90, fourierdlem91, fourierdlem96, fourierdlem97, fourierdlem98, fourierdlem99, fourierdlem105, fourierdlem106, fourierdlem108, fourierdlem110, fourierdlem115

The theorem does look weird, so maybe we can leave it to Glauco to decide.

glacode commented 2 years ago

As a side note, are you simply searching the text in set.mm ?

As far as I understand, a sequence of labels in a compressed proof, does not "translate" to that sequence existing in the uncompressed (real) proof.

metamath.exe and mmj2 (and the tool I'm working on) produce the same uncompressed proof, but different compressed proofs (because the order of the labels is "scrambled").

In other words, if you open mmj2 , do ctrl+g to load an existing proof and then ctrl+u, you will see that the uncompressed proof from metamath.exe is identical to the proof from mmj2, but the compressed proofs are not (for nontrivial proofs).

My tool does the exactly same: produces identical uncompressed proof, but different compressed proof.

But both mmj2 and my tool compressed proofs are accepted by metamath.exe (at least, for the tests I've run so far).

digama0 commented 2 years ago

The order of labels produced by mmj2 is not random, and in fact should be the same between metamath.exe and mmj2 (and metamath-knife). If I recall correctly, it sorts the labels by number of references, then uses a knapsack algorithm to pack the labels nicely inside the allowed width.

savask commented 2 years ago

As a side note, are you simply searching the text in set.mm ?

No. There are some details in the original google groups post, but essentially I take a compressed proof, parse it into a list of numbers, convert each number to the relevant theorem label (or put a blank, if it refers to some earlier step or hypothesis), and collect subsequences of labels which are not separated by blanks.

Such a subsequence does correspond to an uncompressed proof, so everything is legit.

avekens commented 2 years ago

I tried a second variant of a reconstruction from line 55 (["oveq2","oveq1d","fveq2d","oveq1d","oveq12d"]), which actually has a hypothesis, as indicated in the list:

    ov2fvov3eq24d.1 $e |- ( B = X -> A = Y ) $.
    ov2fvov3eq24d $p |- ( B = X
                      -> ( A .o. ( ( F ` ( ( C .x. B ) .X. D ) ) .+ E ) )
                       = ( Y .o. ( ( F ` ( ( C .x. X ) .X. D ) ) .+ E ) ) ) $=

This theorem does look even more weird (what should `( B = X -> A = Y ) be good for?), and still cannot be used for minimizations (except for fourier* theorems?). Therefore, I will quit working on these theorems.

Then I looked at line 58 (["oveq2","eqeq1d","anbi12d","ralbidv"]), and I got the following theorem in a first try:

    2ovanraleqv $p |- (  B = X
                   -> ( A. x e. V ( ( A .x. B ) = C /\ ( D .X. B ) = E )
                    <-> A. x e. V ( ( A .x. X ) = C /\ ( D .X. X ) = E ) ) ) $=

With this theorem, without hypotheses, no proof can be shortened (except maybe proofs for fourier*).

Then I tried the following variant (with one hypothesis):

    $d B x $.  $d X x $.
    ovanraleqv.1 $e |- ( B = X -> ( ph <-> ps ) ) $.
    ovanraleqv $p |- ( B = X -> ( A. x e. V ( ph /\ ( A .x. B ) = C )
                               <-> A. x e. V ( ps /\ ( A .x. X ) = C ) ) ) $=
      ( wceq co wa oveq2 eqeq1d anbi12d ralbidv ) EIKZADEGLZFKZMBDIGLZFKZMCHRAB
      TUBJRSUAFEIDGNOPQ $.

Having this theorem, actually 21 proofs can be shortened (saving 774 bytes), so I will provide it (together with the shortened proofs) in my next PR.

My latest experiments have shown that it is not (always) easy to reconstruct a useful theorem from @savask's list, and each one must be analysed very carefully.

savask commented 2 years ago

@avekens Sorry, I didn't pay enough attention to the precise statement of ov2fvov3eq24 in your previous post. Yes, your previous reconstruction was incorrect (you used oveq2 twice). You can see it both in mmj2 and in the uncompressed proof.

The ov2fvov3eq24d version is correct and it does minimize, for example, fourierdlem26.

Then I looked at line 58 (["oveq2","eqeq1d","anbi12d","ralbidv"]), and I got the following theorem in a first try:

Again, the second column at the gist says that you should have 1 essential hypothesis, so this already says that the reconstruction is incorrect.

My latest experiments have shown that it is not (always) easy to reconstruct a useful theorem from @savask's list, and each one must be analysed very carefully.

While I agree that my list doesn't always have useful theorems, reconstructing them correctly should be easy. Unfortunately my tool isn't developed enough to do that automatically, but if anyone is interested, I could try supplying each line with a mmj2 proof sheet, so reconstructing theorems would boil down to copying the proof sheet to mmj2, pressing Ctrl-U and renaming variables to your taste.

benjub commented 2 years ago

Thanks @avekens for all this work. Feel free to announce here the theorems you will work on. As for me, and unless you indicate that you are already planning on doing them, I'm going to look at lines: 1, 8, 9, 10, 11, 15, 22. (Note: line 5 has a big score but uses a theorem in a mathbox, so only shortens mathbox theorems; I may provide it, not sure yet)

avekens commented 2 years ago

@benjub : I will look at 63, 69, 90, 91 next, as you proposed.

glacode commented 2 years ago

If I recall correctly, it sorts the labels by number of references, then uses a knapsack algorithm to pack the labels nicely inside the allowed width.

this is a useful piece of information, I'm going to change my algorithm to produce the compressed proof.

I've noticed (see example below) that not only mmj2 and metamath.exe produce different proofs, but mmj2 compressed proof produced in the GUI is different from the one produced in the .mmt file (that's the one I usually insert in set.mm)

Here's a simple example:

  1. metamath output

MM> show proof cbviota /compressed Proof of "cbviota": ---------Clip out the proof below this line to put it in the source file: ( vw vz weq wb wal cab cuni cio wsb nfv nfbi equequ1 bibi12d sbequ12 nfsb nfs1v cbval sbequ sbie syl6bb bitri abbii unieqi dfiota2 3eqtr4i ) ACHJZK ZCLZHMZNBDHJZKZDLZHMZNACOBDOUPUTUOUSHUOACIPZIHJZKZILUSUNVCCIUNIQVAVBCACIU CVBCQRCIJAVAUMVBACIUACIHSTUDVCURIDVAVBDACIDFUBVBDQRURIQIDJZVABVBUQVDVAACD PBAIDCUEABCDGEUFUGIDHSTUDUHUIUJACHUKBDHUKUL $. ---------The proof of "cbviota" (336 bytes) ends above this line.

  1. the mmj2 proof in the GUI
glacode commented 2 years ago

(sorry, it was unintentionally submitted)

If I recall correctly, it sorts the labels by number of references, then uses a knapsack algorithm to pack the labels nicely inside the allowed width.

this is a useful piece of information, I'm going to change my algorithm to produce the compressed proof.

I've noticed (see example below) that not only mmj2 and metamath.exe produce different proofs, but mmj2 compressed proof produced in the GUI is different from the one produced in the .mmt file (that's the one I usually insert in set.mm)

Here's a simple example:

  1. metamath output MM> show proof cbviota /compressed Proof of "cbviota": ---------Clip out the proof below this line to put it in the source file: ( vw vz weq wb wal cab cuni cio wsb nfv nfbi equequ1 bibi12d sbequ12 nfsb nfs1v cbval sbequ sbie syl6bb bitri abbii unieqi dfiota2 3eqtr4i ) ACHJZK ZCLZHMZNBDHJZKZDLZHMZNACOBDOUPUTUOUSHUOACIPZIHJZKZILUSUNVCCIUNIQVAVBCACIU CVBCQRCIJAVAUMVBACIUACIHSTUDVCURIDVAVBDACIDFUBVBDQRURIQIDJZVABVBUQVDVAACD PBAIDCUEABCDGEUFUGIDHSTUDUHUIUJACHUKBDHUKUL $. ---------The proof of "cbviota" (336 bytes) ends above this line.

  2. the mmj2 proof in the GUI $= ( vw vz cv wceq wb wal cab cuni cio wsb nfv nfbi equequ1 syl6bb nfs1v sbequ12 bibi12d cbval nfsb sbequ sbie bitri abbii dfiota2 unieqi 3eqtr4i ) ACJZHJZKZLZCMZHNZOBDJZUOKZLZDMZHNZOACPBDPUSVDU RVCHURACIQZIJZUOKZLZIMVCUQVHCIUQIRVEVGCACIUBVGCRSUNVFKAVEUPVGAC IUCCIHTUDUEVHVBIDVEVGDACIDFUFVGDRSVBIRVFUTKZVEBVGVAVIVEACDQBAID CUGABCDGEUHUAIDHTUDUEUIUJULACHUKBDHUKUM $. $)

  3. the proof in the .mmt file

    ${ $d ph w z $. $d ps w z $. $d w x z $. $d w y z $. cbviota.1 $e |- ( x = y -> ( ph <-> ps ) ) $. cbviota.2 $e |- F/ y ph $. cbviota.3 $e |- F/ x ps $. $( Change bound variables in a description binder. (Contributed by Andrew Salmon, 1-Aug-2011.) $) cbviota $p
    |- ( iota x ph ) = ( iota y ps ) $= ( vw vz cv wceq wb wal cab cuni cio wsb nfv nfbi equequ1 nfs1v sbequ12 nfsb bibi12d cbval sbequ sbie syl6bb bitri abbii unieqi dfiota2 3eqtr4i ) ACJZHJZKZLZCMZHNZOBDJZUOKZLZDMZHNZOACPBDPUSVDURVCHURACIQZIJZUOKZLZIMVCUQV HCIUQIRVEVGCACIUAVGCRSUNVFKAVEUPVGACIUBCIHTUDUEVHVBIDVEVGDACIDFUCVGDRSVBI RVFUTKZVEBVGVAVIVEACDQBAIDCUFABCDGEUGUHIDHTUDUEUIUJUKACHULBDHULUM $.

    $}

savask commented 2 years ago

Hopefully this file will make creating theorems from the list more convenient:

https://gist.github.com/savask/7be35c8c019c1750e25d1adbf252f8d9

If you want to work on line N, find the respective snippet in the file above and do as follows:

  1. Copy the "MMP FILE" part into mmj2 and press Ctrl-U;
  2. Set all work variables to your taste;
  3. Rename steps without theorem labels (hypotheses) into h1, h2, ..., and rename the last step to qed;
  4. Add the theorem to set.mm and copy the proof from mmj2, and don't forget to remove test.1, test.2, ... from the compressed proof.
benjub commented 2 years ago

Thanks @savask, this might come in handy. For the moment, I'm doing the smaller ones, so I guess the statement and I am using only metamath.exe. Never used the MMP thing but I may give it a try.

avekens commented 2 years ago

@benjub when do you plan to finish/pause your minimization activities? I would like to continue this evening/tomorrow, based on the current metamath:develop-branch. To avoid merge conflicts, we should not minimize a lot of proofs in parallel. The currently open PR #2772 should not be harmful.

avekens commented 2 years ago
* 55 TODO AV

@savask : I will not continue with this item. If at all, GS should care about this (only his mathbox is affected).

benjub commented 2 years ago

@avekens : I'm done for now (latest is the currently open #2776 / Line 22). After it's merged, go ahead, I'll give a warning here before resuming. I edited the original post above regarding Line 55; feel free to edit it if it's not a correct description. (Indeed #2772 should not create merge conflicts.)

edit: and feel free to "claim" the unattributed TODOs in the list, or even other lines, by editing the list.

avekens commented 2 years ago

I reconstructed No. 69 ["oveq2","oveq1d","fveq2d","oveq1d","oveq12d","cbvmptv"]:

    $d .+ x y  $.  $d .X. x y $.  $d .o. x y $.  $d .x. x y $.  $d A y $.
    $d B x $.  $d C x y $.  $d D x y $.  $d E x y $.  $d F x y $.  $d X x y $.
    cbvmptfv4oveq3v.1 $e |- ( x = y -> A = B ) $.
    cbvmptfv4oveq3v $p |- ( x e. X 
                      |-> ( A .o. ( ( F ` ( ( C .x. x ) .X. D ) ) .+ E ) ) )
          = ( y e.  X |-> ( B .o. ( ( F ` ( ( C .x. y ) .X. D ) ) .+ E ) ) ) $=
      ( cv co cfv wceq oveq2 oveq1d fveq2d oveq12d cbvmptv ) A
    BLCEAOZHPZFIPZKQZJGPZMPDEBOZHPZFIPZKQZJGPZMPUDUIRZCDUHUMMNUNUGULJ
    GUNUFUKKUNUEUJFIUDUIEHSTUATUBUC $.

No proof of any theorem (except maybe fourier*) can be shortened by this theorem. It seems to be a case like No. 55, so I will not put it to set.mm. Unfortunately, I cannot edit the TODO-List at the top (missing permissions?)

benjub commented 2 years ago

Looking at Step 40 in http://us2.metamath.org/mpeuni/fourierdlem108.html and Step 39 in http://us2.metamath.org/mpeuni/fourierdlem110.html, the hypothesis is in both cases an instance of ~id. But this would probably not yield very interesting things. Shoud I write "discarded as too specialized" ?

As for the permissions: the permission to edit probably goes with that of merging PRs. If you feel like it, you may ask Mario or David to add you (I think they are the two persons that can do that).

savask commented 2 years ago

@benjub I suggest to remove all TODOs and "score lowered by line ..." from the list. If someone wants to work on a range of theorems, they can show their intent in a comment (it seems that only you and @avekens are working on it right now, so there shouldn't be much misunderstanding). Turning my original post into a collaborative blackboard sounds like a bad idea.

As for the score, it was computed on an older version of set.mm so the addition of new theorems, other minimizations etc might have also pushed the score in either direction; recording all the possible reverberations seems impossible. For example, if some theorem (say, line 1) appears in another (say, line 11), it does not necessarily mean that the score of line 11 went down, it simply means that line 11 now has a shorter proof in terms of line 1, but the statement itself might appear as often in set.mm as before.

@avekens

No proof of any theorem (except maybe fourier*) can be shortened by this theorem. It seems to be a case like No. 55, so I will not put it to set.mm.

I agree that this particular instance doesn't seem very interesting, but I wanted to mention that not all fourier* theorems are super large and actually minimize_with can work with many (if not most) of them in reasonable time.

benjub commented 2 years ago

@savask : I find the TODOs a good way to centralize the information of what is left to do and by whom, instead of having it spread in some comments. My idea was that after we finish this batch (6 TODOs left), you re-run your program to have an updated list with updated scores. That said, it's your post, so you get the final word on the edit.

avekens commented 2 years ago

I reconstructed No. 90 ["fveq2","oveq1d","fveq2d","breq1d","imbi12d"]:

    imbrov2fvoveq.1 $e |- ( X = Y -> ( ph <-> ps ) ) $.
    imbrov2fvoveq $p |- ( X = Y -> ( ( ph -> ( F ` ( ( G ` X ) .x. O ) ) R A )
                           <-> ( ps -> ( F ` ( ( G ` Y ) .x. O ) ) R A ) ) ) $=
    ( wceq cfv co wbr fveq2 oveq1d fveq2d breq1d imbi12d ) I
JLZABIGMZHENZFMZCDOJGMZHENZFMZCDOKUAUDUGCDUAUCUFFUAUBUEHEIJGPQRST
    $.

but the proof can be shortened using fvoveq1d:

      ( wceq cfv co wbr fveq2 fvoveq1d breq1d imbi12d ) IJLZABIGMZHENFMZCDOJGMZ
      HENFMZCDOKTUBUDCDTUAUCHFEIJGPQRS $.

Using this theorem, 17 proofs can be shortened, saving 806 bytes. I will provide this theorem and the shortenings with my next PR.

savask commented 2 years ago

@benjub

I find the TODOs a good way to centralize the information of what is left to do and by whom, instead of having it spread in some comments.

I'm okay with leaving TODOs, but we should at least put a name next to all TODOs (does an empty TODO mean it's you?).

My idea was that after we finish this batch (6 TODOs left), you re-run your program to have an updated list with updated scores.

I suggest we don't rerun the tool and work with this list as is. As David mentioned in the google groups, this list should be used as a list of theorem ideas, and not as a goal by itself. If we rerun the tool, some lines will change places, some will change proofs (without changing the end statement) and so on, which will complicate keeping progress.

@avekens

I reconstructed No. 90 ["fveq2","oveq1d","fveq2d","breq1d","imbi12d"]

Did you try the gist with mmj2 proof files from my earlier comment, or you did the reconstruction by hand? Either way is good, but I'm interested in "user experience" :-)

avekens commented 2 years ago

@avekens

I reconstructed No. 90 ["fveq2","oveq1d","fveq2d","breq1d","imbi12d"]

Did you try the gist with mmj2 proof files from my earlier comment, or you did the reconstruction by hand? Either way is good, but I'm interested in "user experience" :-)

I did it by myself (it's fun!) and compared the result with your proposal: they were identical.

benjub commented 2 years ago

As for the unattributed TODOs (Lines 27, 47, 87): I spotted them as interesting line, but @savask is right, they should be attributed! @avekens: can you tell me the ones you can do ? I'll do the rest (no problem if you answer "all three" or "none", just let me know). I am not doing any for now to let you do 69, 90, 91. @savask : I was thinking that once these TODOs are done, we can consider it a first milestone and update the list and the scores at that point. Admittedly, this is a subjective point. It can be at a different moment, or something else. I added a final line to the original post a few days ago; feel free to edit it or change it altogether if you see things differently.

david-a-wheeler commented 2 years ago

Hi, I just wanted to comment that this really awesome. Thanks everyone!!

avekens commented 2 years ago

I am just about to look at no. 91 (does not seem very promising). Afterwards, I will look at lines 27, 47, 87 (I started with no. 47 already - this seems to be interesting).

As I wrote before, no. 69 will not be investigated further (too specialized), and no. 90 is already done.

savask commented 2 years ago

@benjub

I was thinking that once these TODOs are done, we can consider it a first milestone and update the list and the scores at that point.

I understand the idea, but why? I don't get what's the point in regenerating the list. Some lines will disappear (since respective theorems were added to set.mm), some will get lower score. As a result, we'll get several new suggestions at the bottom of the list, but those will probably be discarded anyway.

I added a final line to the original post a few days ago; feel free to edit it or change it altogether if you see things differently.

I will remove it together with "score lowered by" lines.

I suggest we look through the list and add theorems which are interesting from the mathematical standpoint (ignoring weird lemmata which are applicable only to fourierdlem* and so on). After every theorem has been looked at, I will regenerate the list but will show suggestions only with "high" score, for example, above 100.

By the way, maybe there are some ideas on how to improve the score function (say, should we use the number of occurrences instead)? I'm open to suggestions from everyone.

avekens commented 2 years ago

I reconstructed No. 91 ["oveq1","eleq1d","rexbidv","cbvrabv","uneq2i"]:

  ${
    $d B x y $.  $d C x y $.  $d D x y $.  $d X x y $.  $d .o. x y $.  $d x y z $.
    rexovelcbvrabv $p |- { x e. B | E. z e. C ( x .o. X ) e. D }
                       = { y e. B | E. z e. C ( y .o. X ) e. D } $=
      ( cv co wcel wrex weq oveq1 eleq1d rexbidv cbvrabv ) AIZGHJZFKZCELBIZGHJZ
      FKZCELABDABMZTUCCEUDSUBFRUAGHNOPQ $.

    unrexovelcbvrabv  $p |- ( A u. { x e. B | E. z e. C ( x .o. X ) e. D } )
                          = ( A u. { y e. B | E. z e. C ( y .o. X ) e. D } ) $=
      ( cv co wcel wrex crab rexovelcbvrabv uneq2i ) AJHIKGLCFMAENBJHIKGLCFMBEN
      DABCEFGHIOP $.
  $}

unrexovelcbvrabv corresponds to no. 91, but I hoped that the version without the union may be more appropriate. I checked all proofs using oveq1 (except the proofs of the fourier theorems) if they can be minimized using rexovelcbvrabv,unrexovelcbvrabv: there were none.

Then I checked all proofs using uneq2 (including the proofs of the fourier theorems!) if they can be minimized using unrexovelcbvrabv: The following 10 theorems could be minimized: fourierdlem89, fourierdlem91, fourierdlem96, fourierdlem97, fourierdlem98, fourierdlem99, fourierdlem108, fourierdlem109, fourierdlem110, fourierdlem112

Since only fourier* theorems are affected, I think both theorems are too specialized and should not be added to main set.mm. (@glacode feel free to add them to your mathbox).

avekens commented 2 years ago

By the way, maybe there are some ideas on how to improve the score function (say, should we use the number of occurrences instead)? I'm open to suggestions from everyone.

Can theorems in mathboxes be excluded? We should concentrate on improving the main body of set.mm. For theorems in mathboxes, their owners are responsible. To add the number of occurences would be great, but then also the theorems in which the patterns occur can/should be provided.

benjub commented 2 years ago

@savask : yes some lines would disappear from the updated list, so this would remove some clutter and help concentrate on the ones that remain. Also, if I remember your initial post, you took the top 20 of lemmas of each length, and then merged the these top 20's. Maybe you could just make one global top 100, for instance.

You write "after every theorem has been looked at, I will regenerate...". As for me, every theorem has been looked at (these were the TODOs in the above list). Of course, there are probably more theorems that I missed since I gave only a cursory look at theorems with higher scores and at most two hypotheses, but as for me, I'm done, at least for the first pass.

Score function: I thought a bit about it but haven't found anything interesting. I think keeping it simple and using your score function (length-1)*(numbers of thms minimized) is good.

savask commented 2 years ago

@avekens

Can theorems in mathboxes be excluded? We should concentrate on improving the main body of set.mm. For theorems in mathboxes, their owners are responsible.

The idea is to identify "interesting" theorems, so I think the program should look in mathboxes. I suppose your concern is omitting lemmata useful only in restricted settings (like minimizing fourierdlem*), but these should have low score anyway.

To add the number of occurences would be great, but then also the theorems in which the patterns occur can/should be provided.

I can add a list of occurrences to each lemma.

@benjub

As for me, every theorem has been looked at (these were the TODOs in the above list).

There are some theorems early in the list which don't have any comment in the first post (for example, line 2). Did I miss something?

Anyway, I'm working on the new batch of lemmata, and I decided to improve the score function. Now it tries to estimate the number of bytes saved when minimizing the proof. The exact algorithm is a bit messy since my program doesn't perform actual minimization, but it takes into account several parameters:

Unfortunately this is far from being an exact estimate on the number of saved bytes, for example elicc01 (no. 47) added by Alexander in #2782 saves 775 bytes, while my algorithm estimates it as only 447.

The list generated with this heuristics contains new lemmata not present in the old list. For example, this has the score of 1120:

${
  test.1 $e  |- ( ph -> A = B ) $.
  test $p |- ( ph -> ( A = C <-> B = C ) ) $=
    ( fvex2 eqeq1d ) ABCDEF $.
$}

There are many similar lemmata in the list which are just deduction versions of already existing theorems. Here are the top 10 suggestions:

3013    5   ["anbi12d","rspcev","syl12anc"]
2200    3   ["rspcev","syl2anc"]
1741    2   ["anbi12d","rspcev"]
1736    3   ["ralbidv","rspcev","syl2anc"]
1681    1   ["ssrab2","eqsstri"]
1651    1   ["ralbidv","rspcev"]
1623    4   ["rspcev","syl12anc"]
1545    3   ["eqeq2d","rspcev","syl2anc"]
1472    9   ["latjle12","syl13anc","mpbi2and"]
1463    2   ["eldifsn","sylanbrc"]

What do you think about this? The full list of theorems with score at least 400 is not small (272 lines), so perhaps it is better to include in the list only those theorems which have at most 2 hypotheses?

benjub commented 2 years ago

Mathboxes: I agree with @savask for not excluding mathboxes.

Score function: it's nice that you have a more accurate estimate. Since the score is only an indicator and comes only second to human decision, having a very precise estimate beyond what you propose is not that important (at least in the present case), so I think your method is good enough. The new cases it gives are interesting, and deduction versions of existing theorems with high scores should definitely be added. As for hypotheses: maybe you can put a "penalty" on hypotheses as follows: if the score is $z$ and there are $e$ hypotheses, then maybe display (and rank according to) $z - 10 e^2$ or something like that (tune the coefficient, but I think something quadratic or cubic in $e$). This is of course rather arbitrary since not all hypotheses are "equal" (e.g. definitional hypotheses).

When to generate a new list: you didn't miss anything: when I was saying "as for me, this is done", of course it's only my cursory skimming through the list, so there could be some lines that I missed. So I think there is a not too distant point when a new list with updated scores should be regenerated, but precisely when is up to you.

avekens commented 2 years ago

@savask @benjub Although my TODOs (as indicated above) will be done with PR #2782, I want to have a look at some additional lines, at least at 28, 77 (they really look promising), and maybe 30, 39, 54, 62. Finally, I want to minimize proofs using ~2fveq3 as indicated in PR #2768. Therefore, I want to ask you to delay any activities (additional minimizations, creation of a new list, etc.) until I finished my tasks.

benjub commented 2 years ago

@savask @benjub Although my TODOs (as indicated above) will be done with PR #2782, I want to have a look at some additional lines, at least at 28, 77 (they really look promising), and maybe 30, 39, 54, 62. Finally, I want to minimize proofs using ~2fveq3 as indicated in PR #2768. Therefore, I want to ask you to delay any activities (additional minimizations, creation of a new list, etc.) until I finished my tasks.

Sure, go for it (and thanks!). I'm updating the list above with TODOs for 28 and 77 (and when you confirm the other ones, I'll add them too).

avekens commented 2 years ago

As for the permissions: the permission to edit probably goes with that of merging PRs. If you feel like it, you may ask Mario or David to add you (I think they are the two persons that can do that).

@digama0 @david-a-wheeler could one of you give me the required permissions, as Benoît proposed, please? I do not exactly know how this works, but the same permissions as for Benoît and Wolf seem to be appropriate. Thanks in advance.

Alexander

benjub commented 2 years ago

Apparently I missed Line 40: one hypothesis and score 98:

$d  xyA  xB  xC  xps  xch
$e |- ( x = A -> ( ph <-> ps ) )
$p |- (  ( A e. B /\ A. y e. C ( ps -> ch ) ) -> E. x e. B A. y e. C ( ph -> ch ) )

I'll try to minimize with it and if good enough, add it.

avekens commented 2 years ago

With PR #2787, I am done with my minimization tasks. I would propose to close this issue after @benjub has finished his work on line 40, and open a new issue with a fresh list (and new tasks...).

benjub commented 2 years ago

@avekens: ok, I'll do Line 40 by tomorrow.

benjub commented 2 years ago

I'll do Line 40 by tomorrow.

Done in #2794. Since @avekens and I are done, and there has not been a lot of additions by others, I think we are more or less done for now ? I would like that once #2794 is merged, @savask regenerate a list using the same scoring function (in this same issue), because I would like to see how the scores of some lines were lowered by the new additions (e.g., are Lines 11 and 25 still worth considering after the score lowering?). The only changes to the list generation method could be to take the global top-120 instead of the top-20 of each length, and maybe to cap the length at 8 or 9 instead of 7 (mostly to check that no such long proof skeletons appear)?

After that, a new issue with @savask's new method could be opened.

savask commented 2 years ago

@benjub As I understand, now is the time to regenerate the list with the old scoring function. Here's the full list of all lemmata suggestions with score at least 20, length is capped at 9:

https://gist.github.com/savask/1ca6247373093efbed6955489832a8c6

The list is long, and top 120 suggestions have skeleton lengths at most 3. I think the lesson here is that it's more interesting to sample different lengths rather than build a global list.

benjub commented 2 years ago

Thanks @savask, it's very nice to have this long list. Even if most will be discarded in the end, it gives us an idea of the landscape, and why we should prefer this or that sampling method in the future.

There are three lines with score above 200 and at most one essential hypothesis:

212 0 ["fveq2","eqeq1d"] (edit: added as fveqeq2(d) in #2804 and #2806) 210 1 ["eqeq2d","rspcev"] (edit: added as rspceeqv in #2811) 205 0 ["ssid","a1i"] (edit: added as ssidd in #2807)

which we should probably add (or at least try).

Probably not to be added, but these ones are interesting: the longest skeletons (for scores >3.7=21), with for some many essential hyps!

42 9 ["a2d","syl5","expcom","a2d","adantl","findcard2s","mpcom","mpi"] 35 8 ["nfres","nfop","nfcxfr","nffv","nfop","nfsn","nfun","nfcxfr"] 32 2 ["2cn","4cn","4ne0","divcan3i","eqtri","a1i","oveq1d","eqtrd","oveq1d"] 32 4 ["4cn","4ne0","divcan3i","eqtri","a1i","oveq1d","eqtrd","oveq1d","eqtrd"] 32 5 ["4ne0","divcan3i","eqtri","a1i","oveq1d","eqtrd","oveq1d","eqtrd","fveq2d"] 28 4 ["mptpreima","syl6eq","eleq1d","syl5ibrcom","expimpd","rexlimdvva","syl5bi","ralrimiv"] 28 8 ["rexeqbidv","rexeqbidv","rexeqbidv","rspcev","syl2anc","ex","impbid","syl5bb"]

Probably the theorems they shorten are very similar theorems within families or within chains of lemmas...

benjub commented 2 years ago

Side note: The comparison of the two lists shows that there are some shortenings with consequences we didn't foresee. For instance, here are two lines (one from each file: line 17 in the first list and line 247 in the second list):

242 3 ["ralbidv","rspcev","syl2anc"] 78 3 ["ralbidv","rspcev","syl2anc"]

That is, the score was much lowered, but we didn't notice it first. Not sure what happened precisely, although both "rspcev","syl2anc" and "ralbidv","rspcev" appear on several lines (and we added to set.mm theorems corresponding to lines 27, 40, 87 of the first list).

benjub commented 2 years ago

@savask, @avekens : once #2811 is merged, I'm done (see two comments above, that I edited with links to the PRs). So as for me, I'm done with both lists for now. (I downloaded them just in case.) I let @savask close the issue when he sees fit (and thanks again!).

savask commented 2 years ago

I think everyone is a bit tired already of this list, especially since it turned out to contain way less interesting theorems which (at least I) anticipated. I will close this issue and perhaps will open a new one later, computed with a new strategy. Thanks to everyone for the effort.