metamath / metamath-knife

Metamath-knife can rapidly verify Metamath proofs, providing strong confidence that the proofs are correct.
Apache License 2.0
25 stars 10 forks source link

Proof compression might be improved #29

Closed tirix closed 2 years ago

tirix commented 2 years ago

The metamath-knife program now produces correct proofs, however it seems their size is a bit longer than compressed proofs generated by MMJ2 or the metamath program.

tirix commented 2 years ago

@digama0 asks about the knapsack algorithm:

Do the generated proofs have a ragged right side? MMJ2 is using a knapsack algorithm to optimize the packing of the labels and make them look more "neat", and IIRC metamath.exe copied that implementation.

metamath-knife does implement a knapsack algorithm. The proof generated for ~ ellimc fits nicely the right margin:

      ( co wcel cv wceq vy cfv cif cmpt cc wss syl3anc climc csn cun ccnp wf wa
      limcfval simpld eleq2d wi wb limcvallem mpteq2dv cab ifeq1 syl6eqr eleq1d
      elab3g syl bitrd ) AEFDUGPZQEBCDUHUIZBRZDSZTRZVIFUAZUBZUCZDHIUJPUAZQZTUTZ
      QZGVOQZAVGVQEAVGVQSZVGUDUEZACUDFUKZCUDUEZDUDQZVTWAULMNOTBCDFHIJKUMUFUNUOA
      VSEUDQUPZVRVSUQAWBWCWDWEMNOBCDEFGHIJKLURUFVPVSTEUDVKESZVNGVOWFVNBVHVJEVLU
      BZUCGWFBVHVMWGVJVKEVLVAUSLVBVCVDVEVF $.

However the proof generated for ~ numclwlk1lem2f1 does not:

      ( wcel cfv co cmin wceq wi wa c1 adantl cusg wbr c3 cuz w3a c2 cop cnbgra
      cxp wf vp cv vu weq wral numclwlk1lem2f wf1 cc0 csubstr numclwlk1lem2fv
      imp adantrr adantrl ovex fvex cword chash eqeq12d opth uzuzle23
      numclwwlkovgelim syl3an3 clsw clt wb simpll simprll ad2antrl eleq1 eqcoms
      cz cle eluz2 1red 3re a1i zre 3jca 1lt3 ltletr expd mpisyl 3adant1 sylbi
      syl6bi com12 3ad2ant3 ad3antlr 2swrd2eqwrdeq syl3anc cr impcom eqtr3
      expcom adantr biantrurd 3anan12 eqeq2 oveq1 fveq2d eqeq1d biimpd eqeq1
      sylan9eq ex eqtr4d opeq2d oveq2d lsw eqeq2d mpbird eqeqan12d anbi12d
      3bitr2d exbiri syl2and syl5bi sylbid ralrimivva dff13 sylanbrc ) JFUEUFZK
      JPZIUGUHQZPZUIZKIHRZKIUJSRZGRJFUKKULRUMZDUNUOUPZDQZUQUPZDQZTZUOUQURZUAZUQ
      UUAUSUOUUAUSUUAUUCDVAABCDEFGHIJKLMNOUTYTUUJUOUQUUAUUAYTUUDUUAPZUUFUUAPZUB
      ZUBZUUHUUDVBUUBUKZVCRZIUCSRZUUDQZUKZUUFUUOVCRZUUQUUFQZUKZTZUUIUUNUUEUUSUU
      GUVBYTUUKUUEUUSTZUULYTUUKUVDABCUUDDEFGHIJKLMNOVDVEVFYTUULUUGUVBTZUUKYTUUL
      UVEABCUUFDEFGHIJKLMNOVDVEVGVLUVCUUPUUTTZUURUVATZUBZUUNUUIUUPUURUUTUVAUUDU
      UOVCVHUUQUUDVIVMYTUUMUVHUUIUAZYTUUKUUDJVJZPZUUDVKQZITZUBZVBUUDQZKTZUUBUUD
      QZUVOTZUBZUBZUULUUFUVJPZUUFVKQZITZUBZVBUUFQZKTZUUBUUFQZUWETZUBZUBZUVIYSYP
      YQIUJUHQPZUUKUVTUAIVNZABCUUDEFGHIJKLMNVOVPYSYPYQUWKUULUWJUAUWLABCUUFEFGHI
      JKLMNVOVPYTUVTUWJUBZUUIUVHYTUWMUBZUUIUVLUWBTZUUDVBUVLUJSRZUKZVCRZUUFUWQVC
      RZTZUWPUUDQZUWPUUFQZTZUUDVQQZUUFVQQZTZUIZUBZUXGUVHUWNUVKUWAUCUVLVRUFZUUIU
      XHVSUVTUVKYTUWJUVKUVMUVSVTWBUWMUWAYTUVTUWAUWCUWIWAUDUWMYTUXIUVMYTUXIUAUVK
      UVSUWJYTUVMUXIYSYPUVMUXIUAYQUVMYSUXIUVMYSUVLYRPZUXIYSUXJVSIUVLIUVLYRWCWDU
      XJUGWEPZUVLWEPZUGUVLWFUFZUIUXIUGUVLWGUXLUXMUXIUXKUXLUXMUXIUXLUCXEPZUGXEPZ
      UVLXEPZUIZUCUGVRUFZUXMUXIUAUXLUXNUXOUXPUXLWHUXOUXLWIWJUVLWKWLWMUXQUXRUXMU
      XIUCUGUVLWNWOWPVEWQWRWSWTXAWTXBXFUUFJUUDXCXDUWNUWOUXGUWMUWOYTUVTUWJUWOUVN
      UWJUWOUAZUVSUVMUXSUVKUWJUVMUWOUWDUVMUWOUAZUWIUWCUXTUWAUVMUWCUWOUVLUWBIXGX
      HUDXIWTUDXIVEUDXJUWNUXGUXCUWTUXFUBZUBZUYAUVHUXGUYBVSUWNUWTUXCUXFXKWJUWNUX
      CUYAUWMUXCYTUWMUXAKUXBUVTUXAKTZUWJUVNUVSUYCUVMUVSUYCUAUVKUVSUVMUYCUVPUVRU
      VMUYCUAZUVPUVRUVQKTZUYDUVOKUVQXLUVMUYEUYCUVMUYEUYCUVMUVQUXAKUVMUUBUWPUUDU
      UBUWPTIUVLIUVLUJSXMWDXNXOXPWTWSVEWTUDVEXIUVTUWJUXBKTZUVNUWJUYFUAZUVSUVMUY
      GUVKUVMUWJUYFUVMUWJUXBUWGKUVMUWPUUBUUFUVLIUJSXMZXNUWIUWGKTZUWDUWHUWFUYIUW
      HUWFUYIUWFUYIVSUWEUWGUWEUWGKXQWDXPXFUDXRXSUDXIVEXTUDXJUWNUWTUVFUXFUVGUWMU
      WTUVFVSZYTUVMUYJUVKUVSUWJUVMUWRUUPUWSUUTUVMUWQUUOUUDVCUVMUWPUUBVBUYHYAZYB
      UVMUWQUUOUUFVCUYKYBVLXBUDUWMUXFUVGVSYTUVTUWJUXDUURUXEUVAUVNUXDUURTUVSUVKU
      VMUXDUVLUCSRZUUDQUURUUDUVJYCUVMUYLUUQUUDUVLIUCSXMXNXRXIUWDUXEUVATZUWIUWDU
      YMUXEUWBUCSRZUUFQZTZUWAUYPUWCUUFUVJYCXIUWCUYMUYPVSUWAUWCUVAUYOUXEUWCUUQUY
      NUUFUUQUYNTIUWBIUWBUCSXMWDXNYDUDYEXIYFUDYGYHYHYIYJVEYKYLYMUOUQUUAUUCDYNYO
      $.

It seems that metamath-knife only runs the knapsack algorithm on a subset of the labels. First on the first twenty, then on the next hundred. I believe this is a trade-off to avoid shuffling too much the statements, which are ordered by how often they are referenced later in the proof. This is important for the first twenty statements, which only need one letter. That may limit the algorithm.

Also, in this particular example, there are very long labels, like ~ 2swrd2eqwrdeq, which probably also make the knapsack's algorithm's work harder.

tirix commented 2 years ago

There must be something else beyond the knapsack algorithm, though. With the proof of ~ ellimc, for which the label list is correctly packed (but ordered differently), the encoded proof is 4 characters longer than the original proof in set.mm.

In my experiments, I let metamath-knife read and parse the original proof, but it then rebuilds the compressed proof from scratch.

digama0 commented 2 years ago

I believe it would be a good idea to copy the algorithm from mmj2/metamath, mostly for byte-reproducibility rather than efficiency per se.

IIRC the sorting is limited to within groups 1-20, 21-120, etc as you say, since we don't want to make the compressed text longer either. That doesn't explain the result of numclwlk1lem2f1 though, because there are 39 words before the big word numclwwlkovgelim that causes the gap, and you can clearly do better by shuffling some of the other words around it (single big words aren't usually a problem, you usually only get bad results when there aren't enough small words). In fact, from these examples I would be more likely to guess that it is a simple greedy algorithm, and ellimc just got lucky. Big words causing a ragged right is a characteristic of the greedy algorithm.