metamath / set.mm

Metamath source file for logic and set theory
Other
251 stars 88 forks source link

Feedback from a new user, improvement suggestions. #3990

Open metakunt opened 5 months ago

metakunt commented 5 months ago

I don't have a google account and I don't know if it is possible to use the mailing list without one. If it is, please let me know the steps so that I can post it on the mailing list. Otherwise I would be happy if one of you could open a issue on the metamath mailing list on my behalf. I hope it is as simple as writing an E-mail to metamath@googlegroups.com.

Hey guys, I have been working for a few weeks on the project and as a new user I wish to share some feedback I have. I want to start this discussion to overall improve the state of metamath and make sure that it is easy and straightforward to use.

Many of you are either maintainers and/or are on the project for a long time. Also I will only be listing things that can be improved from my perspective, things that are positive/working are not listed. Those things include the helpful community and the fact that metamath is pretty fun to work on.

Here are the things that I had issues with and that I wish to improve in some way, listed in order of severity (most severe issues are listed first.)

-Lack of theorems in deduction form. Oh my goodness, I can't explain how much that one annoyed me. Once I have learned how deduction proofs work I fell in love with, it is simple, fast and has a great developer experience. The reason is once you have a theorem in deduction form the unifier will do most work for you, completing substeps automatically which I'd have to painstakingly find myself. It is just great and speeds up the development by one or two orders of magnitude per step. This adds up, my generous estimate would be that using theorems in deduction form is at least twice as fast as theorems in closed form, just by the virtue of autocompleting. But this is not all. The best thing about deduction proof (similar to an inference proof) is that it absorbs and scopes everything in context. That means that antecedents are all small, with some exceptions being conjunctions that eliminate themself eventually (classical case elimination). But this issue maps to closed form proofs so it is irrelevant. Point being it is easier for me to recreate a theorem in deduction form to use it, as I did in my mathbox, because it will save me time after just a few usages. Converting a theorem into deduction form with 5 or more conjuncts will amortize after 1 or 2 usages. Some lemmas which have lots of conditions are gcd lemmas, deduction forms are in my mathbox. I'd like to move them to main as they are way easier to use.

I have multiple suggestions here, my most preferred would be to rewrite simple results on complex numbers from closed form to deduction form and deprecate those closed form results and eventually delete them. (Section 5.2 and Section 5.3) I have the opinion that deduction form is superior to closed form in almost every other way and the benefits for moving in that direction would make proof development much faster. Since this is something I might be wrong with I would like to know if there are any downsides that would make this a bad idea.

I have checked randomly some theorems that have both a deduction and a closed form and I have found out that many theorems are referencing the deduction and not the closed form. If you consequently use deduction form you will get mmj2 autocomplete which in some cases autocompletes 20 steps. I really don't want to prove results that autocomplete can prove for me, this is a waste of time at best.

Here is one example https://us.metamath.org/mpeuni/add12.html 3 usages, 2 of which are proving inference and deduction version https://us.metamath.org/mpeuni/add12d.html 13 usages.

https://us.metamath.org/mpeuni/addcomd.html referenced theorems are 15 lines in my browser. Proof is done exclusively with theorems in deduction form while it's closed form only has 7 lines of usages

ltle 11 lines vs 21 lines for ltled addsub <1 line vs 5 lines for addsubd

The trend continues, there were even some theorems where the deduction form hat 100+ usages and the closed form not even 10.

I would also for consistency sake reformulate inference form theorems as deduction form. The decbld* theorems use those and if you rewrite those from base up you don't get the downside of always adding a1i to proofs. And should you ever need the form without the antecedent you can easily use mptru. Proof length should be around equal if it done consequently and the benefit is that future proofs in deduction forms don't need to use a1i on any hypotheses that are for future use.

Here are my suggestion as a bullet list:

Eventually: Rename deduction form without suffix, once inference and closed form are gone, see https://us.metamath.org/mpeuni/isumshft.html it is in deduction form without closed or inference form existing. My goal is to get there so that we have consistent theorems only in deduction form. My goal isn't to get rid of all inference and closed form theorems, just for some sections which I think benefit it the most.

The same I would also do for inference lemmas like 2+2=4 would become ph -> 2+2=4, so that future lemmas could directly use it if they ever need the fact that 2+2=4. Introduce new naming convention for those theorems in the section that would have no d suffix. Since there are only deduction form theorems it would not be necessary to suffix it anymore for a particular section.

I would like to research if a mass rewrite could be done automatically, I assume some of you have some experience in the necessary tools, in doubt I could do it by hand for each theorem. It shouldn't take that long if you work theorem for theorem (maybe several hundreds of hours)

But that is only if there is some consensus that it is wished. Maybe noone had this proposal but many implicitly wished for the same things as I did. Maybe you have a different workflow that does not have the same issues that I do. I would like to hear about it and have a discussion and bundle it in a RFC of some sorts where we can share opinions.

Another point is that useful theorems are either hidden in mathboxes or just nonexistent. The fact that lemmas are in its most general form without an ergonomic form existing makes several lemmas way harder to prove than it should be. An example is the fundamental theorem of calculus. It is nice that a general theorem exists but this leaks topological results, and it shouldn't. Also it is necessary that you prove that a function is real-differentiable, but either I'm completely stupid and missed how to use real derivative builders (which would be my fault), or that some ergonomic lemmas are missing. I've moved one theorem from Glauco's @glacode (I hope that's him) mathbox which was a godsend, and it still was way too difficult to prove. It should have easy to satisfy hypotheses where you only really need to check that the derivative matches and to eventually calculate the values at the boundaries. The fact that topology and (real/complex) differentiability leaks is not very helpful as everyone who wants to use the fundamental theorem to calculate an integral will have to reprove those statements that should be encapsulated in an easier to use theorem.

It is not understandable to me that there isn't a easy to use version of a theorem of such a fundamental fact that anyone knows and that a simpler version could exist. I love the fact that theorems are proven in its most general form. But a simpler use would be very much appreciated. Even better would be if those would be useful for most of us and not restricted to mathboxes. I get that there are theorems in mathboxes for experimental content. But some things really are so useful that they could be moved directly to main so that other people can use it.

As far as I can tell. @digama0 you have done some converting of theorems of closed to deduction form. I would like to gain that insight how you did it. An example is https://us.metamath.org/mpeuni/isumshft.html this theorem was written in 2007 and you have revised it to use deduction form. I would be interested in continuing your work.

Also I had some issues with tooling, I couldn't get yamma to run and mmj2 was extremely hard to set up (issue with java version and deceiving runtime params), also mmj2 crashes and gives some unexpected NPE when I wanted to search something. I could help to improve the tooling, that is fine. This is also a minor point.

Overall I am quite happy to work on the project. Metamath is fun, it would be even better if some/all of my suggestions would be implemented. I think the database doubled in the last few years, and I have added 1400+ lines which total just a bit over 75KB to my mathbox, which will be much more. I want the project to be scalable with simpler use. Because there are a few issues that new contributors will face as of now. Ancient/arcane tooling, documentationless proofs, difficulty understanding how to formulate proofs. Those improvements are all aimed for mathematicians who all know the theorems but don't know how to formulate them in a formal language. I am also especially thankful to @tirix , you took a lot of time to explain to me some basic concepts. Without that help I could't move forward.

I would love to hear your feedback and I hope that my future contributions will be helpful for future contributors, as I want this project to grow. This is partially motivated by @GinoGiotto ax-13 deprecation, my goal is to have consistent deduction form throughout the whole database.

wlammen commented 5 months ago

One incentive of this database was having short proofs. So if a closed, or immediate form of a theorem has 7 or more applications, then it pays for having it around as a convenience. The number was disclosed to me by Norman Megill, the founder of this project. He would accept a proposal by me, if I could point out at least that many applications. Deduction is not always superior. See for example https://us.metamath.org/mpeuni/syl.html vs https://us.metamath.org/mpeuni/syld.html. Much of your critics is directed towards automation in proof design. There are more, like creating modules for better layering and separation of topics, that were suggested, or even discussed, in the past. We see little to no progress here, I think mostly due to lack of time. One can observe these phenomena in a project like metamath-knife, that still does not replace metamath.exe to full scale.

wlammen commented 5 months ago

I posted it here https://groups.google.com/g/metamath/c/ZndtxZev6C8

metakunt commented 5 months ago

Yeah I've realised that we have built our tooling around a older architechture which doesn't fit modern needs anymore. To your point about syl and its deduction form. I see that syl is needed that much because some of it is used in deduction proofs which have to use closed form proofs as there is no deduction form, one example being faccld. Also it is such a important and simple theorem. I would never seriously suggest removing syl. There is a reason that I explicitly specified (Section 5.2 and Section 5.3) and not the whole database. I just want easier use for theorems which currently are all written in closed form. There are lots of sections that could benefit being in deduction form, mainly section 5 and 6, primarily by adding those deduction versions. I would be more than happy adding deduction versions to those theorems.

We should move to a future where we aren't bound by restrictions of 30+ years of technical debt. My vision is that metamath could have 10+ million theorems, this would be 10+ GB of size. I will propose a system where we can split the database into multiple pieces which corresponds to the current system. However we would likely need to split proofs, theorems, metadata, $v,c,f statements into separate files. We would also need to break up interdependencies of the metamath website and all its subtooling so that the new system can grow. If I had only my mathbox I would have 1400 lines. This is easily scrollable, editable and gittable.

Getting all tooling up to date might be a challenge, but we shouldn't be discouraged by this. I very much regret that historic reasons are holding us so back. Imagine if every codebase was one file only. That would be absolutely horrific. It is convenient to start a project with that, but we are way past that point where we seriously should move forward. And I am not advocating for 1 theorem equals 1 file. That has different problems.

I am interested in offering my time to build those solutions, provided they improve future use. Otherwise I see no point wasting hundreds of hours working on something that wouldn't be merged. But before that there needs to be consensus that this is wished.

I know this is an open source project, it is a passion project for me as well. Right now I am just frustrated how clunky its usage as of now is. I am contributing in spite of it being so awkward.

There are mainly 3 points I am looking forward to work on.

  1. Improve modularity of set.mm and split it in smaller files. Introduce tooling that can work on those individual files. A file could have several imports. Proof verification is done on a file by file basis. It assumes that all previously imported files are proven. Split metadata into different files.
  2. Improve documentation for people who want to start contributing. As a new user I know where the pain points are. It is crazy that there are no good tutorials for non-trivial proof that map mathematical into formal knowledge.
  3. Add deduction form theorems where they are missing. Convert closed form theorems with few or no usages to deduction form theorems.
tirix commented 5 months ago

It should be possible to register to the mailing group without a Google mail account. You can try editing and following this link:

http://groups.google.com/group/metamath/boxsubscribe?email=your_email_here

(replace your_email_here with your actual email address!)

It would be nice if we had a mini-form somewhere on the website with an input for the email, generating this link for subscribing.

tirix commented 5 months ago

Wow, that's a lot of feedback! :)

There are several different items there and it's usually nice to have one issue for one item. Otherwise for "discussion" there is the GitHub "Discussions" feature @david-a-wheeler or @digama0 could consider opening it for the repository, or we just continue using the mailing list (see my post above for subscribing non-Gmail addresses)

The most general answer is that this is a community project and it works with input from the community, so the best way to get some improvement is usually to take from your own time and efforts and push a bit in the direction you want to go (by improving the tools, the database, the documentation, etc.)

Add deduction form variant to many missing lemmas in sections for complex numbers.

More specifically about deduction form of theorems, I think many of us see their advantages, especially for "advanced" topics. There are many deduction form theorem in Glauco's mathbox, and many are also being moved to the main part of set.mm. So "more deduction form theorems" is globally the direction we've been heading to and we can certainly continue in that direction. You can contribute by writing those theorems (manually or automatically).

Deprecate closed form theorems in favour of deduction form theorem over a long period of time.

Ok, so forbidding closed forms is maybe a bit extreme, here a prior discussion is certainly necessary. It would be interesting to run some stats and see where is the "tipping point" from which on many theorems are in deduction form.

Personally I also like deduction form very much, but I've tried to follow a rule of thumb that for new elementary theorems with 3 or less antecedents I consider closed form, and anything beyond that is in deduction form.

I couldn't get yamma to run

Maybe you could post a yamma issue about it? I see glacode/yamma#12 but it seems to be OK and could actually be closed.

split set.mm in smaller files

A split is being discussed at least here already: #29 (starting back in 2016!)

You can already "easily" split set.mm with a metamath.exe command line à la write source set.mm /split, and some of the tools (but probably not all) have the ability to "include" other database parts. So the question is whether we want that to be the "official form" of the database. I'm increasingly getting the "rainbow unicorn" pages recently and actually storing separated files in GitHub might help GitHub to digest the database. Anyway I'd suggest all this to be discussed in #29.

Improve documentation for people who want to start contributing. As a new user I know where the pain points are. It is crazy that there are no good tutorials for non-trivial proof that map mathematical into formal knowledge.

You can certainly help there! One place to write could be the repository's wiki, or a markdown formatted page we can link to in the repository, but anything works, like a Medium article, Youtube or even TikTok videos! David did a few great ones for example (the Youtube videos).

metakunt commented 5 months ago

This is what I get after registering

Hello

We're writing to let you know that the group you tried to contact (metamath) may not exist, or you may not have permission to post messages to the group. A few more details on why you weren't able to post:

If you have questions related to this or any other Google Group, visit the Help Center at https://groups.google.com/support/.

Thanks,

Google Groups

Maybe one of you could add me manually.

metakunt commented 5 months ago

Alright great. Then I can start working on all three points. I have the feeling that the contributions are going to be worth it. I can certainly start my journey adding deduction form theorems. I will also soon use fork a rust verifier and try to implement a modular approach. I will try to get a minimal example going with the new architecture.

I'd still like to know how you did the revisions @digama0 ? Did you use any tools that would help me in my work.

Personally I think a deduction theorem is even worth it with 0 antecedents, see https://us.metamath.org/mpeuni/0red.html vs https://us.metamath.org/mpeuni/0re.html which have about the same usages.

Rust has a prelude https://doc.rust-lang.org/std/prelude/ which automatically imports crates.

"Rust comes with a variety of things in its standard library. However, if you had to manually import every single thing that you used, it would be very verbose. But importing a lot of things that a program never uses isn’t good either. A balance needs to be struck."

I would be building a prelude of useful propositional and predicate logic, set theory and complex numbers. It should contain about 4000-10000 theorems. But that is once I architect the modular approach.

tirix commented 5 months ago

We're writing to let you know that the group you tried to contact (metamath) may not exist, or you may not have permission to post messages to the group.

@david-a-wheeler could you please help here? Do you see @metakunt 's email as registered in the list?

glacode commented 5 months ago

Also I had some issues with tooling, I couldn't get yamma to run and mmj2 was extremely hard to set up (issue with java version and deceiving runtime params), also mmj2 crashes and gives some unexpected NPE when I wanted to search something. I could help to improve the tooling, that is fine. This is also a minor point.

It's been a while since I used the 'current' version of mmj2, but if I recall correctly, there's a known issue with syntax highlighting causing crashes. Disabling syntax highlighting should resolve this problem. The crashes typically occur due to a race condition, especially noticeable when rapidly undoing (ctrl+z) actions.

david-a-wheeler commented 5 months ago

See this page for the basic rules/etiquette of the mailing list, along with a link to join:

https://us.metamath.org/mm-mailing-list.html

We always welcome new members!

david-a-wheeler commented 5 months ago

We're writing to let you know that the group you tried to contact (metamath) may not exist, or you may not have permission to post messages to the group.

@david-a-wheeler could you please help here? Do you see @metakunt 's email as registered in the list?

No, I don't see that email address registered in the Metamath mailing list. I don't understand why it's giving that message. The list is still at: https://groups.google.com/g/metamath/

david-a-wheeler commented 5 months ago

Regarding some of the feedback:

metakunt commented 5 months ago

Hi @david-a-wheeler Thanks for the feedback. 2+2=4 was a bad case, maybe I should have used a different one. As of right now I can't log in on the Mailing List as I don't have a Google account. Is it possible to add my E-Mail manually?

After hearing the feedback I am glad that the community also wants to likes deduction form in some way. There are a lot of sections that could benefit them and I can start immediately.

You are right that 2+2=4 isn't enhanced right now. But it also could benefit from having a deduction form. Here is my one of my theorems where I needed to calculate something https://us.metamath.org/mpeuni/lcmineqlem22.html As it stands 1<2 is perfectly fine, however I immediately slapped a ph onto it, just count the number of a1i and syl usages 8 a1i and 2 syl usages, only because many of those theorems were in inference or closed form and not in deduction form.

This is especially egregious for lcmfnncl, which was clunky to use, as I had to copy paste a lot and find the correct theorems.

I think by adding deduction form to all used theorems I can shave the proof length by 10 steps, namely the a1i and syl usages.

If you compare that with this theorem https://us.metamath.org/mpeuni/lcmineqlem14.html where its closed form variants were annoying me so much that I added a deduction form to it. Apart from the dvdszrcl it is completely in deduction form. Not only that but it was actually a joy to write it. Autocomplete go brrrrr.

tirix commented 5 months ago

No, I don't see that email address registered in the Metamath mailing list. I don't understand why it's giving that message. The list is still at: https://groups.google.com/g/metamath/

@metakunt maybe try simply sending a mail to the subscribe link:

mailto:metamath+subscribe@googlegroups.com?subject=Subscribe

metakunt commented 5 months ago

We received your request to join

In order for us to complete the request, please reply to this email or click below:

I don't know what I would without you @tirix

david-a-wheeler commented 4 months ago

@tirix - great find! I modified the Metamath mailing list front page to show that mailto link (with minor shrouding to counter spammers). I also tweaked mm-mailing-list.html to include the mailto link. Hopefully that will make it easier to for others in the future.

metakunt commented 4 months ago

So I have tried sending a mail to metamath@googlegroups.com to try to use the mailing list, and I still get the Mail delivery error. I might be in the spam folder.

Anyways after talking to @glacode I see how to make the use of closed form theorems easier. By using backwards proof with syl you can obtain a nicer to use version.

Here is my next project. My goal is to have a automated prover for arithmetic in NN0, so that you can calculate 23309217847213 x. 29317378213945932 for example automatically. It uses decmul1c, decmul2c decmac and decma2c to break the multiplication down and then uses several addition theorems to prove the addition results. I think this will come in handy as I will need to have good estimates of certain log values. Multiplication is O(N^2) so for bigger values this might take thousands of steps, but sometimes you need good inequalities to prove results.

Anyways @tirix regarding my first github issue https://github.com/metamath/set.mm/issues/3908 I have since found an ergonomic theorem I'd like to use, which is https://us.metamath.org/mpeuni/2fvidf1od.html I think this is a standard textbook theorem how to show that a function is bijective, and it appears to be easier to satisfy than the implicit substitution variant, at least for me.

tirix commented 4 months ago

Multiplication is O(N^2)

Interestingly, the Karatsuba algorithm has complexity $\mathcal{O}(log_2 3)$, but it's probably harder to implement.

Also, you could check transformation.js of MMJ2, this also includes arithmetic if I remember well.

I have since found an ergonomic theorem I'd like to use, which is 2fvidf1od

Yes, that should work!

tirix commented 4 months ago

I still get the Mail delivery error.

Maybe @david-a-wheeler still needs to approve your message?

metakunt commented 4 months ago

Yeah, I have checked out transformations.js, it uses similar ideas that I also came up with. In particular I want an end to end arithmetic engine, where you just plug in two numbers with an operation and you get a result.

The reason I am actually writing this engine is because I liked mmj2 arithmetic engine feature so much and how much it autocompleted, that I wanted to build something even better.

Also so long as I don't have mailing list access, how would I prove bounds on log, in particular I need to show that for $x \geq 3$ the following

10 < lb^{5}(x)

where lb is the binary logarithm. Calculating it with https://www.wolframalpha.com/input?i=lb%283%29%5E5 shows that this is true, but I would need some sharp bounds on $lb$ and $x^5$ first. Since

lb(x):=\frac{log(x)}{log(2)}

we would likely need some very sharp bounds on $log(2)$. However luckily I don't need the full sharpness bound, it is enough that I show

6 < lb^{5}(x)

which should be way easier since I don't need to calculate it to a ridiculously high degree of accuracy. I am replicating part of the proof of Lemma 4.1 in https://www3.nd.edu/~andyp/notes/AKS.pdf

I am very thankful for suggestions how to tackle this problem. My first idea was to get arbitrary precise upper and lower bounds on $log(2)$, but this might be a red herring.

digama0 commented 4 months ago

(Note: as a mailing list moderator I have not received any message approval requests from @metakunt , like I normally would for a new poster. So something else must be amiss.)

How would I prove bounds on log, in particular I need to show that for $x \geq 3$, $10 < lb^{5}(x)$ holds, where lb is the binary logarithm. Calculating it with https://www.wolframalpha.com/input?i=lb%283%29%5E5 shows that this is true, but I would need some sharp bounds on $lb$ and $x^5$ first.

(BTW I usually see the binary logarithm spelled $\lg$ instead of $lb$.) The natural way I see to prove this is to construct a rational number lying between $10^{1/5}$ and $\lg 3$, resp. $6^{1/5}$ and $\lg 3$. The smallest such fraction is $84/53$ in the first case and $3/2$ in the second, so clearly lowering the precision of the estimate helps here. Given this, the strategy is that if $x\ge 3$, then $6 < (3/2)^5 \le \lg^5 x$ where the first inequality is true because $6\cdot 2^5=192<243=3^5$, and the second inequality is true because $3/2\le \lg x$ because $3\le 2\lg x$ because $2^3\le 3^2\le x^2$. (The lower bound can be improved to $7$ without changing the witness, since $7\cdot 2^5<3^5$ as well.)

metakunt commented 4 months ago

Ah yes. I clicked on the verification link, which didn't appear to work. but in the mail by Google it says you can also reply to this E-Mail, so I did now. According to Wikipedia https://en.wikipedia.org/wiki/Binary_logarithm the binary log is denoted by lb, whereas the common decadic logarithm is denoted by lg https://en.wikipedia.org/wiki/Common_logarithm The usual log is reserved for Euler's number and is either defined as the integral of $\frac{1}{x}$ or as the inverse of $e^{x}$ where applicable. Because Information theory usually deals with the binary logarithm we could define $lb$ by setting

lb(x):= \frac{\log(x)}{\log(2)}

or we could keep it explicit and just use $\log_{2}(x)$ instead. I think consensus is to keep it explicit, such as with $\sinh(x)$

Ah I got an E-Mail from googlegroups: You are now a member of the [metamath] group. I have written a test mail and it appears as I don't have any mail delivery failure. https://groups.google.com/g/metamath/c/7ORqoBAxY9M and it works. So thank you very much for your help.

So now I'd like to know how I can respond to mails. Is the discriminator the subject of the thread. So if I want for example to respond to this thread https://groups.google.com/g/metamath/c/-lWC8KrNg-Y I would have to write an email with the subject "Trying to understand 2p2e4"?

digama0 commented 4 months ago

According to Wikipedia https://en.wikipedia.org/wiki/Binary_logarithm the binary log is denoted by lb, whereas the common decadic logarithm is denoted by lg https://en.wikipedia.org/wiki/Common_logarithm The usual log is reserved for Euler's number and is either defined as the integral of $\frac{1}{x}$ or as the inverse of $e^{x}$ where applicable. Because Information theory usually deals with the binary logarithm we could define $lb$ by setting

lb(x):= \frac{\log(x)}{\log(2)}

or we could keep it explicit and just use $\log_{2}(x)$ instead.

See also https://en.wikipedia.org/wiki/Binary_logarithm#Notation. Apparently ISO suggests to use lb, but this is not in line with mathematical practice in CS and information theory. As one would expect, there is a conflict of domain-specific usage of notations in mathematics, and so there is no way to make everyone happy with a global standard. As for metamath, it uses $\log_2$ via df-logb.

So now I'd like to know how I can respond to mails. Is the discriminator the subject of the thread. So if I want for example to respond to this thread https://groups.google.com/g/metamath/c/-lWC8KrNg-Y I would have to write an email with the subject "Trying to understand 2p2e4"?

The easiest way to respond via email is to receive the message itself and then use the email's reply functionality, which would create an email with the subject `"Re: [Metamath] Trying to understand 2p2e4". I suppose you can also construct that subject line manually, though I've never tried it. If you did not receive the message in your inbox because e.g. you were not signed up when it was written, you can also use the web interface to write messages, although presumably you have to be signed into your account to do so.