expln / metamath-lamp

Metamath-lamp (Lite Assistant for Metamath Proofs) is a GUI-based proof assistant for creating formal mathematical proofs in Metamath that does not require installation (just run it directly using your web browser).
https://expln.github.io/lamp/latest/index.html
MIT License
11 stars 4 forks source link

Unify hangs for certain statements/theorems #175

Closed BTernaryTau closed 7 months ago

BTernaryTau commented 7 months ago

I'm not sure what exactly causes this behavior, but for the below state attempting to unify all results in mm-lamp hanging at 66%.

{"srcs":[{"typ":"Web","fileName":"","url":"https://us.metamath.org/metamath/set.mm","readInstr":"ReadAll","label":"","resetNestingLevel":true,"allLabels":[]}],"descr":"","varsText":"","disjText":"","stmts":[{"label":"edgnelspthcyc.1","typ":"e","isGoal":false,"cont":"|- V = ( Vtx ` G )","jstfText":""},{"label":"edgnelspthcyc.2","typ":"e","isGoal":false,"cont":"|- I = ( iEdg ` G )","jstfText":""},{"label":"edgnelspthcyc.3","typ":"e","isGoal":false,"cont":"|- ( ph -> G e. W )","jstfText":""},{"label":"edgnelspthcyc.4","typ":"e","isGoal":false,"cont":"|- ( ph -> Fun I )","jstfText":""},{"label":"edgnelspthcyc.5","typ":"e","isGoal":false,"cont":"|- ( ph -> A =/= B )","jstfText":""},{"label":"edgnelspthcyc.6","typ":"e","isGoal":false,"cont":"|- ( ph -> E. f E. p ( f ( A ( SPathsOn ` G ) B ) p /\\ -. J e. ran f ) )","jstfText":""},{"label":"edgnelspthcyc.7","typ":"e","isGoal":false,"cont":"|- ( ph -> <. J , E >. e. I )","jstfText":""},{"label":"edgnelspthcyc.8","typ":"e","isGoal":false,"cont":"|- ( ph -> J e. X )","jstfText":""},{"label":"edgnelspthcyc.9","typ":"e","isGoal":false,"cont":"|- ( ph -> E e. ~P V )","jstfText":""},{"label":"edgnelspthcyc.10","typ":"e","isGoal":false,"cont":"|- ( ph -> A e. E )","jstfText":""},{"label":"edgnelspthcyc.11","typ":"e","isGoal":false,"cont":"|- ( ph -> B e. E )","jstfText":""},{"label":"1","typ":"p","isGoal":false,"cont":"|- ( ( ( G e. W /\\ Fun I /\\ A =/= B ) /\\ ( E. f E. p ( f ( A ( SPathsOn ` G ) B ) p /\\ -. J e. ran f ) /\\ <. J , E >. e. I /\\ J e. X ) /\\ ( E e. ~P V /\\ A e. E /\\ B e. E ) ) -> E. h E. q ( h ( Cycles ` G ) q /\\ h =/= (/) ) )","jstfText":""},{"label":"2","typ":"p","isGoal":false,"cont":"|- ( ph -> ( G e. W /\\ Fun I /\\ A =/= B ) )","jstfText":"edgnelspthcyc.3 edgnelspthcyc.4 edgnelspthcyc.5 : 3jca"},{"label":"3","typ":"p","isGoal":false,"cont":"|- ( ph -> ( E. f E. p ( f ( A ( SPathsOn ` G ) B ) p /\\ -. J e. ran f ) /\\ <. J , E >. e. I /\\ J e. X ) )","jstfText":"edgnelspthcyc.6 edgnelspthcyc.7 edgnelspthcyc.8 : 3jca"},{"label":"4","typ":"p","isGoal":false,"cont":"|- ( ph -> ( E e. ~P V /\\ A e. E /\\ B e. E ) )","jstfText":"edgnelspthcyc.9 edgnelspthcyc.10 edgnelspthcyc.11 : 3jca"},{"label":"5","typ":"p","isGoal":false,"cont":"|- ( ph -> E. h E. q ( h ( Cycles ` G ) q /\\ h =/= (/) ) )","jstfText":""},{"label":"edgnelspthcyc","typ":"p","isGoal":true,"cont":"|- ( ph -> -. G e. AcyclicGraph )","jstfText":""}]}
expln commented 7 months ago

This behavior is caused by large number of combinations of how certain assertions could be applied to the available statements. This happens for the 5th statement and for some assertions like syl333anc. For example, for syl333anc there are 864,813,056 combinations. In that case only one combination of arguments is valid. Mm-lamp is not smart enough to see this, so it tries to check all of them, but this will take a lot of time.

I implemented a new feature: to check only first N combinations. It is already available on dev. The number N is by default 10,000. It is configurable on the Settings tab under the setting "Max number of combinations". If for some statement number of combinations to check is more than 10,000 then mm-lamp proceeds to other assertions and statements, but, at the end, there will be a message under the statement indicating that not all combinations were checked.

I recorded a video how to solve the case from the description of this issue: https://drive.google.com/file/d/1-qeZiMRtIBB74M7BjWc8Wqu6y9vr7-dK/view?usp=sharing

Here is an explanation of what's going on in the video. First, I "Unify all". It hangs a bit on 66% (this is the statement 5) but finishes soon. It shows that the statement 5 is not proved but not all possibilities were examined. If at this moment I didn't have any idea how to prove this step I could do nothing unfortunately (probably, this will be true for beginners who encounter this error messages). But I knew that this step could be proved by one of the assertions like sylNNNanc. So I clicked the question mark to the left of the 5th step. This opened the bottom-up prover dialog with some settings set to emulate "Unify all" ("Search depth" = 1 is important one). "Allowed statement, first level" is set to "All", that's what causes the issue with hanging. So I need to reduce the amount of statements available for the prover. Since I know that the step 1 is the key in this proof, I select only it. I also select "Allow new steps", this will allow the prover to check the below combinations (and other similar ones):

[1 ? ? ? ? ? ? ? ? ? : syl333anc ]  =>  |- ( ph -> E. h E. q ( h ( Cycles ` G ) q /\ h =/= (/) ) )
[? 1 ? ? ? ? ? ? ? ? : syl333anc ]  =>  |- ( ph -> E. h E. q ( h ( Cycles ` G ) q /\ h =/= (/) ) )
[? ? 1 ? ? ? ? ? ? ? : syl333anc ]  =>  |- ( ph -> E. h E. q ( h ( Cycles ` G ) q /\ h =/= (/) ) )
[? ? ? 1 ? ? ? ? ? ? : syl333anc ]  =>  |- ( ph -> E. h E. q ( h ( Cycles ` G ) q /\ h =/= (/) ) )
[? ? ? ? 1 ? ? ? ? ? : syl333anc ]  =>  |- ( ph -> E. h E. q ( h ( Cycles ` G ) q /\ h =/= (/) ) )
[? ? ? ? ? 1 ? ? ? ? : syl333anc ]  =>  |- ( ph -> E. h E. q ( h ( Cycles ` G ) q /\ h =/= (/) ) )
[? ? ? ? ? ? 1 ? ? ? : syl333anc ]  =>  |- ( ph -> E. h E. q ( h ( Cycles ` G ) q /\ h =/= (/) ) )
[? ? ? ? ? ? ? 1 ? ? : syl333anc ]  =>  |- ( ph -> E. h E. q ( h ( Cycles ` G ) q /\ h =/= (/) ) )
[? ? ? ? ? ? ? ? 1 ? : syl333anc ]  =>  |- ( ph -> E. h E. q ( h ( Cycles ` G ) q /\ h =/= (/) ) )
[? ? ? ? ? ? ? ? ? 1 : syl333anc ]  =>  |- ( ph -> E. h E. q ( h ( Cycles ` G ) q /\ h =/= (/) ) )

The question marks here represent "new steps" which the prover has to derive based on the available information. In that case it should not be difficult, because the result statement and the step 1 fully define the substitution and the last row in the list of the combinations above produces a useful result. Here is a tricky moment. In the result of the bottom-up prover we see that the below was found:

5 [edgnelspthcyc.3 edgnelspthcyc.4 edgnelspthcyc.5 edgnelspthcyc.6 edgnelspthcyc.7 edgnelspthcyc.8 edgnelspthcyc.9 edgnelspthcyc.10 edgnelspthcyc.11 1 : syl333anc ]

But only the step 1 was allowed to participate in the proof search. How did the hypotheses appear there? The answer is that during the proof stage, those were really new statements derived by the prover. But at the moment of showing results, mm-lamp compared those new statement with all the existing ones and found out that there are such existing statements and the user already gave them some names. So mm-lamp doesn't show them as new statements, and also it uses existing names instead of new names. This explains importance of selecting "Allow new steps".

So the general idea of what to do in similar situations: decrease the amount of statements participating in the proof (select only key ones) and select "Allow new steps" for the remaining statements to be derived by the prover. Unfortunately, this will not resolve all such cases, but at the moment I don't have other ideas.

There is one more unintuitive moment good to be aware of. If you begin this example from the scratch, i.e. import the state from the description of this issue, click "Unify all", click the question mark to the left of the 5th step, then the bottom-up dialog opens. If you don't change anything in the dialog and click the "Prove" button, you'll see two results are found. There are two moments. First of all, those results were not found during "Unify all" because all of them depend on an unproved step - the step 1. "Unify all" never introduces justifications containing unproved steps. The second moment is that during the bottom-up proof, the new feature of using only first 10,000 combinations per assertion also applies. This means the shown results are incomplete (there is no indication of the incompleteness of results in the bottom-up prover, but I will add it).

expln commented 7 months ago

This bug is fixed in v21.