google-deepmind / alphageometry

Apache License 2.0
4.07k stars 459 forks source link

doubt the effectiveness of the model #73

Open songge25 opened 7 months ago

songge25 commented 7 months ago

This is a Chinese high school math problem image

Let me rewrite the problem as follows: a b c = triangle a b c; a d e = triangle a d e; e = on_tline e a a b, eqdistance e a a b; b = on_tline b c a b; d = eqdistance d e a c ? perp e d a c

The model will stay stuck with no results. Did I write something wrong, or the model can't handle this kind of math?I doubt it

unhandyandy commented 7 months ago

I think the "p" in pline is for parallel, not perpendicular; try tline instead.

a b c = triangle a b c doesn't make it a right triangle. I think you're supposed to imagine constructing the diagram one point at a time, so you can't impose a condition on a point after introducing it. But I may be wrong about that.

songge25 commented 7 months ago

I agree with you that build at a time is needed and I will try it.

Can you understand the meaning of the second line of def.txt

unhandyandy commented 7 months ago

As I said in #67, I think the colon separates outputs on the left from inputs on the right.

unhandyandy commented 7 months ago

Another issue is that if c happens to be below b,then the proposition fails. Maybe you should try starting with a rtrapezoid, so you can control the orientation of the points.

Ehisnet commented 7 months ago

I had a similar issue when trying to solve other geometry problems aside the ones provided by the owner of the Alpha geometry. Mine was USAMO 2023 P1. It was running fine for a while then later got stuck. I will be glad if we can collaborate in developing the right syntax or Alpha geometry programming languages ( just like Wolfram Alpha , Mat Lab and the rests) used by Alpha geometry in converting English language geometry problems otherwise, the whole essence of the project is a waste of time and energy.

Here is the syntax I generated for USAMO 2023 p1 with the help of Chat GPT: The original English Problem from USAMO website

image

The translated Alpha Geometry problem using Chat GPT usamo_2023_p1 a b c = triangle a b c; o = midpoint o b c; p = foot p c a b; d = on_tline d a a o; e = on_circle e d b; q = on_tline q b b c; f = on_circle f d a, on_circle f e b; n = midpoint n a q ? cong o n o c

I believe if Alpha Geometry can solve IMO problems successfully, then Olympiad geometry problems of lower difficulty compared to IMO should not be a challenge at all.

You can also try running this code in your project with the necessary privileges to be sure if the problem is not from my machine.

Ehisnet commented 7 months ago

The strategy I used in developing this translated versions was that I fed Chat GPT with a lot of training data from the translated IMO_ag_30.txt and the examples.txt until it was able to produce an output more like the ones used by Alpha Geometry that ran without any errors but as i said it was running successfully but got stucked.

It is either there are still some hidden issues with the translated problems as we do not know to what extent it agrees with the way other problems have been translated, or my machine can not handle such computation I doubt this case or we need to adjust the beam size, depth etc to match the problem specifications.

But am of the opinion that we need to pass all the Alpha Geometry tests and reproduce the results that was obtained by the owner of Alpha Geometry before we can concretely make any conclusions.

unhandyandy commented 7 months ago

@Ehisnet This seems to work:

usamo_2023_p1
a b c = triangle a b c; m = midpoint m b c; p = foot p c a m; d = circumcenter d a p b; q = on_circle q d b, on_line q b c; n = midpoint n a q ? cong n b n c

I was worried that there doesn't seem to be a way to specify that q not= b, but maybe AG defaults to forming new points.

unhandyandy commented 7 months ago

@songge25 Yes, I'm having trouble with that problem too. Here's my translation:

e a b c = r_trapezoid e a b c; d = on_line d a b, eqdistance d e a c ? perp e d a c

I ran it for 2 hours with a GPU and slightly increased parameters:

BATCH_SIZE=2
BEAM_SIZE=8
DEPTH=4

It timed out without finding a solution.

It occurs to me that the problem itself is almost trivial: just apply the congruence principle SSA, which is valid when A is right or obtuse. Maybe the problem is that AG can't deal with a rule that applies only to certain angles?

unhandyandy commented 7 months ago

So the code is getting stuck in the while loop beginning at line 500 of graph.py. It's trying unsuccessfully to satisfy the goal perp. Not knowing exactly the architecture of the system, I'm guessing it can't find any possible path from the assumptions to the goal.

unhandyandy commented 7 months ago

OMG, I forgot to constrain two sides of r_trapezoid to be congruent. But I'm not sure how to do that without losing the orientation...

Geomathart commented 7 months ago

Can you make the angle constraint with r_triangle and then the two sides constraint with eq_trapezoid?

unhandyandy commented 7 months ago

@Geomathart But in an eq_trapezoid, the two congruent sides are opposite each other. In the OP problem, they're adjacent.

It seems that for this problem you need to be able to constrain the points E and C to be on the same side of AB. Not clear to me that's possible in AG.

Geomathart commented 7 months ago

I made it solve the problem, after i gave him the first step/altering the problem. Instead of ED=AC i gave him AD = BC a b = segment a b; c = on_tline c b a b; d = on_line d a b, eqdistance d a b c; e = on_tline e a a b, eqdistance e a b a ? perp d e c a

(on_line and eqdistance seem not to perform well together if one distance is not on the line, thats why i had to slightly change the problem to make it work. -> after tweaking the input it worked, 3 comments after this one)

EDIT: Fixed a typo in my constructioncommand

Geomathart commented 7 months ago

@Ehisnet The chat gpt implementation of the USAMO 2023 P1 is not correct. (P is not normal on AM,...)

a b c = triangle a b c; d = midpoint d b c; e = foot e c a d; o = circle a b e; f = on_circle f o b, on_line b c; n = midpoint n a f ? cong n b n c worked, but i did not yet confirm the proof.

unhandyandy commented 7 months ago

@Geomathart Nice going! Why don't you add your translation to #67 on language documentation.

unhandyandy commented 7 months ago

I'm a little worried about this a b = segment a b; c = on_tline c b a b; d = on_line d a b, eqdistance d a b c; e = on_tline e a a b, eqdistance e a b a ? perp d e c a There doesn't seem to be anything to constrain d to be between a and b. Or to constrain e to be on the same side of ab as c.

Geomathart commented 7 months ago

@unhandyandy yeah, i thought about that as well, but it did not find a way to contrain them to one side. Interestingly after many tries AG always found a solution because if d is not between a and b it flips e automatically to the other side to get the desired solution.

unhandyandy commented 7 months ago

@Geomathart Do you mean many runs, or many tried on one run?

Geomathart commented 7 months ago

AG finds the solution instantly in one run. I made over 10 runs and it always found the solution. Randomly making D between A and B, without me changing the statement.

EDIT: I found a way to constrain it to one side, i forgot the s_angle command... Therefore it is always constructed like the statement and it even works with the original statement DE =AC !

a b = segment; c = s_angle a b c -90; e = s_angle b a e 90, eqdistance e a b a; d = on_line d a b, eqdistance d e a c ? perp d e c a

unhandyandy commented 7 months ago

So what did you mean by "after many tries AG always found a solution"? edit: Oh, you meant that it always found a solution on the many runs you tried, got it.

I hadn't even noticed the s_angle command, thanks!

Geomathart commented 7 months ago

I unintentionally altered the meaning of the sentence by using "after" as a filler word. I meant i made many runs and AG came in each run to a solution.

unhandyandy commented 7 months ago

I'm looking at your suggestion. a b = segment; c = s_angle a b c -90; e = s_angle b a e 90, eqdistance e a b a; d = on_line d a b, eqdistance d e a c ? perp d e c a Another ambiguity is that d does not seem constrained to lie on the same side of ae as b.

Ehisnet commented 7 months ago

@Ehisnet This seems to work:

usamo_2023_p1
a b c = triangle a b c; m = midpoint m b c; p = foot p c a m; d = circumcenter d a p b; q = on_circle q d b, on_line q b c; n = midpoint n a q ? cong n b n c

I was worried that there doesn't seem to be a way to specify that q not= b, but maybe AG defaults to forming new points.

image

Thanks so much this work. So what was wrong with the translation and can we develop the alphageometry programming language so we can learn how to code the english language problems into alphageometry high level language or syntax

unhandyandy commented 7 months ago

@Ehisnet As I think Geomathart pointed out above, the command p = foot p c a b should be p = foot p c a o I haven't checked if that's the only problem.

Ehisnet commented 7 months ago

@Ehisnet The chat gpt implementation of the USAMO 2023 P1 is not correct. (P is not normal on AM,...)

a b c = triangle a b c; d = midpoint d b c; e = foot e c a d; o = circle a b e; f = on_circle f o b, on_line b c; n = midpoint n a f ? cong n b n c worked, but i did not yet confirm the proof.

Yes you are correct, please can you provide lists of all the syntax and rules used in this translations. thanks

Ehisnet commented 7 months ago

@Geomathart @unhandyandy

image

Hello guys can we translate this problem above

unhandyandy commented 7 months ago

@Ehisnet The chat gpt implementation of the USAMO 2023 P1 is not correct. (P is not normal on AM,...) a b c = triangle a b c; d = midpoint d b c; e = foot e c a d; o = circle a b e; f = on_circle f o b, on_line b c; n = midpoint n a f ? cong n b n c worked, but i did not yet confirm the proof.

Yes you are correct, please can you provide lists of all the syntax and rules used in this translations. thanks

If I knew of such a list I'd be happy to share it. I suggest we add what we figure out to #67 .

unhandyandy commented 7 months ago

@Geomathart @unhandyandy image

Hello guys can we translate this problem above

I doubt it. :) It's way too early for something like that. Where did you find it?

Ehisnet commented 7 months ago

@Geomathart @unhandyandy image Hello guys can we translate this problem above

I doubt it. :) It's way too earlier for something like that. Where did you find it?

I found it on a contest website, am of the opinion we compile all the rules and syntax we have found so far and add to #67 just as you have suggested.

Ehisnet commented 7 months ago

@Geomathart @unhandyandy image Hello guys can we translate this problem above

I doubt it. :) It's way too earlier for something like that. Where did you find it?

I found it on a contest website, am of the opinion we compile all the rules and syntax we have found so far and add to #67 just as you have suggested.

I was trying to train chatgpt into producing the syntax used in the translation but we can always correct it from our experience and with that we can produce the documentations. what do you guys think?

Ehisnet commented 7 months ago

@unhandyandy @Geomathart Here was the translation of the problem using chatgpt can we make our inputs

ortho_triangle a b c = triangle a b c d e f = feet a b c u = on_circumcircle a d ? intersection d f o = circumcenter a b c x = on_tline a o ? intersection e f e' = reflection e d x f' = on_tline f d x ? parallel d f x p = on_tline a e' ? intersection d u x ? distance d u = distance p d + distance p x q = on_tline a f' ? intersection d u x ? distance d u = distance q d + distance q x result = coll e f ? coll e' q ? coll f' p

Ehisnet commented 7 months ago

ALPHAGEOMETRY_IMO_30.pdf Here is a sample of documents for the syntax

unhandyandy commented 7 months ago

ALPHAGEOMETRY_IMO_30.pdf Here is a sample of documents for the syntax

Maybe you we could make a google sheet that's easy for everyone to contribute to.

Ehisnet commented 7 months ago

okay i will do just that. thank you

On Sat, Feb 17, 2024, 3:56 PM unhandyandy @.***> wrote:

ALPHAGEOMETRY_IMO_30.pdf https://github.com/google-deepmind/alphageometry/files/14319077/ALPHAGEOMETRY_IMO_30.pdf Here is a sample of documents for the syntax

Maybe you we could make a google sheet that's easy for everyone to contribute to.

— Reply to this email directly, view it on GitHub https://github.com/google-deepmind/alphageometry/issues/73#issuecomment-1950242208, or unsubscribe https://github.com/notifications/unsubscribe-auth/AFAKDKKIHGQBIHAD752ITRLYUDHJ3AVCNFSM6AAAAABCZKKP62VHI2DSMVQWIX3LMV43OSLTON2WKQ3PNVWWK3TUHMYTSNJQGI2DEMRQHA . You are receiving this because you were mentioned.Message ID: @.***>

Ehisnet commented 7 months ago

https://docs.google.com/document/d/1K4QspqnGFCJ9hpFyPUvaJXE7SAWbt3XRBBtyXzGhsOo/edit?usp=sharing

@unhandyandy @Geomathart @songge25 @yotam Hi guys let's make our contributions to the google docs above on Alpha Geometry Language Documentations.

Geomathart commented 7 months ago

(The link does not work.) EDIT: the Link does work, just the redirection from github does not. In the paper are most of the constructions documented, in the def.txt some more and the syntax. But yeah, we need an accessable trained gpt model to translate text to the syntax. @Ehisnet except for the last Points P and Q, your new problem can be made with AG. But it should be possible to train AG on this problem by adding ellpses, to the definitions and a training set with added ellipses.

@unhandyandy about the ambiguity of my last solution: Because there is no implemented constrain for point sequence on a line, AG looks for a solution on both sides, which satisfies the other constrains and of course only finds the solution between A and B for D, to get the desired proof. EDIT: You can constrain it with d = s_angle e a d -90, eqdistance d e a c, but then it will not find the solution, (with eqdistance d a a c you get one).

songge25 commented 6 months ago

image

a b c = triangle a b c; d = on_line d a b; e = on_line e b d, eqdistance e a d b; f = on_pline f e c b, eqdistance f e b c ? cong a c d f

==========================

what the ∠(BC-EF) = ∠(BF-CE) mean?

Ehisnet commented 6 months ago

(The link does not work.) EDIT: the Link does work, just the redirection from github does not. In the paper are most of the constructions documented, in the def.txt some more and the syntax. But yeah, we need an accessable trained gpt model to translate text to the syntax. @Ehisnet except for the last Points P and Q, your new problem can be made with AG. But it should be possible to train AG on this problem by adding ellpses, to the definitions and a training set with added ellipses.

@unhandyandy about the ambiguity of my last solution: Because there is no implemented constrain for point sequence on a line, AG looks for a solution on both sides, which satisfies the other constrains and of course only finds the solution between A and B for D, to get the desired proof. EDIT: You can constrain it with d = s_angle e a d -90, eqdistance d e a c, but then it will not find the solution, (with eqdistance d a a c you get one).

Hello @unhandyandy I have fixed the problem, the link should work now

Ehisnet commented 6 months ago

In the proof steps you provided, the expression ∠(BC-EF) = ∠(BF-CE) represents an equality of angles. Let's break it down:

1.

∠(BC-EF): This denotes the angle formed between the lines BC and EF. It could be an angle at point B or C, depending on the context. 2.

∠(BF-CE): This denotes the angle formed between the lines BF and CE. Similarly, it could be an angle at point B or F, depending on the context.

The statement ∠(BC-EF) = ∠(BF-CE) asserts that these two angles are equal. In geometry, when two angles have the same measure, they are considered equal. This equality is often used in geometric proofs to establish relationships between different parts of a diagram.

In the specific context of your proof, this equality is one of the steps used to show the congruence of certain triangles using the Side-Angle-Side (SAS) criterion, leading to the conclusion FD = CA.

On Fri, Feb 23, 2024 at 3:33 AM songge25 @.***> wrote:

image.png (view on web) https://github.com/google-deepmind/alphageometry/assets/37694761/4adb079e-cb41-46c1-83e8-f0fa7a53c8eb a b c = triangle a b c; d = on_line d a b; e = on_line e b d, eqdistance e a d b; f = on_pline f e c b, eqdistance f e b c ? cong a c d f

-

From theorem premises: A B C D E F : Points B,D,A are collinear [00] EA = DB [01] B,E,D are collinear [02] FE = CB [03] FE ∥ CB [04]

Auxiliary Constructions: : Points

Proof steps:

  1. BC ∥ EF [04] ⇒ ∠BCF = ∠EFC [05]
  2. FE = CB [03] & ∠BCF = ∠EFC [05] (SAS)⇒ FB = CE [06]
  3. FE = CB [03] & ∠BCF = ∠EFC [05] (SAS)⇒ ∠(BC-EF) = ∠(BF-CE) [07]
  4. B,D,A are collinear [00] & B,E,D are collinear [02] & ∠(BC-EF) = ∠(BF-CE) [07] & EF ∥ BC [04] ⇒ ∠DBF = ∠AEC [08]
  5. EA = DB [01] & FB = CE [06] & ∠DBF = ∠AEC [08] (SAS)⇒ FD = CA

what the ∠(BC-EF) = ∠(BF-CE) mean?

— Reply to this email directly, view it on GitHub https://github.com/google-deepmind/alphageometry/issues/73#issuecomment-1960687488, or unsubscribe https://github.com/notifications/unsubscribe-auth/AFAKDKIXYMN4NSCIVJNAQVDYVAEX5AVCNFSM6AAAAABCZKKP62VHI2DSMVQWIX3LMV43OSLTON2WKQ3PNVWWK3TUHMYTSNRQGY4DONBYHA . You are receiving this because you were mentioned.Message ID: @.***>

songge25 commented 6 months ago

but the EF and BC all parallel,there is no angle,the ∠(BC-EF) may be not the means

songge25 commented 6 months ago

image

Ehisnet commented 6 months ago

I see

On Fri, Feb 23, 2024, 6:27 AM songge25 @.***> wrote:

image.png (view on web) https://github.com/google-deepmind/alphageometry/assets/37694761/a3d4ac1f-0178-4be8-afed-bb0dc302d1b5

— Reply to this email directly, view it on GitHub https://github.com/google-deepmind/alphageometry/issues/73#issuecomment-1960795165, or unsubscribe https://github.com/notifications/unsubscribe-auth/AFAKDKOQPPV5R6MWJ6F6KU3YVAZDPAVCNFSM6AAAAABCZKKP62VHI2DSMVQWIX3LMV43OSLTON2WKQ3PNVWWK3TUHMYTSNRQG44TKMJWGU . You are receiving this because you were mentioned.Message ID: @.***>

songge25 commented 6 months ago

somebody know it?

tpgh24 commented 4 months ago

The current version of AG indeed has many limitations, it needs a tremendous amount of hardware to achieve IMO level ability. Also the code is not very robust. If your run is stuck, chances are the problem syntax is wrong.

I made some improvements in a fork repository and have some ideas to improve it, check out AG4Masses and issue 110. I also share many of my experiences testing AG, including a detailed explanation of the problem definition language.