google-deepmind / alphageometry

Apache License 2.0
3.81k stars 419 forks

Synthetic Data Generation #72

Open lilyj97 opened 5 months ago

lilyj97 commented 5 months ago

I was wondering how the training set was generated (the synthetic diagrams plus their accompanying proofs). Which script should I run? graph.py looks promising, but it is evidently not the script they used to generate the 100M synthetic theorem proofs and diagrams. For example, see the images below:

[Three screenshots, dated 2024-02-01]

I would greatly appreciate the help!

Ehisnet commented 5 months ago

The owners of the AlphaGeometry source code are not releasing all the details of this research, so we will all have to work together to develop the model.

2nazero commented 4 months ago

Hi :) Could you please help me find the code for drawing shapes like the ones in your screenshots? I'm also searching for any clues that might lead me to it. Thank you very much for your assistance.

ParthaEth commented 3 months ago

I am also struggling to find the script that generates the synthetic proofs. I would appreciate it if somebody could point me to it.

tpgh24 commented 2 months ago

To further improve the performance of AG, I think we need to collect data based on human-designed problems. I made some improvements in a forked repository and have ideas for further improvements; see AG4Masses and issue 110. There I give a detailed analysis, based on many tests I ran, of why I think we need data based on human-designed problems.