openai / miniF2F

Formal to Formal Mathematics Benchmark
309 stars 43 forks source link

Coq support? #66

Open brando90 opened 2 years ago

brando90 commented 2 years ago

Is it planned to have Coq supported too?

tlringer commented 2 years ago

I'm thinking about chipping away at a few of these in Coq later today, if anyone wants to do a push from the Coq community. Would be good to email coq-club.

brando90 commented 2 years ago

I might be interested. What would be a list/plan of what would be needed to do this?

On Feb 6, 2022, at 12:25 PM, Talia Ringer @.***> wrote:

I'm thinking about chipping away at a few of these in Coq later today, if anyone wants to do a push from the Coq community. Would be good to email coq-club.

— Reply to this email directly, view it on GitHub https://github.com/openai/miniF2F/issues/66#issuecomment-1030887335, or unsubscribe https://github.com/notifications/unsubscribe-auth/AAOE6LWNUP26BT5W6CNMEWTUZ24K7ANCNFSM5NTHE4GA. Triage notifications on the go with GitHub Mobile for iOS https://apps.apple.com/app/apple-store/id1477376905?ct=notification-email&mt=8&pt=524675 or Android https://play.google.com/store/apps/details?id=com.github.android&referrer=utm_campaign%3Dnotification-email%26utm_medium%3Demail%26utm_source%3Dgithub. You are receiving this because you authored the thread.

spolu commented 2 years ago

It would be awesome to have Coq support (even if only partial to start with).

In our experience it takes ~15-20mn to students to formalize a statement and ~5mn to review one. With a small group, covering the 488 statements shouldn't take that long!

brando90 commented 2 years ago

Cool! Once I have more open cycles I will see how to organize formalizing it. Perhaps I can share it in the Coq discuss and have people help me do it! :)

On Feb 8, 2022, at 2:04 AM, Stanislas Polu @.***> wrote:

It would be awesome to have Coq support (even if only partial to start with).

In our experience it takes ~15-20mn to students to formalize a statement. With a small group, covering the 488 statements shouldn't take that long!

— Reply to this email directly, view it on GitHub https://github.com/openai/miniF2F/issues/66#issuecomment-1032318381, or unsubscribe https://github.com/notifications/unsubscribe-auth/AAOE6LUHNI4XRIHKUF7HWPLU2DFBTANCNFSM5NTHE4GA. Triage notifications on the go with GitHub Mobile for iOS https://apps.apple.com/app/apple-store/id1477376905?ct=notification-email&mt=8&pt=524675 or Android https://play.google.com/store/apps/details?id=com.github.android&referrer=utm_campaign%3Dnotification-email%26utm_medium%3Demail%26utm_source%3Dgithub. You are receiving this because you authored the thread.

brando90 commented 2 years ago

Perhaps we can start discussing it.

Say I translate the statmenets from Lean to Coq. Then I'd need to reprove them in Coq, right? Perhaps it's trivial since lean and coq are both depedently typed? Or how would I proceed? Perhaps I am assuming translating the statement itself is trivial when it's not...?

On Feb 8, 2022, at 9:23 AM, Brando Miranda @.***> wrote:

Cool! Once I have more open cycles I will see how to organize formalizing it. Perhaps I can share it in the Coq discuss and have people help me do it! :)

On Feb 8, 2022, at 2:04 AM, Stanislas Polu @. @.>> wrote:

It would be awesome to have Coq support (even if only partial to start with).

In our experience it takes ~15-20mn to students to formalize a statement. With a small group, covering the 488 statements shouldn't take that long!

— Reply to this email directly, view it on GitHub https://github.com/openai/miniF2F/issues/66#issuecomment-1032318381, or unsubscribe https://github.com/notifications/unsubscribe-auth/AAOE6LUHNI4XRIHKUF7HWPLU2DFBTANCNFSM5NTHE4GA. Triage notifications on the go with GitHub Mobile for iOS https://apps.apple.com/app/apple-store/id1477376905?ct=notification-email&mt=8&pt=524675 or Android https://play.google.com/store/apps/details?id=com.github.android&referrer=utm_campaign%3Dnotification-email%26utm_medium%3Demail%26utm_source%3Dgithub. You are receiving this because you authored the thread.

tlringer commented 2 years ago

I was slow to get to this but still plan on it. Yes, I think translating the statement should be fairly simple since Lean and Coq are so similar, and the serious ways they differ are unlikely to show up much in these proofs. Proofs in Coq might be similar to Lean or might differ a bit, it depends on what the proof is about and what automation is present in each proof assistant.

If you submit a PR for any of them, feel free to list me and I'll take a look.

brando90 commented 2 years ago

plan to post a discussion here perhaps later? https://coq.zulipchat.com/#narrow/stream/237655-Miscellaneous

tlringer commented 2 years ago

Sure, if you post there I'll join in. I'm also happy to email coq-club to try to organize more of an effort here.

tlringer commented 2 years ago

@spolu, not sure on the permissions, but is the best way to contribute to fork? Or can we have edit access for a Coq branch?

tlringer commented 2 years ago

Forked for now: https://github.com/tlringer/miniF2F/tree/coq

@brando90 happy to develop in my fork; added you. Happy to add any other Coq proof engineers who get started on this, too, once we spread the word.

tlringer commented 2 years ago

So, the theorems are easy to state, but many proofs will be a huge pain in the ass because of Coq's real number library, which is terrible. For example, I took a shot at the first one, and it's maybe the ugliest proof I've ever written: https://github.com/tlringer/miniF2F/commit/3985d9641a7413ba1ccccf9ee8fb82c764a3027d

(@SkySkimmer helped me improve this a lot, since he seems to better understand how lra works: https://github.com/tlringer/miniF2F/commit/4dced501fd10194af60bed86f5527d33525906df)

spolu commented 2 years ago

@tlringer you can fork the repo and create a pull-request 👍

@brando90 most statements in minif2f don't have a proof. I think this is fine. Obviously there is a risk that the statement contains an error (though that risk still exists with a proof).

Anyway statements only are accepted, proofs are welcome.

spolu commented 2 years ago

Let us know how we can help!

tlringer commented 2 years ago

Good to know the theorems are enough. Sounds like any math library like this in Coq would use SSReflect, so I need to figure out whether that would change the theorem statements or just their proofs before continuing.

On Wed, Feb 9, 2022, 1:47 AM Stanislas Polu @.***> wrote:

Let us know how we can help!

— Reply to this email directly, view it on GitHub https://github.com/openai/miniF2F/issues/66#issuecomment-1033441762, or unsubscribe https://github.com/notifications/unsubscribe-auth/ACLFD77GBIGHQQZVXKPVB43U2ILYHANCNFSM5NTHE4GA . Triage notifications on the go with GitHub Mobile for iOS https://apps.apple.com/app/apple-store/id1477376905?ct=notification-email&mt=8&pt=524675 or Android https://play.google.com/store/apps/details?id=com.github.android&referrer=utm_campaign%3Dnotification-email%26utm_medium%3Demail%26utm_source%3Dgithub.

You are receiving this because you were mentioned.Message ID: @.***>

tlringer commented 2 years ago

Indeed, it will change the theorems too, so getting some SSReflect people on board is a good idea.

On Wed, Feb 9, 2022, 7:32 AM Talia Ringer @.***> wrote:

Good to know the theorems are enough. Sounds like any math library like this in Coq would use SSReflect, so I need to figure out whether that would change the theorem statements or just their proofs before continuing.

On Wed, Feb 9, 2022, 1:47 AM Stanislas Polu @.***> wrote:

Let us know how we can help!

— Reply to this email directly, view it on GitHub https://github.com/openai/miniF2F/issues/66#issuecomment-1033441762, or unsubscribe https://github.com/notifications/unsubscribe-auth/ACLFD77GBIGHQQZVXKPVB43U2ILYHANCNFSM5NTHE4GA . Triage notifications on the go with GitHub Mobile for iOS https://apps.apple.com/app/apple-store/id1477376905?ct=notification-email&mt=8&pt=524675 or Android https://play.google.com/store/apps/details?id=com.github.android&referrer=utm_campaign%3Dnotification-email%26utm_medium%3Demail%26utm_source%3Dgithub.

You are receiving this because you were mentioned.Message ID: @.***>

ejgallego commented 2 years ago

Hi folks, I'm interested on this too, and in fact I had written to @spolu a couple of days ago to ask him about that possibility. This week is impossible for me, but I propose we take advantage of the breakout sessions in next week's Coq Hackathon ?

https://github.com/coq/coq/wiki/CoqWG-2022-02

tlringer commented 2 years ago

Let's!

BTW Brando, I'm new CS faculty at UIUC also working on proof automation, with significant Coq expertise, and some ongoing machine learning work for Coq. I'm not sure who you are working with, but we should probably talk.

On Wed, Feb 9, 2022, 7:48 AM Emilio Jesús Gallego Arias < @.***> wrote:

Hi folks, I'm interested on this too, and in fact I had written to @spolu https://github.com/spolu a couple of days ago to ask him about that possibility. This week is impossible for me, but I propose we take advantage of the breakout sessions in next week's Coq Hackathon ?

https://github.com/coq/coq/wiki/CoqWG-2022-02

— Reply to this email directly, view it on GitHub https://github.com/openai/miniF2F/issues/66#issuecomment-1033779558, or unsubscribe https://github.com/notifications/unsubscribe-auth/ACLFD75BW24FCBNOKUFZWOLU2JWCRANCNFSM5NTHE4GA . Triage notifications on the go with GitHub Mobile for iOS https://apps.apple.com/app/apple-store/id1477376905?ct=notification-email&mt=8&pt=524675 or Android https://play.google.com/store/apps/details?id=com.github.android&referrer=utm_campaign%3Dnotification-email%26utm_medium%3Demail%26utm_source%3Dgithub.

You are receiving this because you were mentioned.Message ID: @.***>

ejgallego commented 2 years ago

I'd like to share our idea (our = the Inria Picube team) w.r.t. this, actually we discussed internally about doing a Coq miniF2F version after last year's AITP, however I didn't get to do any action until last week, my fault!

A nice property of Coq's and Lean type theory is that they are essentially the same, with Coq being a bit more general as of today. @skyskimmer did a Lean importer some time ago, as a proof of concept, and it worked!

So I think an interesting way to explore this task is to try to do some automated translation of the Lean dataset to Coq. That in itself seems to me an interesting and cool project which we'd love to work together on.

tlringer commented 2 years ago

Using the automatic translation in a first pass, and then revising by human review sounds like a really fun thing to do, and a great way to evaluate the automatic translation. I'd be happy to participate.

Also, sorry I keep plugging this (I am so happy there is a PhD student at UIUC who is interested in neural automation for Coq), and I'll be embarrassed if you're already in my class, but @brando90, I also have a class on proof automation this semester (likely again next semester): https://dependenttyp.es/classes/598sp2022.html

@spolu is giving a talk in the class later this semester.

Talia

On Wed, Feb 9, 2022, 8:05 AM Emilio Jesús Gallego Arias < @.***> wrote:

I'd like to share our idea (our = the Inria Picube team) w.r.t. this, actually we discussed internally about doing a Coq miniF2F version after last year's AITP, however I didn't get to do any action until last week, my fault!

A nice property of Coq's and Lean type theory is that they are essentially the same, with Coq being a bit more general as of today. @SkySkimmer https://github.com/SkySkimmer did a Lean importer some time ago, as a proof of concept, and it worked!

So I think an interesting way to explore this task is to try to do some automated translation of the Lean dataset to Coq. That in itself seems to me an interesting and cool project which we'd love to work together on.

— Reply to this email directly, view it on GitHub https://github.com/openai/miniF2F/issues/66#issuecomment-1033795480, or unsubscribe https://github.com/notifications/unsubscribe-auth/ACLFD753WJD5MW5ZL75K2ITU2JYBJANCNFSM5NTHE4GA . Triage notifications on the go with GitHub Mobile for iOS https://apps.apple.com/app/apple-store/id1477376905?ct=notification-email&mt=8&pt=524675 or Android https://play.google.com/store/apps/details?id=com.github.android&referrer=utm_campaign%3Dnotification-email%26utm_medium%3Demail%26utm_source%3Dgithub.

You are receiving this because you were mentioned.Message ID: @.***>

brando90 commented 2 years ago

@tlringer sounds great! Will contact you outside of here. :)

tlringer commented 2 years ago

BTW, @ejgallego, if we take the automatic translation approach, then manually repair specifications, I request that we commit (in the branch) the translated specifications before vetting, then patch them in new commits. This way, the diff can be used as a benchmark for specification repair, too.

spolu commented 2 years ago

Bullish! Assuming we mark the unvetted ones with a comment, or commit them commented.

brando90 commented 2 years ago

curious, is anyone working on this? Or perhaps we can try crowdsourcing it at coq zulip and a few ppl (could try myself too) could help with a few theorems? Then in a few weeks it could done if enough people help? :) Just an idea.

InnovativeInventor commented 2 years ago

I'd like to try my hand at proving a few theorems (manually) in Coq, if nobody else is working on this.

tlringer commented 2 years ago

I lost motivation after one theorem, I think because I also decided to prove the theorem and didn't use MathComp. I'd loop in @ejgallego again, since deciding whether to use MathComp's functions to state the theorems will be important. Would be happy to see you both work on it though. Zulip would probably help a lot.

ejgallego commented 2 years ago

I was expecting to get back to this next week, mainly just to set the Coq environment and port a few examples. Indeed we must use Coq's math-comp analysis library, otherwise it is not feasible to redo all analysis from scratch for this project. I'll be happy to do this bootstrap with more people in a call if you think that'd be useful

Once we bootstrap a few lemmas, how to proceed I am not sure. Examples can be ported by hand, but the statements in Coq / Lean should be similar enough as to also try a simple script to do the conversion automatically.

brando90 commented 2 years ago

I lost motivation after one theorem, I think because I also decided to prove the theorem and didn't use MathComp. I'd loop in @ejgallego again, since deciding whether to use MathComp's functions to state the theorems will be important. Would be happy to see you both work on it though. Zulip would probably help a lot.

ok created it here! https://coq.zulipchat.com/#narrow/stream/237655-Miscellaneous/topic/miniF2F hope that this helps and not split the conversation into two places.

brando90 commented 2 years ago

What is wrong with simply doing the translation of all the theorems by hand and crowdsourcing it (divide and conquer)?

Or the conversation about doing it automatically?

ejgallego commented 2 years ago

What is wrong with simply doing the translation of all the theorems by hand and crowdsourcing it (divide and conquer)?

Nothing is wrong, tho it requires people to learn the analysis library. Also, it is harder to maintain vs an automatic approach.

brando90 commented 2 years ago

Nothing is wrong, tho it requires people to learn the analysis library.

Sorry curious, why do ppl need to learn an analysis library? Like Coq's real analysis library you mean?

ejgallego commented 2 years ago

Indeed to be able to write the statements people need to learn the analysis library, in particular https://github.com/math-comp/analysis/

It is not very hard to learn, at least for the statements I've seen.

ejgallego commented 2 years ago

The library for Reals distributed by Coq is likely too basic to be able to port this, I am not sure what problems Talia found, but myself I don't use it directly as it is just missing too much stuff.

tlringer commented 2 years ago

Emilio, all, I would join such a call if someone organizes it (as long as it is not during a class I teach).

Talia

On Thu, Mar 24, 2022, 11:33 AM Emilio Jesús Gallego Arias < @.***> wrote:

The library for Reals distributed by Coq is likely too basic to be able to port this, I am not sure what problems Talia found, but myself I don't use it directly as it is just missing too much stuff.

— Reply to this email directly, view it on GitHub https://github.com/openai/miniF2F/issues/66#issuecomment-1077815652, or unsubscribe https://github.com/notifications/unsubscribe-auth/ACLFD76LZGGHEHFOGWXCRTLVBSKNBANCNFSM5NTHE4GA . You are receiving this because you were mentioned.Message ID: @.***>

spolu commented 2 years ago

Quick update that may be useful: Isabelle has now full coverage on both test and valid thanks to an heroic effort from @albertqjiang and @Wenda302.

In case useful to the Coq effort, I'm sure they'll be able to provide some perspective on the amount of work that was required to get there 🔥

As always, please do let me know if I can help with anything towards getting support for Coq 👍

brando90 commented 2 years ago

I'd like to try my hand at proving a few theorems (manually) in Coq, if nobody else is working on this.

curious @InnovativeInventor did you ever got around to doing this?

InnovativeInventor commented 2 years ago

curious @InnovativeInventor did you ever got around to doing this?

No, not yet. I'm hoping to get around to this after finals, but no promises here.

InnovativeInventor commented 2 years ago

Update: I kinda got nerd-sniped and did a few before I went to bed, just to see how much work it is. The ones that I randomly picked seemed straightfoward (https://github.com/tlringer/miniF2F/pull/2), but I probably picked problems that were too easy. I think most (if not all) competition math problems in this dataset shouldn't require any real analysis, but I could be very wrong here. Perhaps math-comp is overkill?

Big disclaimer: I have never used math-comp.

brando90 commented 2 years ago

May I get an update of where this is at? If there are any formalization may I get access to them? Thanks!

brando90 commented 2 years ago

does someone have a list of potential automatic (e.g. formal ones or even with LLMs) methods to autoformalize these statements? e.g. automatic tool, opts Lean -> Coq, Isabelle -> Coq, HOlight -> Coq, metamath -> Coq?

brando90 commented 2 years ago

automatic tool, opts Lean -> Coq

curious in particular about that one.

InnovativeInventor commented 2 years ago

May I get an update of where this is at? If there are any formalization may I get access to them? Thanks!

My most up-to-date progress is on this PR: https://github.com/tlringer/miniF2F/pull/2. Haven't worked on this lately, though. I don't know if others are working on this atm.