Reimplement Gamepad in SerAPI's python Coq API

brando90 commented 5 years ago

github: https://github.com/ml4tp/gamepad arxiv paper: https://arxiv.org/abs/1806.00608 data set: https://drive.google.com/drive/folders/1tdltTB1ng7SGN1JqsuOjFLCcZBdFiPrc

For collaborators of this issue, make sure you read the paper and have done at least the first 2 chapters of Software foundations for Coq: https://softwarefoundations.cis.upenn.edu/lf-current/toc.html

ejgallego commented 5 years ago

We should post here a summary of the API.

brando90 commented 5 years ago

more papers that might be relevant for this section:

TacticToe: Learning to Prove with Tactics https://arxiv.org/pdf/1804.00596.pdf

Proof Construction for Coq by Tactic Learning http://t-news.cn/Floc2018/FLoC2018-pages/proceedings_paper_863.pdf

CoqHammer: http://cl-informatik.uibk.ac.at/cek/coqhammer/

What I really had in mind rather than re-implement all of Gamepad was reimplementing the things that might be relevant for RL algorithms. e.g.

Supervised learning to learn an initial policy (so a pre-training with human proofs). Basically the data set would be S_N = { x_i=(Context,Goal),y_i=human_applied_tactic }^N_{i=1}. This can also be used to pre-train the value function.

brando90 commented 5 years ago

@ejgallego (no hurry for this question) but what has gotten me stuck for a longer time is not having an easy way to talk to Coq (until we met). I wonder what on earth these guys (the last 2 links, not gamepad) used (if its not SerAPI) to produce those ML papers with Coq. You have any guesses?

ejgallego commented 5 years ago

The first paper doesn't use Coq but HOL4, in this case I'd say communication is way easier than Coq but I am not so familiar. As for the second paper, they point out that they need to write a plugin, which in a sense is what SerAPI is. So in some sense they will reimplement SerAPI and however be stuck with the limitations of the plugin API.

SerAPI is not a plugin but a toplevel itself for a large set of technical reasons.

brando90 commented 5 years ago

I can try summarizing here what I thought was the most interesting things about gamepad:

their Definition of ProofState = (Goals,ProofContext)=(G,PG)
creating a supervised data set of S_N = { (X_n =(G_n,PG_n) , Y_n = Tactic_n) }^N_{n=1} so from human Coq scripts we map what the human did at each state of the proof and what the tactic he used as the correct label.
Being able to break down the tactic to any granularity that they wanted (thus creating a rich supervised data set)
Interesting that they were able to run anything with any success on Feit Thompson based on their ridiculously simple toy example (the jump seemed too big to one of my friends here)
the definition of position evaluation as a task to be solved in ML for Theorem Proving seems very interesting (predicting how hard a proof will be basically)
Actions as tactics, state as ProofState

brando90 commented 5 years ago

@ejgallego not necessarily exactly directly related to gamepad (but could be interesting to see if we can use some of these problems ala gamepad to pre-train our agent), but are you familiar or ever heard of TPTP (http://www.tptp.org/)?

The TPTP (Thousands of Problems for Theorem Provers) is a library of test problems for automated theorem proving (ATP) systems.

if you are do you have any thoughts about it? I'd love to know what ppl in the formal methods/coq community think (or how it relates to our problem).

ejgallego commented 5 years ago

AFAICT TPTP is not really geared towards interactive theorem provers but more to automated ones, note ATP vs ITP; using machine learning for ATP is of course a very interesting topic, but I am not sure how far away would it be tho.

brando90 commented 5 years ago

@ejgallego sorry for my ignorance about the topic emilio, but is it very difficult to have ATPs and ITPs meet in the middle? Why are they so different?

ejgallego commented 5 years ago

That's a very good question, and quite hard to answer as the lines are sometimes blurry; a principal difference is that ATPs tend to offer not a lot of support for interaction, for example, such as notation, tactics, etc... ATP are more of the form "press a button, and have things work".

Some stuff in the middle is indeed https://en.wikipedia.org/wiki/Nqthm

brando90 / coq-serapi-python

Reimplement Gamepad in SerAPI's python Coq API #1