siddhartha-gadgil / LeanAide

Tools based on AI for helping with Lean 4
Apache License 2.0
62 stars 2 forks source link

Evaluation data #1

Closed siddhartha-gadgil closed 1 year ago

siddhartha-gadgil commented 1 year ago

Current status

There are currently three files in this repository that contain together 120 test statements:

There is also a lean program that can run with various configurations to see how many of these elaborate. After the setup following README.md an example run is:

build/bin/bulkelab silly 10 4 false 15 8

This attempts to elaborate all statements in silly-prompts.txt with 10 example prompts based on sentence simliarity, 4 based on keywords with 15 Codex completions for each statement at temperature 0.8.

siddhartha-gadgil commented 1 year ago

Testing performance

Two ways one can test performance:

siddhartha-gadgil commented 1 year ago

Guidelines for expansion

We are working with a partial binary port of mathlib. So all examples should be such that they work with this, and ideally there are related prompts in clean_prompts.json

siddhartha-gadgil commented 1 year ago

Stale, not clear what this means in terms of working now.