contaminated test set

Hi @zijunchen68, thank you for pointing this out! The miniF2F dataset that LEGO-Prover uses can be found in this repository: https://github.com/albertqjiang/miniF2F. It is a fork of the original OpenAI miniF2F dataset. In this fork, roughly 10 to 15 problems differ from the original (I don't remember the exact number), and I'm not sure why they were changed (I should ask Albert for clarification). We have used this fork for LEGO-Prover experiments since the project's early stages, and only recently discovered that some of its problems differ from the original OpenAI miniF2F dataset. However, LEGO-Prover fails to prove most of these differing problems, which suggests that reverting them to the original versions may not significantly change the reported performance.
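
For anyone who wants to check exactly which problems differ, here is a minimal sketch of the comparison. It assumes the problem statements from each repository have already been loaded into dictionaries keyed by problem name; the function name `differing_problems` and the toy statements below are hypothetical and do not come from either repository.

```python
def differing_problems(original, fork):
    """Return the sorted names of shared problems whose statements differ."""

    def normalize(text):
        # Collapse runs of whitespace so formatting-only changes are ignored.
        return " ".join(text.split())

    # Only compare problems that exist in both versions.
    shared = original.keys() & fork.keys()
    return sorted(
        name
        for name in shared
        if normalize(original[name]) != normalize(fork[name])
    )


# Toy, made-up statements purely for illustration (not real miniF2F problems).
original = {
    "problem_a": "theorem problem_a: 1 + 1 = 2",
    "problem_b": "theorem problem_b: x > 0",
}
fork = {
    "problem_a": "theorem problem_a:   1 + 1 = 2",  # whitespace change only
    "problem_b": "theorem problem_b: x >= 0",       # statement changed
}

changed = differing_problems(original, fork)
print(changed)
```

How the statements get loaded depends on the file layout of each checkout, so that step is deliberately left out of the sketch.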

wiio12 / LEGO-Prover

contaminated test set #3