trishullab / PutnamBench

An evaluation benchmark for undergraduate competition math in Lean4, Isabelle, Coq, and natural language.
41 stars 5 forks source link

Inaccuracies related to Coq in 15 Jul 2024 arXiv paper #190

Open palmskog opened 1 month ago

palmskog commented 1 month ago

As requested on the Coq Zulip, this is a summary of what I consider to be the most important inaccuracies related to Coq in the arXiv paper.

MathComp as a unified library (p. 5)

Coq lacks an expansive unified library such as Mathlib and the HOL Library, which we make extensive use of in Lean 4 and Isabelle respectively.

As can be seen in the package list of the Coq Platform for Coq 8.18, there are 12 packages in the Platform with the prefix coq-mathcomp, which together comprise a "unified" library. The Coq ecosystem focuses on providing small reusable packages using opam, which is why the library is split into many packages. There are also additional MathComp packages hosted in math-comp GitHub organization and the Coq-community GitHub organization. My collaborators and I showed that the "core" MathComp packages/projects have consistent, high-quality naming and coding conventions. You can also see a list of MathComp-related research papers.

Wrong citation for Coqtail (p. 5)

For historical reasons, there are two independent projects both called "Coqtail", one that is a mathematical library for Coq and one that is an interface for Coq using the Vim editor. To tell them apart, we refer to the library as "Coqtail-math". The citation in the paper is to the interface and not the library, when clearly the library is meant.

CoqEAL (p. 14)

The CoqEAL library consists of two parts: (1) formalizations of results in abstract algebra primarily related to matrices using MathComp and (2) a framework for easily changing data representations inside proofs. The framework is primarily applied in computer science applications, such as to prove a fast matrix algorithm implementation on binary natural numbers correct using a previous correctness proof on Peano natural numbers. So CoqEAL is not really about extending the "functionality" of MathComp structures or similar.

Instantiating matrices or groups of real type (p. 14)

One of the maintainers of MathComp Analysis has confirmed on the Coq Zulip there is no issue with instantiating MathComp matrices or groups with realType, the primary MathComp interface to real numbers.

palmskog commented 1 month ago

For the record, it was noted by the author of Tactician that the Coq benchmark in the 15 Jul 2024 arXiv paper using Tactician is inaccurate because no instrumentation on the background math was done.

GeorgeTsoukalas commented 1 month ago

Hi Karl,

Thank you for pointing out these errata - we really appreciate that you spent the time to review our paper. We'll incorporate the changes in a subsequent revision which should appear when the benchmark has moved mostly to rely on mathcomp as a dependency. We are aware of the need to rerun Tactician, which was also relayed to us privately by the author of that work. This will also appear in the subsequent revision, when that experiment is run following any changes in the benchmark. Until that time, I'll leave the issue open. Please let us know if you spot any other issues, thanks!

palmskog commented 1 month ago

Just now, I noticed the following citation issues: