coq / coq

Coq is a formal proof management system. It provides a formal language to write mathematical definitions, executable algorithms and theorems together with an environment for semi-interactive development of machine-checked proofs.
https://coq.inria.fr/
GNU Lesser General Public License v2.1
4.86k stars 650 forks source link

Regression proving with Coq: past, present, and future #9262

Open palmskog opened 5 years ago

palmskog commented 5 years ago

In ongoing work, we are investigating regression proving techniques for Coq. Regression proving is concerned with re-checking affected proofs after a change has been made to a verification project.

Our main goal is to increase the productivity of proof engineers in large scale projects, where a short "edit-check cycle" is critical. Inspiration comes primarily from regression testing research in software engineering: we consider Coq proofs analogous to program tests. In particular, we assume engineers make generous use of CI and other automation as in software projects.

Here is an attempt at a summary of the current state of our work and ideas for future improvements. Many of these improvements require changes in Coq itself.

Background

Our starting point is the document-oriented model and underlying proof processing architecture introduced in Coq 8.5. Specifically, the .vio file format introduced in that version supported two key features for the batch processing toolchain:

  1. asynchronous checking of a single opaque proof in a file (e.g., coqc -check-vio-tasks 1 A)
  2. fine-grained proof checking parallelism (e.g., coqc -schedule-vio-checking 3 A.vio B.vio)

These two features mirror the possibilities in software batch test execution, where individual tests can be run in isolation, and where test execution can be parallelized both at the test method and test class level.

iCoq: sequential regression proof selection

Regression test selection techniques for Java-like languages track dependencies among classes and methods and use this information to perform change impact analysis when a project is modified. Regression test selection tools are typically run in CI and persist dependency metadata between runs.

iCoq is our analogue of Java regression test selection tools for Coq and builds on asynchronous checking of opaque proofs. It analyzes dependencies between definitions and lemmas, and finds and checks (only) impacted proofs. It can be run either locally or in CI.

iCoq implementation highlights:

See an example of how iCoq works.

Highlights from empirical study on version histories of large projects:

Main limitations of iCoq:

Possible improvements:

piCoq: parallel regression proving

iCoq performed only sequential checking. piCoq extends iCoq to parallel checking and includes classical modes for incremental, file-level parallel checking (similar to coq_makefile). We included more histories of large projects for comparison.

Implementation highlights:

No selection File selection Proof selection
File parallelism
  • - [x] .vo
  • - [x] .vo
  • - [ ] .vo
Proof parallelism
  • - [x] .vio
  • - [x] .vio
  • - [x] .vio

Highlights of empirical study on version histories of large projects (including Verdi, Flocq, and Coquelicot):

Possible improvements:

Suggestions towards future regression proving support

palmskog commented 5 years ago

@gares @charguer let me know what you think and if you have questions or additions. Thus far, I feel our work has mostly been about exploring feasibility. But with help from Coq developers, we believe regression proving tools could become robust enough to be adopted by large projects, especially in their CI.

We will also try to prepare specific API suggestions that can be discussed at CoqPL.

cc: @ejgallego