RFC: Extensible pre-definition-elaboration

nomeata commented 10 months ago

Proposal

Context

Recursive function definitions are currently elaborated to a PreDefinition, which is what’s sent to the compiler/interpreter. On the way to the kernel, the PreDefinitions are grouped by strongly connected component, and then pass through Lean.Elab.addPreDefinition which turns them into non-recursive Definition suitable for the kernel.

Status quo

Currently, Lean has a few built-in modes to do this step:

Non-recursive definitions are translated straight-forwardly
unsafe definitions are turned into axioms
partial definitions are made opaque (but don’t require axioms because the result type must be Inhabited)
structurally non-mutual recursive functions are expressed using bounded recursion brec recursor of the underlying data type
well-founded recursive functions are expressed using WellFounded.fix

The code makes educated guesses: Certain annotations cause it to go for well-founded recursion, otherwise it tries structurally first and then well-founded.

This also means that these strategies have a rather simple interface:

(preDefs : Array PreDefinition) : TermElabM Unit

Motivation for more strategies

There are possibly a large number of alternative ways to process a PreDefinition. Some particularly interesting ones are

Partial functions with an explicit inductive predicate for the domain of the function
Monotone functions in a CPO, where the function exists by construction, even for non-terminating equations. In particular
- Monadic functions in the Option monad (and inhabited result type)
- Tail-recursive functions with inhabited result type.
Well-founded recursion using extrinsic “termination” proof using congruence lemmas
Coinductive definitions

It’s easy to tell that I am approaching this from an Isabelle background, where all these are possible. Of course it’s not completely trivial to port them to Lean, but at least with enough Inhabited side conditions and expecting only propositional equalities I believe they can be used here, and will be useful.

There many be many more interesting takes here that people would experiment with, or some very domain-specific ones.

It’s unlikely that we want to have to develop and maintain all of these within the main lean repository. So it seems desirable to make this part of the pipeline user-extensible.

Incidentially, this also reduces friction even for those working on the “core strategies” if bigger changes can be iterated upon in a separate project.

Proposed change (high-level)

Lean is generally a very extensible system. So it beholds of Lean to allow users to write their own PreDefinition processor, without having to touch the core repo.

So this requires

A good name for the concept
An API interface
Surface syntax

Proposed change (detail)

(This is right now mostly a straw man proposal, I don’t have much experience with designing good extensible interfaces in Lean yet).

I need help finding the right name.

Is this a “PreDefinition compiler”? “Derecursifier”? Something else?
The API interface could simply be
```
def Derecursifier :=  Array PreDefinition → TermElabM Unit
```
Maybe with certain expectations (which could be checked), such as that afterwards, certain names have been added to the environment with certain types.
For the surface syntax, I’d expect we can use the macro and elaboration system, define a syntax (syntax category?) for these derecursifiers, and an annotation that goes where terminating_by goes, so derecursify $derecursifier.

Community Feedback

(None yet)

Impact

Add :+1: to issues you consider important. If others benefit from the changes in this proposal being added, please ask them to add :+1: to it.

digama0 commented 10 months ago

Big +1 to the idea. I would suggest not giving it any frontend syntax though, and just using an environment extension to track declared derecursifiers. These are fairly easy to set up and use inside or outside the core. A representative example is:

https://github.com/leanprover/lean4/blob/e217ad3929ed1c6f3dc71d5d47e18458b2357e39/src/Lean/Linter/MissingDocs.lean#L32-L65

You can add new extensions using the @[builtin_foo] attribute inside lean core and @[foo] outside of the core.

Maybe with certain expectations (which could be checked), such as that afterwards, certain names have been added to the environment with certain types.

I would not check these expectations, any more than we would check that a tactic doesn't create invalid metavariables or other invariants. It is sufficient to just put these in the docs for the feature IMO.

There is another aspect of derecursifiers besides generating the definition itself, which is that it needs to prove equations about the definition. Currently this is managed by two environment extensions:

https://github.com/leanprover/lean4/blob/e217ad3929ed1c6f3dc71d5d47e18458b2357e39/src/Lean/Elab/PreDefinition/WF/Eqns.lean#L235-L236

so these are already extensible, but possibly the interface may need revising if the nature of equations returned changes.

nomeata commented 10 months ago

I see, we can use Environment Extensions to declare possible handlers.

How would I annotate a concrete function definitions to use a specific handler? And pass additional parameters (possibly including longer proofs even)?

digama0 commented 10 months ago

How would I annotate a concrete function definitions to use a specific handler? And pass additional parameters (possibly including longer proofs even)?

That would be new syntax, we currently don't have anything like that. I imagine you could have something like a termination_using foo clause after the definition, where foo could either be a new syntactic class or an identifier referencing a Derecursifier object (in which case it need not be globally registered like the structural and WF handlers are).

(The downside of using an identifier is that you can't pass custom parameters parsed by custom syntax, which one would certainly want for the WF extension if it was implemented through this mechanism since it wants the termination_by and decreasing_by clauses.)

nomeata commented 10 months ago

termination_using foo with foo a new syntactic class was my original idea :-)

What are the downsides?

digama0 commented 10 months ago

Adding a new (extensible) syntactic class is a comparatively large extension to the grammar, requiring documentation and so on. It might also cause some new challenges for the formatting / style guide - it's not immediately obvious how it would be formatted to me, and it might also depend on what derecursifiers need in practice, which makes it tricky to spec in the absence of such (except for the two existing derecursifiers, but these would presumably be globally registered and not use the syntax at all).

JamesGallicchio commented 10 months ago

I'd be excited for what this enables in combo with Alex Keizer's impl of QPFs in Lean4, big +1

leanprover / lean4