This is an excellent summary, and extremely helpful.
It seems to me that singleton elimination lies at the root of the problem. The idea behind Prop is to separate data we care about from things that are computationally irrelevant, so we can ignore the irrelevant parts when computing. But when terms contain eq.rec or acc.rec, things that should be computationally irrelevant become relevant, because the computation can't continue without computing them.
Is that a reasonable description of the issue? It still makes me uncomfortable to invest a lot of effort to compute things that are computationally irrelevant. My gut feeling is that if there is any value to working in a world with Prop, it is that it helps us to be clear about what we want to compute and what we don't.
Maybe an alternative strategy is to give up on fast computation in Lean for terms that use eq.rec. We can still use dependent type theory to reason about non-dependent algorithms, and even reason about algorithms that use eq.rec. But to run the latter efficiently in Lean, we would have to reflect / translate / compile them down to algorithms that do not "use" eq.rec in the computation.
Is that a reasonable description of the issue?
Yes, I fully agree.
Maybe an alternative strategy is to give up on fast computation in Lean for terms that use eq.rec
I think both eq.rec and acc.rec are extremely useful.
We would have to give up fast computation for any term using "casts" and/or "well-founded recursion" :-(
I know. Casts come up, for example, in the dependent coding of vec A n. But if we encode vec A n as lists with length n, it seems to me that we can design our algorithms in ways that avoid casts.
Similarly, there are ways of defining well-founded recursions that don't require singleton elimination for acc.rec. In the past, we defined div and gcd in that way.
Maybe these are less slick from a functional programming perspective, but to me, it seems a lot more natural than computing with elements of Prop.
But if we encode vec A n as lists with length n, it seems to me that we can design our algorithms in ways that avoid casts.
We would still need casts. BTW, I tried to write vec A n using the subtype approach.
It is not as nice (https://github.com/leanprover/lean/blob/master/library/data/vec.lean).
We can try to make it nicer with better automation.
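The definition there is roughly a subtype of lists constrained by their length, something like:
-- sketch of the subtype encoding; the actual library file has more boilerplate
definition vec (A : Type) (n : nat) := subtype (λ l : list A, list.length l = n)
Every operation on vec then has to carry the length proof around, which is where the extra noise comes from.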
Similarly, there are ways defining well-founded recursions that don't require singleton elimination for acc.rec. In the past, we defined div and gcd in that way.
They are less powerful and less convenient to use.
I don't see how we could define find_x (https://github.com/leanprover/lean/blob/master/library/data/nat/choose.lean) without using acc.rec.
In the approach we used in the past for div and gcd, we had to provide a measure. That is, how many "recursive" calls we are allowed to make. It is like estimating the amount of "fuel" we need to make sure we can compute the result.
For div and gcd, it is easy to compute how much "fuel" we need. This is not the case for find_x.
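To make the "fuel" idea concrete, here is a sketch (not the exact old definitions; div_core is just an illustrative name) of how div can be written by structural recursion on an explicit fuel argument:
open nat
-- the first argument bounds the number of recursive calls;
-- the recursion is structural on it, so no acc.rec is needed (sketch only)
definition div_core : nat → nat → nat → nat
| 0        x y := 0
| (succ f) x y := if y = 0 ∨ x < y then 0 else succ (div_core f (x - y) y)
-- the dividend itself is always enough fuel
definition div (x y : nat) : nat := div_core x x y
For find_x there is no obvious fuel bound we can compute up front.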
Maybe these are less slick from a functional programming perspective, but to me, it seems a lot more natural than computing with elements of Prop.
This is why I was excited about OTT. In OTT, we don't have to compute with elements of Prop. coe computes independently of the structure of the proof term. The trick for acc.rec is similar: acc.rec would also compute independently of the structure of proof terms.
These points are all well taken. But let me just say that find_x is exactly the sort of thing I would prefer to avoid. I don't want to compute a natural number by proving it exists and then searching for it. That seems like a bad way to write programs. The natural way to compute a natural number is to describe an algorithm to compute it.
find_x is just an example. It shows we can write a recursive definition when we have a proposition that guarantees that it will terminate. The same approach can be used to define other functions.
BTW, if we have the acc.rec trick described above, we would also be able to use even classical reasoning to justify that a recursive function terminates; acc.rec would not get stuck.
Here is extra motivation for supporting OTT in Lean. Consider the type class described at #650
inductive finite_set [class] (xs : set T) :=
| mk : ∀ (fxs : finset T), to_set fxs = xs → finite_set xs
A set T is used as a parameter for the type class.
Now, given h : s2 = s1, fxs1 : finite_set s1, a term such as
(eq.rec_on h fxs1) : finite_set s2
will appear in formalizations using this type class.
This term is performing a "cast" using an equality between sets.
Since most equalities between sets are derived using funext (and consequently quot.sound), the term above will get stuck during evaluation.
The term above may occur in a term such as
@card A s2 (eq.rec_on h fxs1)
This is not an unreasonable term: it computes the size of s2 using fxs1 : finite_set s1 and a proof h that s2 = s1. However, even if this term is closed, the evaluation will get stuck reducing eq.rec_on h fxs1 at
eq.rec fxs1 (quot.sound ...)
Additional remark: just checking if all parameters of an inductive datatype declaration have been used is not sufficient to prevent the inconsistency derived at https://gist.github.com/leodemoura/fadcc0b6a916e76510de
We can still derive false by "exploiting" the fact that Prop is impredicative:
open eq
inductive I (F : Type₁ → Prop) : Type₁ :=
| mk : (∀ (A : Type₁) (p : F A), p = p) → I F
-- The declaration above is accepted because
-- (∀ (A : Type₁) (p : F A), p = p) is a proposition,
-- and "lives" in Prop (i.e., Type.{0})
axiom InjI : ∀ {x y}, I x = I y → x = y
definition P (x : Type₁) : Prop := ∃ a, I a = x ∧ (a x → false)
definition p := I P
lemma false_of_Pp (H : P p) : false :=
obtain a Ha0 Ha1, from H,
subst (InjI Ha0) Ha1 H
lemma contradiction : false :=
have Pp : P p, from exists.intro P (and.intro rfl false_of_Pp),
false_of_Pp Pp
@codyroux suggested the following syntactic restriction: all parameter and index types appear negatively in the type of one constructor (i.e., positively in the type of one of the arguments). This restriction covers list A, vector A, tree A, etc. However, we would not cover a functor-like type
inductive F (A B : Type) :=
| mk : (A -> B) -> F A B
@codyroux suggested the following syntactic restriction: all parameter and index types appear negatively in the type of one constructor (i.e., positively in the type of one of the arguments).
I might misunderstand something, but how does that help? We can easily modify your contradiction proof such that the parameter F appears negatively in the constructors (since the constructor is not used at all in the proof). For example:
inductive I (F : Type₁ → Prop) : Type₁ :=
| mk : F unit → I F
@fpvandoorn You are right, it doesn't help.
I was thinking about the following alternative.
Instead of relying on inductive datatype injectivity, we derive the equations we need from the constructors.
For example, given I F_1 = I F_2 and your declaration, we can deduce that F_1 unit = F_2 unit.
It seems more compatible with a set-theoretic interpretation where types are interpreted by sets, and the type I is being generated by F unit instead of F.
Moreover, the equality F_1 unit = F_2 unit is the one we need to implement the coercion reduction rule for I.mk.
I think we just need to be careful with empty types. For example, given
inductive I (A : Type) : Type :=
| mk : A -> false -> I A
and I A_1 = I A_2, there is no reason to believe that A_1 = A_2.
I think we can avoid this issue by guarding the generated equalities with witnesses for the argument types. Here is another example: suppose we have
inductive I (a : A) :=
| mk : Pi (b : B a) (c : C a b), I a
Then, we get the following axioms
I.mk.eq.1 {a_1 : A} {a_2 : A} (b : B a_1) (c : C a_1 b) (e : I a_1 = I a_2)
: B a_1 = B a_2
I.mk.eq.2 {a_1 : A} {a_2 : A} (b : B a_1) (c : C a_1 b) (e : I a_1 = I a_2)
: C a_1 b = C a_2 (coe (I.mk.eq.1 b c e) b)
With them, we can generate the following coercion-reduction rule
-- Given {a_1 : A} {a_2 : A} (b : B a_1) (c : C a_1 b) (e : I a_1 = I a_2)
coe e (@mk a_1 b c) ==> @mk a_2 (coe (I.mk.eq.1 b c e) b) (coe (I.mk.eq.2 b c e) c)
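For instance, instantiating this schema for list and its cons constructor (so the parameter a is the element type A, B a := a, and C a b := list a), we would get something like:
-- sketch: the instance of the schema above for list.cons
list.cons.eq.1 {A_1 A_2 : Type} (b : A_1) (c : list A_1) (e : list A_1 = list A_2)
  : A_1 = A_2
list.cons.eq.2 {A_1 A_2 : Type} (b : A_1) (c : list A_1) (e : list A_1 = list A_2)
  : list A_1 = list A_2
-- Given {A_1 A_2 : Type} (b : A_1) (c : list A_1) (e : list A_1 = list A_2)
coe e (@cons A_1 b c) ==> @cons A_2 (coe (list.cons.eq.1 b c e) b) (coe (list.cons.eq.2 b c e) c)
(For cons, the second equality is just e itself, since C a b := list a does not depend on b.)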
@avigad find_x is sometimes seen as a form of Markov's principle. One then thinks of Prop as the double-negated propositions. Of course, find_x can be defined (in HoTT) by using the axiom of description.
I am not sure whether this solves anything, but it may give a slightly different perspective.
@leodemoura Thanks for the clear description of the problem. Off hand, I don't see a solution, but there is a bit of development in cubical that may help. Once I understand it, I'll let you know.
@spitters I know find_x is a form of Markov's principle. My point is that Markov's principle does not make for good programs, and I would not mind giving it up.
@leodemoura's point is well taken, though, that it is generally inconvenient to have to provide an "external" measure.
As for the injectivity paradoxes, it seems to me that they are all variants of the proof that, given any set S, there is no injection from the power set of S to S.
I think it is possible to make sense of the idea of interpreting types as "labelled sets", so that inductive types are labelled by the sets that parametrize them. The paradox shows that if the types of the parameters are all in Type.{n}, we have to put the inductive type in Type.{n}, to make sure we have enough labels. For example, if one of the parameters is F : Type.{1} -> Prop, we have to put the inductive type in Type.{2}, because Type.{1} -> Prop is in Type.{2}.
I think one can identify special cases where we do not have to go up a type. For example, for parameters A : Type.{1}, or A : B -> Type.{1} with B : Type.{1}, there should be enough labels in Type.{1}. That will let us keep list A in Type.{n} when A is in Type.{n}.
In short, for types without indices, I think we can avoid the injectivity paradoxes just by increasing the type level of some inductive types.
Do you think this is good enough?
Do you think this is good enough?
It depends on the special cases we can identify.
We don't want to bump the universe level of list, sigma, vector, sum, tree, prod, ...
As you wrote above, we want list A and list (list A) to both be in Type.{n} when A is in Type.{n}.
Similarly, we want nat × nat × nat × nat to be in Type.{1} instead of Type.{4}.
For example, the following definition would not work if we bump universe levels.
open nat
definition tuple (A : Type) : nat → Type
| 0 := unit
| (n+1) := A × tuple n
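To spell out why (with the usual rules, under which prod preserves the universe level of its arguments):
-- tuple nat 3 unfolds to nat × (nat × (nat × unit)), which stays in Type.{1};
-- if prod bumped the level, the universe of tuple nat n would grow with n,
-- so tuple could not be given the single type nat → Type
check tuple nat 3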
The trick of replacing H in acc.rec F H with
acc.intro x (fun (y : A) (H2 : R y x), acc.inv H H2)
also addresses issue #722. However, as described above, the convertibility checker may loop.
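Spelled out, the replacement is what lets the usual ι-reduction for acc.rec fire (a sketch):
-- proof irrelevance lets us replace H by an acc.intro term
acc.rec F H  ==>  acc.rec F (acc.intro x (fun (y : A) (H2 : R y x), acc.inv H H2))
-- and then the recursor can actually step
             ==>  F x (fun y H2, acc.inv H H2) (fun y H2, acc.rec F (acc.inv H H2))
Each acc.rec F (acc.inv H H2) can be unfolded again in the same way, which is also why the convertibility checker may fail to terminate on open terms.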
In the comments above, we saw that the following term gets stuck:
@card A s2 (eq.rec_on h fxs1)
We can use the following trick to avoid that. First, we define the following auxiliary "cast"-like function.
definition finite_set_cast {A : Type} {s₁ s₂ : set A}
: s₁ = s₂ → finite_set s₁ → finite_set s₂
| h (finite_set.mk fs eqv) := finite_set.mk fs (eq.rec_on h eqv)
Now, consider the following example
definition s₁ : set nat := λ a, a ∈ [1, 2, 3]
definition s₂ : set nat := λ a, a ∈ [3, 2, 1]
lemma s₁_eq_s₂ : s₁ = s₂ := ...
definition fs₁ : finite_set s₁ := ...
eval @card nat s₂ (eq.rec_on s₁_eq_s₂ fs₁)
eval @card nat s₂ (finite_set_cast s₁_eq_s₂ fs₁)
The first eval gets stuck as discussed in the previous comments. However, the one using the finite_set_cast custom "cast" function doesn't.
This is a nice idea: you change the proof of correctness, but you leave the data alone.
The trick seems general. For example, you can use it with the class differentiable f x discussed in this thread:
https://groups.google.com/forum/#!topic/lean-discuss/yiIU2cG93v0
Given H : x = y and even H' : forall x, f x = g x, we can cast an element of differentiable f x to an element of differentiable g y without blocking computation.
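In the same shape as finite_set_cast, the pattern is roughly this (a generic sketch; S stands for any such "data plus proof" class, not the actual definition of differentiable from that thread):
-- S bundles data d with a proof relating d to the index i
inductive S (I D : Type) (P : D → I → Prop) (i : I) : Type :=
| mk : Π (d : D), P d i → S I D P i
-- the cast rebuilds the constructor, keeping d and transporting only the proof
definition S_cast {I D : Type} {P : D → I → Prop} {i j : I}
  : i = j → S I D P i → S I D P j
| h (S.mk d p) := S.mk d (eq.rec_on h p)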
I'm closing this one. It has been inactive for several months. My current position is: OTT would marginally improve the user experience.
After we added quotient types, we lost a nice property: every closed term evaluates to a value. We have "stuck" terms such as eq.rec H1 (quot.sound H2) that do not evaluate to a value. Here is a real example demonstrating this issue: https://gist.github.com/leodemoura/fadcc0b6a916e76510de
Another issue is how to justify propositional erasure. For example, suppose we want to implement an efficient evaluator for closed terms that erases propositions. This evaluator would manage to reduce the term
@eq.rec (finset nat) s₁ (λ x, num) 0 s₂ dummy_eq
(in the previous link) to 0. Now, assume we want to use the fast evaluator to perform proofs by reflection. It feels weird that the fast evaluator would be able to "compute" more than the standard one. Moreover, the fast evaluator allows us to type check terms that do not type check with the default evaluator.
Note that equality is not the only proposition that can produce "stuck" terms. In general, any proposition that can eliminate into Type can produce "stuck" terms (i.e., the ones that support singleton elimination). Equality is one of them: eq.rec (Lean's J rule) allows us to eliminate into Type. Another example is the accessibility predicate acc used to define well-founded recursion (https://github.com/leanprover/lean/blob/master/library/init/wf.lean#L9).
OTT seems a natural choice for addressing (some of) these problems. If we manage to implement it, we would have a solid story for justifying propositional erasure, and we would get a lot of nice properties, such as: a closed term of type nat always evaluates to a value (i.e., 0 or succ^k 0).
As far as I understand, the main points of OTT are: Prop is irrelevant (we already have that in Lean); (heterogeneous) equality is a builtin type; add equality reduction rules; replace the J rule with a coe (coercion operator); and have coe reduction rules that "compute" independently of the proof term for "A = B". I think this last item is the main insight of OTT. In the OTT paper, they provide coe reduction rules for a fixed set of type formers: Sigma, W, Pi, ...
Now, the first problem: how to support user-defined inductive datatypes? More precisely, given a user-defined inductive datatype, how do we automatically generate the coe reduction rules? For simple types such as list A, everything looks natural. We would have the following equation and coe reduction rules:
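Roughly, something like this (a sketch; ==> denotes reduction):
-- the equation rule: list is injective as a type former
list A = list B ==> A = B
-- coe is pushed through the constructors
coe (list A) (list B) H nil         ==> nil
coe (list A) (list B) H (cons a l)  ==> cons (coe A B H a) (coe (list A) (list B) H l)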
Note that the first reduction rule is saying list is injective, and this property is used to define the coe reduction rule. That is, the term coe A B H ha is only type correct because list A = list B reduces to A = B. At first sight, it seems natural to assume that inductive types are injective. However, in general, this is unsound. The following link has a counterexample: https://gist.github.com/leodemoura/0c88341bb585bf9a72e6 In this link, we derive false by assuming that an inductive datatype that has an "unused parameter" is injective. This issue has been discussed in the Agda mailing list (https://lists.chalmers.se/pipermail/agda/2010/001530.html). There is a similar thread in the Coq mailing list (http://comments.gmane.org/gmane.science.mathematics.logic.coq.club/4348). So, the issue is: can we have a simple syntactic check for establishing when it is safe to assume that an inductive datatype is injective? The check must cover all useful inductive datatypes. By "useful", I mean it should cover at least the ones used in the Lean standard library. BTW, it seems Agda is now rejecting any inductive datatype declaration that contains unused arguments. However, I did not find any rigorous treatment of this issue (when it is safe to assume inductive datatypes are injective).
Second problem: indexed families. In Lean (like most other systems), we support indexed families. It is not clear how to derive the coe-reduction rules for indexed families such as:
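For instance, something along these lines, where the index is a list and the app constructor appends the indices of its arguments (a sketch, not a declaration from the library):
-- terms indexed by the list of their free variable names (sketch)
inductive Lam : list nat → Type :=
| var : Π (x : nat), Lam [x]
| app : Π (s1 s2 : list nat), Lam s1 → Lam s2 → Lam (s1 ++ s2)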
The problem here is that ++ is not injective. So, I don't see how we can define the coercion reduction rule for app. That is, given an equality (H : Lam (s1 ++ s2) = Lam (s3 ++ s4)), to move the coercion inside of the app constructor we need (H1 : Lam s1 = Lam s3) and (H2 : Lam s2 = Lam s4), but these two equalities do not follow from H :-( So, as far as I understand, this kind of inductive family cannot be supported in OTT.
Another critical issue is how to support the accessibility predicate acc. Its eliminator acc.rec also eliminates into Type. One option is to assume that acc is also a builtin operator, and make sure that acc.rec F H "computes" independently of the structure of (H : acc R x). That is, H may not reduce to acc.intro ... (the constructor of acc). One possible trick is to observe that H : acc R x is definitionally equal (by proof irrelevance) to acc.intro x (fun (y : A) (H2 : R y x), acc.inv H H2). So, whenever the evaluator is stuck at acc.rec F H, we can simply replace H with acc.intro x (fun (y : A) (H2 : R y x), acc.inv H H2), and the computation will continue. We cannot use this trick on open terms because the type checker could loop even when we restrict the trick to the case where F is a lambda (e.g., the type checker could loop in a context containing false assumptions). So, our idea is to restrict the trick above to closed terms. In this way, at least, we can claim that closed terms do not get "stuck". We still need a more careful analysis of this issue.
Finally, we would have to eliminate "singleton elimination" from the Lean kernel.