Open mmcdon20 opened 2 months ago
The reason that this is not allowed is that type aliases are not nominative, but structural. They do not exist as types, only as ways to express types.
An alias like
typedef LinkedList<T> = (T head, LinkedList<T>? tail);
doesn't introduce a new type, only a name for an existing type.
So what is the type of the second field of that record type?
Which actual argument can I call a ChurchNum
with?
We don't know. This definition doesn't define a type, but a recursive constraint on a type. We have to define how to solve that equation.
The obvious answer is to give the fixed point. The only problem is that that is an infinite type, and we need to represent it in finite memory, and time, so we can't just unfold it.
The other option is to introduce fixed point operators in the type model. That means introducing a structural type construction like:
πππ© T = (T head, LinkedList<T>? tail);
That is defined to be the (least?) fixed point of that equation. Which is an infinite, but regularly recurring, type.
We have to derive that type from a declaration like:
typedef F<X, Y> = Map<X, F<String, List<Y>>>;
by ... probably being clever and seeing which parameters occur recursively. This one could be recognized as equvalent to a directly specified fixed-point:
typedef F<X, Y> = Map<X, πππ© $F<Y> = Map<String, List<$F<List<Y>>>>>;
Maybe.
(Brings up the point that there are multiple ways to define the same type. Fx.
πππ© F = List<F>
and List<πππ© F = List<F>>
are equivalent, which can be seen by one unfolding.
Are:
typedef B = List<C>;
typedef C = List<B>;
and equivalent to
typedef L = List<L>;
? Sure. But how easy is it do detect it?)
The defintion would have to not be directly recursive. A πππ© T<X> = T<List<X>>
is not allowed, since the fixed point of that is empty. However πππ© T<X> = List<T<X>>
would be fine, it adds an outer type construct, then only recurses inside, something nominative types can sometimes do as well (abstract class C<T> implements List<C<T>> {}
is allowed).
Then we need to define what every function we have on types means for this type: subtyping, normalization, least upper bound, and assig an interface to the type (which the "no direct recursion" helps with, since that will likely give an outermost type).
And it's a real type, which can exist at runtime as the reified your argument of, say, a list, so we need to do subtyping tests at runtime too.
And we have to figure out what it means if someone combines more than one of those with each other.
That's a possible approach. It might be undecidable. I'm not enough of a type theorist to know, but it looks too much like lambda calculus to not make me fear that it's Turing complete. It's definitely complicated at a very high level.
My case:
typedef State = State? Function(Parser parser);
State state = fragment;
while (index < length) {
state = state(parser) ?? fragment;
}
The recursive types that actually work are usually the ones including functions, nullability or collections.
Functions because parameter types are contravariant, and you don't have to call it, and it doesn't have to return its return type (it can throw).
Nullability because for general union types, it works when one of the parts is a non-recursive type. Nullable types is that kind of union type (and FutureOr
isn't).
Collections because they can be empty.
That's all examples of not having to actually produce a value of a component type of the recursive type, so the recursion can end for an actual value.
Saying typedef TurtlesAllTheWayDown = List<TurtlesAllTheWayDown>;
isn't particularly useful, the only solutions to that equation are Never
, List<Never>
, List<List<Never>>
, etc., so you never get to actually have a useful value in there. But it's not an empty type because <Never>[]
works.
Saying typedef Void = ({value: Void});
just defines an empty type (no values have the type). The RHS has an unconditional covariant occurrence of the recursive type, so creating an instance of the type requires having an instance of the type.
The definition typedef Void = Void;
is just as empty, if we allow that. If we see a type definition as a set of acceptable types, we'd create that set inductively starting with just {Never
}, and then the solution here is just Never
.
So, there are some uses of recursive types that make sense. Not as many as one would hope, especially without general union types, but thaving one that involves a function isn't surprising.
You can have your recursive type today, if you're willing to put a class in the middle. Say:
class State {
final _State? Function(Parser) _callback;
State(this._callback);
State? call(Parser parser) => _callback(parser);
}
State fragment = State(_fragmentCallback);
// ...
State state = fragment;
while (index < length) {
state = state(parser) ?? fragment;
}
Nominative types (class types, really, because extension types have problems with self-references too) is the officially recommended way to have self-referential types.
It works because the compiler doesn't need to unfold the recursion until it reaches an object of that type, and the type system ensures that the value will satisfy the type. The object itself knows its precise type (instance of that class, type arguments are this and that), which tells everything that is needed about the type. There is no need to check component values, it's entirely a type comparison operation.
For structural types, there is no inherent type to compare against. Well, not for records at least, functions do have an inherent function type. And union types have as much type as the actual value. So it's records. Still, the compiler needs to be able to check component values against the unfolded type, so it needs to unfold the recursive type. Then the problem becomes stopping again. And if any of the components have a recursive type, we need to match them up, somehow. (At least instances of structural types are usually immutable, so we can't create structurally typed object cycles, which would require a recursive structural type to represent the actual runtime type.)
Recursive structural types are not necessarily impossible. Heck, the Trine language (the programming language used in CS 101 when I studied) had recusive structural typing, and was able to recognize that List<πππ© T=List<T>>
and πππ© T=List<T>
was the same type, and the same as πππ© T = List<List<T>>
, after doing some unwinding and cycle normalization. (And, in some cases, an exponentially large error message when two types weren't equal, which suggests that it wasn't always an effective algorithm. Or maybe that was the generic modules, it's been a while.)
Subtyping of recursive types is well studied. It's also very complicated for the user to reason about, and one can think of object oriented type systems as a way to dodge the issue by encapsulating the recursive references. Adding proper recursive types back into an object oriented language will raise the complexity again. So not something that is a slam dunk, even if possible and expressive, it might just be too complicated for the common user, and the workarounds are "good enough".
(We did manage to allow class C<T extends C<T>> {}
which has an infinitely unfolding bound. And our instantiate-to-bounds algorithm gives up precisely for that kind of F-bounded polymorphism, because we don't have recursive types. With them, it could solve the instantiate-to-bounds as C<πππ© T = C<T>>
. But we can also just keep doing what we're doing today, and use a super-bounded type as solution. It's not precise, but it's also not wrong.)
I was curious if there were any other languages that support a recursive typedef like in this proposal.
I found that most languages which support a typedef do not support exactly these recursive definitions (although in some cases the language may have some alternative solution).
// illegal cyclic type reference: alias ... of type LinkedList refers back to the type itself
type LinkedList[T] = (T, Option[LinkedList[T]])
// illegal cyclic type reference: alias ... of type Church refers back to the type itself
type Church = Church => Church
// Recursive type alias in expansion: LinkedList
typealias LinkedList<T> = Pair<T, LinkedList<T>?>
// Recursive type alias in expansion: Church
typealias Church = (Church) -> Church
-- Cycle in type synonym declarations:
-- Main.hs:8:1-45: type LinkedList a = (a, Maybe (LinkedList a))
type LinkedList a = (a, Maybe (LinkedList a))
-- Cycle in type synonym declarations:
-- Main.hs:7:1-30: type Church = Church -> Church
type Church = Church -> Church
// type alias 'LinkedList' references itself
typealias LinkedList<T> = (T, LinkedList<T>?)
// type alias 'Church' references itself
typealias Church = (Church) -> Church
// generic type cannot be alias
// invalid recursive type: LinkedList refers to itself
type LinkedList[T interface{}] = struct {
head T
tail *LinkedList[T]
}
// invalid recursive type: Church refers to itself
type Church = func(Church) Church
// This type definition involves an immediate cyclic reference through an abbreviation
type LinkedList<'t> = 't * Option<LinkedList<'t>>
// This type definition involves an immediate cyclic reference through an abbreviation
type Church = Church -> Church
// error[E0391]: cycle detected when expanding type alias `LinkedList`
type LinkedList<T> = (T, Option<LinkedList<T>>);
// error[E0391]: cycle detected when expanding type alias `Church`
type Church = fn(Church) -> Church;
// error: use of undeclared identifier 'LinkedList'
template<class T>
using LinkedList = std::tuple<T, LinkedList<T>>;
// error: unknown type name 'Church'
using Church = Church(*)(Church);
(* Error: The type abbreviation linkedlist is cyclic *)
type 'a linkedlist = 'a * 'a linkedlist option;;
(* Error: The type abbreviation church is cyclic *)
type church = church -> church;;
-- This type alias is recursive, forming an infinite type!
type alias LinkedList a = (a, Maybe (LinkedList a))
-- This type alias is recursive, forming an infinite type!
type alias Church = Church -> Church
However, I found that Typescript DOES allow exactly this sort of recursive typedef.
The following is a valid Typescript program:
type LinkedList<T> = [T, LinkedList<T>?];
type Church = (c: Church) => Church;
The following definitions are also allowed in Typescript:
type TurtlesAllTheWayDown = Array<TurtlesAllTheWayDown>;
type Void = {value: Void};
However this definition is not:
type Void = Void; // Type alias 'Void' circularly references itself.
The Haxe language appears to also support the same rules as Typescript for it's typedef feature.
typedef LinkedList<T> = {head: T, tail: Null<LinkedList<T>>};
typedef Church = Church -> Church;
These definitions are also allowed:
typedef TurtlesAllTheWayDown = Array<TurtlesAllTheWayDown>;
typedef Void_ = {value: Void_};
However this definition is not:
typedef Void_ = Void_; // Recursive typedef is not allowed
The following definitions are allowed in Odin. Although I will point out that the Church
definition is not very useful in Odin due to the fact that proc
expressions do not close over any values that are in scope, so with a nested proc
you cannot refer to an argument in an outer proc
from an inner proc
, but the type checking does work.
LinkedList :: struct($T: typeid) { head: T, tail: ^LinkedList(int) }
Church :: proc(Church) -> Church
For TurtlesAllTheWayDown
, technically the type alias appears to be allowed by the compiler, however when you try to create an instance of TurtlesAllTheWayDown
you get a Segmentation fault.
TurtlesAllTheWayDown :: []TurtlesAllTheWayDown
turtles : TurtlesAllTheWayDown = {} // Segmentation fault (core dumped) (exit status 139)
The following is allowed:
Void :: struct {value: ^Void}
But these definitions are not:
Void :: struct {value: Void} // Illegal declaration cycle of `Void`
Void :: Void // Invalid declaration value 'Void'
This language is not free or open source, so I didn't actually try this one out, but it's clear from the documentation that recursive typedef is allowed in this language: https://www.puppet.com/docs/puppet/5.5/lang_type_aliases.html#creating-recursive-types
Given that there does seem to be a handful of languages that support this feature, it seems plausible that it may at least be technically possible for dart to support it.
If anyone is interested in experimenting with recursive typedef I think Haxe and Typescript are probably the best languages to try.
Typescript is probably being clever enough to recognize that the recursion goes through a nullable type or a function type. The reason that matters is that you can create a value of the recursive type without already having a value of the type. (Or it's being naive enough to not recognize that there is a problem, it just never tries to unfold the type until it needs to. It's easier in a language without type soundness.)
type LinkedList<T> = [T, LinkedList<T>?]; // [1, null] is a LinkedList<int>
type Church = (c: Church) => Church; // function(_) { throw "Nope"; } is a Church
type TurtlesAllTheWayDown = Array<TurtlesAllTheWayDown>; // [] is a TurtlesAllTheWayDown
type Void = {value: Void}; // {get value() { throw "nope"; }} is a Void
A type like type Moar<T> = (T, Moar<T>);
requires you to have a value of the type before you can create a value of the type, which means the least fixed point of the type is the empty type.
Doing typedef Moar<T> = (T, Moar<T>?);
means that you can create an object of the type just by having a null
value, and for typedef RecMoar<T> = RecMoar<T> Function(RecMoar<T>);
, a value of (Object? _) => throw "nope"
is an instance of the type, because you can use Never
as the concrete expression of the type without needing to evaluate that expression.
The distinction between nominal type and structural type can also be seen in Haskell.
The type Void = Void
is not allowed, but data Void = VOID(Void)
is. It's a named type with a constructor, like a class in Dart. It's still empty because it's impossible to call the constructor without an existing value of the type it constructs.
Apologies if this is a duplicate, but I don't think there is an existing issue specifically on this topic.
Occasionally I run into situations where it would be nice if a
typedef
could refer to itself by name in its definition.Often this takes the form of a simple data structure like one of the following:
The alternative is to write a class which is significantly more verbose:
Another example, I came across this Rosetta Code problem called Church numerals.
I attempted to translate the java solution to dart as follows:
The above solution works, but is not particularly type safe. It would be better if we could define
ChurchNum
as follows:You can of course implement a type safe solution by defining
ChurchNum
as a class that accepts aChurchNum Function(ChurchNum)
in the constructor, but then you have unnecessary wrapper objects, and a more verbose solution.A third situation where this could be useful is if union types were also added to the language.
With both features you could define a
Json
for example:That said, I think this feature would be useful even on its own.