Closed davidweichiang closed 1 year ago
Am I understanding this correctly:
Let A, B, ... stand for datatype names (without their parameters).
Let t, u, ... stand for types.
Define A -> B iff there is a data declaration data A \alpha ... = rhs, and B occurs in rhs.
Define t => u iff there is a data declaration data lhs = rhs, and a substitution S, such that S(lhs) = t and u occurs in S(rhs).
A type t is recursive iff t => t and it is infinite iff t => u => * u for some u.
The original code used the fact that type A t1 ... is infinite iff A -> B -> B for some B. (That seems true, though I would feel better if I saw a proof.)
It's not the case that type A t1 ... is recursive iff A ->* A; the list in #70 with first element of type a and subsequent elements of type Bool is a counterexample. DR.hs uses this test to decide which types to eliminate, but it is moot because it comes after monomorphization and there are no type parameters by that point.
The original code considered Box (Box Bool) to be non-robust because it does cons y to visited when visiting y's arguments. I think this was a bug (#68).
I changed the checks for recursive types (except in DR.hs) to use =>, not ->. I probably made the code slower in the process. Now I believe it doesn't matter whether to cons y to visited when visiting y's arguments.
I feel that the code in this PR sticks closer to (what I imagine is) the definition of recursive/infinite types. But using -> is probably a little simpler and faster, so I could switch it back if needed.
I think I changed my mind and will revert to using ->. @ccshan any insights?
What I really wanted originally was to explicitly insert folds and unfolds during elaboration, and then subsequent phases would not have to check for recursiveness again. But that would be a lot more work, right?
Before reading the code, let me say some things y'all may already know:
Indeed DR happens after monomorphization and should treat "List Bool" as one word. Now, there are multiple ways to translate mutually recursive data type definitions into types using mu. For example
data List1 = Nil1 | Cons1 Bool List2
data List2 = Nil2 | Cons2 Bool List1
can be translated either by putting the mu/fold/unfold on List1
(and treating List2
as 1+Bool*List1
, which contains mu but not at the top level), on List2
, or on both. Of course the user can force the first translation, say, by writing
data List1 = Nil1 | Cons1Nil2 Bool | Cons1Cons2 Bool Bool List1
So I think it's ok to treat every type that's part of a loop (i.e., a recursive type as defined above: t =>* t) as something that DR must find a way to eliminate.
As for monomorphization, I should read up on it but I think the basic idea is to keep creating new instances of types and global definitions until the types required by those instances in turn are all already dealt with? And the main concern with recursion there is that it might cause monomorphization to not terminate? And a simple restriction that ensures that would be that whenever creating an instance (of a type or global definition) necessitates another instance (of the same type or global definition), the two instances must be the same?
I propose we deal with infinite instances (#70) separately from this PR.
I changed my mind back and think that this PR is ok as it stands. :)
Just a note for future reference: With the merging of #80, I think there's only one case left where the more fine-grained check for recursion here is needed, tests/good/type_parameter_different.ppl. If it weren't for that, I think it would be enough just to look at the datatype names and not do any substitution, like @HerbertMcSnout originally wrote it.
type_parameter_different.ppl is a pretty pathological case for #80 as well, so I think it would also make sense to just forbid it and simplify the functions here.
This closes bug #68 (non-recursive type being incorrectly flagged as non-robust) and a TODO that was in the source code (merge two functions for checking whether a type is recursive).
It also fixes a small bug noted in the line comments -- please see.