diprism / perpl

The PERPL Compiler
MIT License
10 stars 5 forks source link

Merge two tests for recursive types, fix bug in check for recursion #69

Closed davidweichiang closed 1 year ago

davidweichiang commented 2 years ago

This closes bug #68 (non-recursive type being incorrectly flagged as non-robust) and a TODO that was in the source code (merge two functions for checking whether a type is recursive).

It also fixes a small bug noted in the line comments -- please see.

davidweichiang commented 1 year ago

Am I understanding this correctly:

Let A, B, ... stand for datatype names (without their parameters).

Let t, u, ... stand for types.

Define A -> B iff there is a data declaration data A \alpha ... = rhs, and B occurs in rhs.

Define t => u iff there is a data declaration data lhs = rhs, and a substitution S, such that S(lhs) = t and u occurs in S(rhs).

A type t is recursive iff t => t and it is infinite iff t => u => * u for some u.

The original code used the fact that type A t1 ... is infinite iff A -> B -> B for some B. (That seems true, though I would feel better if I saw a proof.)

It's not the case that type A t1 ... is recursive iff A ->* A; the list in #70 with first element of type a and subsequent elements of type Bool is a counterexample. DR.hs uses this test to decide which types to eliminate, but it is moot because it comes after monomorphization and there are no type parameters by that point.

The original code considered Box (Box Bool) to be non-robust because it does cons y to visited when visiting y's arguments. I think this was a bug (#68).

I changed the checks for recursive types (except in DR.hs) to use =>, not ->. I probably made the code slower in the process. Now I believe it doesn't matter whether to cons y to visited when visiting y's arguments.

I feel that the code in this PR sticks closer to (what I imagine is) the definition of recursive/infinite types. But using -> is probably a little simpler and faster, so I could switch it back if needed.

davidweichiang commented 1 year ago

I think I changed my mind and will revert to using ->. @ccshan any insights?

What I really wanted originally was to explicitly insert folds and unfolds during elaboration, and then subsequent phases would not have to check for recursiveness again. But that would be a lot more work, right?

ccshan commented 1 year ago

Before reading the code, let me say some things y'all may already know:

Indeed DR happens after monomorphization and should treat "List Bool" as one word. Now, there are multiple ways to translate mutually recursive data type definitions into types using mu. For example

data List1 = Nil1 | Cons1 Bool List2
data List2 = Nil2 | Cons2 Bool List1

can be translated either by putting the mu/fold/unfold on List1 (and treating List2 as 1+Bool*List1, which contains mu but not at the top level), on List2, or on both. Of course the user can force the first translation, say, by writing

data List1 = Nil1 | Cons1Nil2 Bool | Cons1Cons2 Bool Bool List1

So I think it's ok to treat every type that's part of a loop (i.e., a recursive type as defined above: t =>* t) as something that DR must find a way to eliminate.

ccshan commented 1 year ago

As for monomorphization, I should read up on it but I think the basic idea is to keep creating new instances of types and global definitions until the types required by those instances in turn are all already dealt with? And the main concern with recursion there is that it might cause monomorphization to not terminate? And a simple restriction that ensures that would be that whenever creating an instance (of a type or global definition) necessitates another instance (of the same type or global definition), the two instances must be the same?

davidweichiang commented 1 year ago

I propose we deal with infinite instances (#70) separately from this PR.

I changed my mind back and think that this PR is ok as it stands. :)

davidweichiang commented 1 year ago

Just a note for future reference: With the merging of #80, I think there's only one case left where the more fine-grained check for recursion here is needed, tests/good/type_parameter_different.ppl. If it weren't for that, I think it would be enough just to look at the datatype names and not do any substitution, like @HerbertMcSnout originally wrote it.

type_parameter_different.ppl is a pretty pathological case for #80 as well, so I think it would also make sense to just forbid it and simplify the functions here.