gracelang / language

Design of the Grace language and its libraries
GNU General Public License v2.0
6 stars 1 forks source link

Union types #111

Open KimBruce opened 8 years ago

KimBruce commented 8 years ago

After staring at things for quite a while I've unfortunately decided that I don't understand "+" types. They make good sense for interface types. If T = interface{m1:T1 -> U1, m2:T2 -> U2} and U = interface{m1:T1 -> U1, m3: T3 -> U3}, then T + U = interface{m1:T1 -> U1}. Note, however, that this is NOT the least upper bound of those two types, as that is T | U, which is <: T + U.

So what is the definition of T + U if T and U are something other than simple interfaces? I.e. suppose T = P1 | P2 and U = Q1 | Q2? The least upper bound of these two is P1 | P2 | Q1 | Q2. However, I suspect that is not what people were meaning when they thought of it.

The spec (section 10.5.3) defines it to be the dual of & and then gives a definition like that above for interface types. The axioms supplied simply indicate that the result is some sort of upper bound. Thus they provide little help in understanding the meaning in the general case.

This came up in my trying to verify that all types have a normal form as a disjunction of interfaces. However, without a meaning for + that was hard to do.

Intersection (&) doesn't seem to provide problems as we can simply use a distributive law:(T1 | T2) & U = (T1 & U) | (T2 & U) as part of a definition (and it seems plausible). I can't convince myself that a similar distributive law for + is plausible, through we could certainly define + on variants that way.

It's worth noting that this has ramifications elsewhere. Suppose T has m:A1 -> B1 and U has m:A2 -> B2. Then what is the signature of m in T & U? We want (T & U) <: T and (T & U) <: U. If the type of m in the intersection is A' -> B' then what are A' and B'? The result B' must be a subtype of both B1 and B2. Thus it can be B1 & B2, but what is A'? It needs to be a supertype of both A1 and A2. Should it be A1 + A2 or A1 | A2?

So, what to do? I suggest that we just drop the sum operator. While it has meaning on interface types, I don't see a clear meaning on other types. We can use it informally on interfaces, but, I don't see a nice definition emerging for more complex types. If we don't need it (and I don't see we do) then we should just drop it.

apblack commented 5 years ago

I think the the problem comes from |, not from +. If I have two types A | B and C | D, how do I decide check to see if A | B < C | D?

I like + because it lets me go as far as possible without introducing |, which describes no object, only incomplete information in a type-checker. It provides an answer to the question that @kim poses in the penultimate paragraph: T & U = interface { m (a:A1+A2) -> B1 & B2 }.

Certainly, having + and | is confusing. @KimBruce's assertion that T + U is not the least upper bound of T and U, because that is T | U left me non-plussed: how can I compare T | U and T + U? They are very different animals.

kjx commented 5 years ago

how can I compare T | U and T + U?

I thought T + U and T & U were the complementary operations...

apblack commented 5 years ago

On reflection, I think that @KimBruce is right. (A | B) is a subtype of (A + B), in other words, A | B is more restrictive than (A + B). If z:(A|B) then it is also true that z:(A+B), but not the other way around.

Hence:

 (A | B) <: (A + B)

A|B is also an upper bound of A and B: if we have two operations m where one returns an A and the other a B, then, after requesting one or other of them, we will have an A | B.

A <: (A | B)
B <: (A | B)

So A|B is at least a lesser upper bound than A+B, and we may as well abolish +. I've long maintained that we needed + to express upper bounds when we create lower bounds with &, but I've been wrong: we can (and should!) use |

This will simplify the "disjunctive normal form" theorem that we need in a type-checker.