gracelang / language

Design of the Grace language and its libraries
GNU General Public License v2.0

Sequence Constructors, Blocks can't be annotated with types #175

Open apblack opened 5 years ago

apblack commented 5 years ago

We have a notation for a sequence constructor

[exp_1, exp_2, exp_3]

but we don't have a way of giving it a type. Do we want to add one? The "obvious" syntax would be

[exp_1, exp_2, exp_3]⟦ElementType⟧

For example,

["one", "two", "three", "four"]⟦String⟧

Note that the expressions inside the sequence constructor can be anything at all, so if we say "we should infer the types", then we are committing to inference everywhere — including inferring the result types of methods on arbitrary objects. If we do that, we have Haskell or ML, not Grace. (Is full type inference on an OO language even decidable?)

An alternative to adding type information to sequence constructors is to say that the type of a sequence constructor is always Sequence⟦Unknown⟧. If you want to assert otherwise, you can convert it:

[exp_1, exp_2, exp_3] >> sequence⟦String⟧

which has type Sequence⟦String⟧ but will in general require a dynamic check on each element. Of course, a decent implementation (not minigrace) would get rid of the dynamic check in the case where the exp_n are statically known to be Strings.
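The per-element dynamic check can be sketched in TypeScript (used here purely for illustration; `sequenceOfStrings` is a hypothetical name, not part of any Grace implementation):

```typescript
// Sketch of a runtime-checked conversion analogous to
// [exp_1, exp_2, exp_3] >> sequence⟦String⟧: every element is
// checked dynamically, which is the cost described above.
function sequenceOfStrings(elts: unknown[]): string[] {
  for (const e of elts) {
    // dynamic check on each element; a good implementation could
    // elide this when the elements are statically known to be strings
    if (typeof e !== "string") {
      throw new TypeError(`element ${String(e)} is not a String`);
    }
  }
  return elts as string[];
}

const ok = sequenceOfStrings(["one", "two", "three", "four"]); // passes
// sequenceOfStrings(["one", 2]) would throw a TypeError at runtime
```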

apblack commented 5 years ago

Blocks also can't be typed. We can put type annotations on the parameters of a block, but not on the result. The above idea could be extended to blocks:

{a:Number, b:String → ...}⟦Boolean⟧

would be a block with type Function2⟦Number, String, Boolean⟧
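For comparison, a language with first-class function types can express the same thing directly; a TypeScript sketch (the `Function2` alias and `longerThan` example are illustrative, not Grace):

```typescript
// The analogue of {a:Number, b:String → ...}⟦Boolean⟧: a
// two-argument function whose result type is declared explicitly.
type Function2<A, B, R> = (a: A, b: B) => R;

const longerThan: Function2<number, string, boolean> =
  (a: number, b: string): boolean => b.length > a;

const r = longerThan(3, "four"); // true: "four" has length 4, and 4 > 3
```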

KimBruce commented 5 years ago

In statically typed languages, return types of functions are inferred and then checked against the declaration. I.e., the return type is redundant and mainly there for documentation and to make sure that the inferred type is the one the programmer intended. This would be true as well of a statically-typed Grace.

KimBruce commented 5 years ago

... In a purely statically typed dialect, the types of blocks and sequences should be inferable (aside from the empty list). If the dialect doesn't do full static typing, then it can infer the return type of a block or sequence as Unknown when there is insufficient information.

apblack commented 5 years ago

I was thinking about this after our meeting was over. The problem I see is that we have defined the types of method results to depend on the type arguments, not on type inference. For example, in

def seq = sequence⟦Number⟧ [2, 4, 6, 8]
def max = seq.at 5 ifAbsent { 10 }

we can infer that the type of seq is Sequence⟦Number⟧ because its value has that type. But as far as I can see, max has to have type Number | Unknown (when we would like it to have type Number).

Why is this? The type of at(_)ifAbsent(_) is defined in Sequence⟦T⟧ as follows:

at⟦W⟧(index:Number) ifAbsent(action:Function0⟦W⟧) ->  T | W 

where T = Number. In the absence of an explicit type argument for W, this is Number | Unknown.

Yes, we can easily infer that { 10 } has type Function0⟦Number⟧. But that's irrelevant: we don't have a way of saying that the return type of at⟦W⟧(_)ifAbsent(_) depends on the type of the action argument. All we can say is that it depends on W. I raised this problem in issue #156, where I proposed the keyword forAll to introduce an implicit type parameter. I observed that we can make a type depend on the value of a type parameter, but not on the type of a value parameter, which is what is needed here. (Emerald allowed both.) I still think that this is going to be a problem.
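The ability being asked for here does exist elsewhere: in TypeScript, for instance, a type parameter can be inferred from the static type of a value argument, so the result type effectively depends on the argument's type. A sketch (names are illustrative; Grace sequences are 1-indexed, mimicked here):

```typescript
// W is inferred from the static type of the action argument, so the
// result type T | W depends on an argument's type -- the ability the
// comment says Grace's method signatures lack.
function atIfAbsent<T, W>(seq: readonly T[], index: number, action: () => W): T | W {
  // Grace-style 1-based indexing
  return index >= 1 && index <= seq.length ? seq[index - 1] : action();
}

const seq = [2, 4, 6, 8];
const max = atIfAbsent(seq, 5, () => 10); // W inferred as number, so max: number
```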

Because [...] is a language construct, we can give it a dependent typing rule. But the requests sequence⟦T⟧(elts), list⟦T⟧(elts), etc., are not: they are just method requests, and we have no way of saying that the result of sequence⟦T⟧(elts) depends on the type of elts, only that it depends on the value of T.

We get the same problem when constructing collections. If we don't provide an explicit type argument to the class, and instead define something like

def lst = list [2, 4, 6, 8]

then it doesn't help us to infer that the sequence constructor has type Sequence⟦Number⟧. Because there is no type argument provided to the list class, it has to return an object with type List⟦Unknown⟧.

kjx commented 5 years ago

we can make a type depend on the value of a type parameter, but not on the type of a value parameter, which is what is needed here. (Emerald allowed both).

static or dynamic type?

I still think that this is going to be a problem.

I think there will be lots of problems if we start doing static typing properly.

def lst = list [2, 4, 6, 8]

If I keep saying def lst = list(2, 4, 6, 8) you'll get (even more) annoyed with me. But your point is right: special-casing the types of [...] things doesn't help that much.

apblack commented 5 years ago

we can make a type depend on the value of a type parameter, but not on the type of a value parameter

static or dynamic type?

Static type, because it's only statically that we care. So list would have a signature something like

    forall T list (elts:Collection⟦T⟧) -> List⟦T⟧ 

which would mean: the static type of the argument to list must be Collection⟦E⟧ for some E: whatever that E is, bind T to it in the result type List⟦T⟧.
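A TypeScript rendering of that proposed signature may make the binding concrete (purely illustrative; `list` here is a hypothetical free function, not Grace's class):

```typescript
// Sketch of: forall T list (elts:Collection⟦T⟧) -> List⟦T⟧
// T is bound from the static type of the argument, so the result
// type List⟦T⟧ follows from the argument's element type.
function list<T>(elts: readonly T[]): T[] {
  return [...elts];
}

const lst = list([2, 4, 6, 8]); // T bound to number, so lst: number[]
```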

apblack commented 5 years ago

The problem with this proposal is that the dynamic semantics of the language would depend on the exact type inference rules, and thus the type inference rules would need to be part of the language definition.

This is probably true if we allow inference at all.

kjx commented 5 years ago

It's not a question of allowing inference. A dialect can infer whatever it likes, and check whatever it infers however it wants.

The question is (a) can the type inference affect the operational semantics, i.e. the runtime behaviour, and (b) does that mandate type checking/inference before programs can be run. I think we are sliding towards wanting (a) (which is a significant complication to the language), but most techniques for achieving (a) also mandate (b). I don't know if anyone has looked at some kind of local type inference semantics that could still run globally incorrect programs. If they haven't, I think that's an interesting bit of research, and I have some ideas about how to do it.

KimBruce commented 5 years ago

I'm beginning to think that we should make sure all of these expressions can take explicit type parameters, and leave it up to dialects to determine which ones can be inferred.

The at(_)ifAbsent(_), for example, can take a type parameter for the type returned from the block parameter (though I suspect most of the time it will just be the type of the elements in the collection, and in many cases it will just raise an exception).

In the base (non-statically typed) version of the language type parameters can always be deleted, so it shouldn't muck up programs there. I can imagine a base static-typing dialect that would require all type annotations and another that does some simple inference.

It does mean though that we need to admit syntax like

["one", "two", "three", "four"]⟦String⟧

apblack commented 5 years ago

I'm beginning to think that we should make sure all of these expressions can take explicit type parameters, and leave it up to dialects to determine which ones can be inferred.

The problem with this idea is that the dialect is now changing the program — replacing the implicit ⟦Unknown⟧ arguments by implicit ⟦inferred⟧ arguments. This will change the behaviour of running programs, something that we have so far agreed that dialects should not do. Having a dialect take out a check that it has proved redundant is one thing; this is quite another.

kjx commented 5 years ago

I fear once again we are getting confused by the difference between static and dynamic type checking.

The current spec says that omitted type parameters become Unknown (at runtime).

A static checker could do whatever it liked to infer tighter types and report static errors based on those types, provided that it does not change the program to bind those parameters at runtime (or otherwise make runtime behaviour depend on the inferred types). (I don't think so-called "sound" gradual typing schemes have to do this, but I think they could; monotonic gradual typing can narrow types at runtime.)

So a static checker could certainly raise a type error in the code below, on the grounds that the sequence constructor really has static type Sequence⟦String⟧:

def listOfNumber : List⟦Number⟧ = list ["a", "b", "c"] 

But our principles say it shouldn't change the code that gets run, because that would cause this (evil) code to break:

def listOfStuff : List⟦Unknown⟧ = list ["a", "b", "c"] 
listOfStuff.add(0)  //assumes listOfStuff is List⟦Object⟧ or List⟦Unknown⟧

The inference rule above would make the listOfStuff object a list of Strings, as if you'd written this:

def listOfStuff : List⟦Unknown⟧ = list⟦String⟧ ["a", "b", "c"] 
listOfStuff.add(0)  //assumes listOfStuff is List⟦Object⟧ or List⟦Unknown⟧

This will fail a dynamic type test at runtime, but I think must always pass a static type checker.
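The difference in runtime behaviour can be sketched with a hypothetical checked collection (TypeScript for illustration; `CheckedList` is invented for this sketch and stands in for a list that dynamically checks its type argument):

```typescript
// CheckedList models a Grace list whose element-type argument is
// enforced by a dynamic check on add. Binding the type parameter by
// inference (list⟦String⟧ instead of list⟦Unknown⟧) visibly changes
// runtime behaviour, which is the point of the comment above.
class CheckedList<T> {
  private items: T[];
  private check: (x: unknown) => boolean;

  constructor(check: (x: unknown) => boolean, elts: readonly T[]) {
    this.items = [];
    this.check = check;
    for (const e of elts) this.add(e);
  }

  add(x: T): void {
    if (!this.check(x)) throw new TypeError("element fails dynamic type check");
    this.items.push(x);
  }

  get length(): number { return this.items.length; }
}

// list⟦Unknown⟧: the check admits anything, so the "evil" add succeeds
const listOfStuff = new CheckedList<unknown>(_ => true, ["a", "b", "c"]);
listOfStuff.add(0); // fine: Unknown admits anything

// list⟦String⟧, as the inference rule would implicitly produce:
const listOfString = new CheckedList<string>(x => typeof x === "string", ["a", "b", "c"]);
// listOfString.add(0) would now throw at runtime, though the same
// source text passed the static checker before inference kicked in
```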