crystal-lang / crystal

The Crystal Programming Language
https://crystal-lang.org
Apache License 2.0

Covariance and contravariance #3803

Open asterite opened 7 years ago

asterite commented 7 years ago

Given:

class Foo
end

class Bar < Foo
end

And an array of Bar:

array = Array(Bar).new

We can ask the question: is array an Array(Foo)?

Two answers are possible:

  1. Yes: if we read from array we get a Bar, which is a Foo
  2. No: if we want to put an element into array, we can't put a Foo, only Bar and its subclasses

Right now the compiler will answer "Yes" for type restrictions:

def foo(x : Array(Foo))
end

foo(Array(Bar).new) # OK

But it will answer "No" for is_a?:

Array(Bar).new.is_a?(Array(Foo)) # => false

This is not intuitive and should be fixed somehow.

This also affects union types, aliases and recursive aliases.
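The mismatch can be reproduced in one small program; takes_foo_array is just an illustrative name, not anything from the issue:

```crystal
class Foo
end

class Bar < Foo
end

def takes_foo_array(x : Array(Foo))
  "matched the Array(Foo) restriction"
end

array = Array(Bar).new

# The type restriction answers "Yes":
puts takes_foo_array(array) # => matched the Array(Foo) restriction

# But is_a? answers "No":
puts array.is_a?(Array(Foo)) # => false
puts array.is_a?(Array(Bar)) # => true
```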

mverzilli commented 7 years ago

is_a? being false sounds counterintuitive, but maybe it's right and coherent with Crystal.

class Foo 
end

class Bar < Foo
end

class Baz
  def zzz_array(something : Array(Foo))
    puts something
  end

  def insert_sth(something : Array(Foo))
    something.push(Foo.new)
  end
end

puts Array(Bar).new.is_a?(Array(Foo)) # returns "false"

array = [Bar.new, Bar.new] of Bar

puts Baz.new.zzz_array(array) # works perfectly fine as expected
puts Baz.new.insert_sth(array) # including this line causes the program to fail compilation.

Crystal always intended to mimic the development experience of duck typing while ensuring safety, and the current behavior honors that philosophy. Maybe we already have the best tradeoff for Crystal and just need to be careful in the way we teach users why is_a? will be false.

asterite commented 7 years ago

@mverzilli You are totally right with the above :-)

It's like Crystal says "OK, I let you pass that Array(Bar) in this Array(Foo) restriction, as long as you use it well then I'm fine with that". Then if you only read from it, that's OK. But if you try to do something wrong, something that would break the type system (like putting a Bar inside an Array(Foo)) then you get a compile error.

Another scenario where this "I'll only yell at you when you make a real mistake" behavior shows up is this one:

def foo(x : Enumerable(Int32))
  x << 4
end

array = [1, 2, 3]
foo(array)
p array # => [1, 2, 3, 4]

Sure, an Array(Int32) is an Enumerable(Int32). And then, because you are actually passing an Array (not just any Enumerable), you can << to it. So there's nothing wrong or "unsafe" about the code above. Now, if you pass an Enumerable(Int32) that doesn't have <<, then you'll get an error. Maybe not very intuitive compared to other languages, but totally safe.

As a side note, I wouldn't write such a program (I would probably simply not use a type restriction)
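For contrast, here is a minimal sketch (append_four is a made-up name) showing the safe case next to a call that would be rejected:

```crystal
def append_four(x : Enumerable(Int32))
  x << 4
end

array = [1, 2, 3]
append_four(array) # fine: an Array has #<<
p array # => [1, 2, 3, 4]

# A Range is also an Enumerable(Int32), but it has no #<<, so this
# call would fail to compile with "undefined method '<<'":
# append_four(1..3)
```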

spalladino commented 7 years ago

I think we are missing a point regarding type restrictions: choosing a method overload. The current situation is assuming that all type parameters in generics are covariant:

class Base; end
class Derived < Base; end

def foo(arr : Array(Base))
  "Array(Base) overload called"
end

def foo(x)
  "No restriction overload called"
end

def bar(arr : Array(Derived))
  "Array(Derived) overload called"
end

def bar(x)
  "No restriction overload called"
end

derived_arr = [Derived.new]
base_arr = [Base.new]

foo(derived_arr) # => Array(Base) overload called
foo(base_arr) # => Array(Base) overload called

bar(derived_arr) # => Array(Derived) overload called
bar(base_arr) # => No restriction overload called

This might not make sense for some classes (C# has a nice list, including Comparers and Actions for instance), which should have contravariant type params.

I guess that besides the fact that Crystal will "only yell at you when you make a real mistake" (which might be due for another discussion, since although it's really useful when developing apps, it could be a problem when distributing libs), covariance and contravariance do affect dispatch resolution.
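To make the contravariant case concrete in Crystal terms, here is a sketch; Printer is a hypothetical consumer-only generic (T appears only in input position), not a stdlib type:

```crystal
class Base; end
class Derived < Base; end

# A consumer-only generic: it only takes T values in.
class Printer(T)
  def print(value : T) : String
    value.class.name
  end
end

def takes_derived_printer(printer : Printer(Derived))
  printer.print(Derived.new)
end

takes_derived_printer(Printer(Derived).new) # OK
# A Printer(Base) can print any Derived, so accepting it here would be
# type-safe (contravariance), yet today's covariant matching rejects it:
# takes_derived_printer(Printer(Base).new) # Error: no overload matches
```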

mverzilli commented 7 years ago

Those are fair points. I like C#'s approach of adding in and out modifiers to type variables, Kotlin does something similar: https://kotlinlang.org/docs/reference/generics.html.

If we go that way, one important question is what will be the default variance? If I don't provide an explicit annotation, is it covariant, contravariant or invariant?

Or maybe there's a 4th variance flavor particular to Crystal (let's call it duckvariant :P), which is in the line of "only yell at you when you make a real mistake", and that could be a sensible default.

spalladino commented 7 years ago

The deferred-yell should be a topic for another discussion, and be consistent across generic and non-generic types. Covariance and contravariance should be required for deciding which overload to pick, not just for throwing errors (or throwing them later).

As for the default variance, I'd go with invariant. My guess is that most generics come from the stdlib, where we can ensure that sensible variances are defined.

asterite commented 7 years ago

Implementing in and out isn't trivial in Crystal. In C# and Kotlin the compiler checks that an out type parameter only occurs in return types and an in one only in argument types. Since Crystal doesn't require type annotations, I wonder how the compiler would be able to check/enforce that...
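A sketch of the difficulty; Box is hypothetical, and the point is that its setter carries no annotation a declaration-site variance check could inspect:

```crystal
class Box(T)
  def initialize(@value : T)
  end

  # T in output position: fine for a hypothetical `out T`.
  def value
    @value
  end

  # No type annotation at all, yet v flows into @value, so T is used
  # in input position. A variance check would have to infer this
  # rather than read it off a signature.
  def value=(v)
    @value = v
  end
end

box = Box.new(1)
box.value = 2
p box.value # => 2
```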

bararchy commented 7 years ago

It also, for me at least, feels like moving away from the fun part of Crystal, where you don't have to explicitly define types most of the time.

Starting to write Int32 a = 5 would be going in the wrong direction.

david50407 commented 7 years ago

I have an idea from Java that combines a type variable with the type restriction:

(modified from @mverzilli's example)

class Foo; end
class Bar < Foo; end

class Baz
  def zzz_array(something : Array(T < Foo)) # notice: this declares a type T which inherits from Foo
    puts something
  end

  def insert_sth(something : Array(Foo))
    something.push(Foo.new)
  end
end

puts Array(Bar).new.is_a?(Array(Foo)) # false
puts Array(Bar).new.is_a?(Array(T < Foo)) # true, Array(Bar) matched Array(T < Foo)

array = [Bar.new, Bar.new] of Bar

puts Baz.new.zzz_array(array) # Pass, Array(Bar) matched Array(T < Foo)
puts Baz.new.insert_sth(array) # Failed, Array(Bar) is not a Array(Foo)

I don't know whether this is good or bad, and maybe T <= Foo could also be added to mean Foo or any subclass of Foo.

Or maybe just using Array(Foo+) would be better?

BTW, this doesn't avoid the problem discussed above, but I think it should be controlled by the developer, not the language
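As an aside, something close to this bound can be emulated today with a free variable plus a macro-time check. This is an assumed idiom (first_element is a made-up name, and it relies on free variables being visible to macro expressions), not a proposed feature:

```crystal
class Foo; end
class Bar < Foo; end

def first_element(arr : Array(T)) forall T
  # Compile-time bound check: reject instantiations where T is not a Foo.
  {% raise "T must inherit from Foo" unless T <= Foo %}
  arr.first?
end

p first_element([Bar.new, Bar.new]) # T = Bar, compiles
# first_element([1, 2]) # rejected at compile time by the macro check
```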

spalladino commented 7 years ago

@bararchy this only applies when you choose to define types: the question is how to decide which method overload to choose when parameters are generics. If you never specify types, then you don't have overloads, and this doesn't apply.

@david50407 personally I like the idea. I didn't know it existed in Java; I knew those annotations from C# (via the where keyword). But I'd apply that to generic functions, not the arguments in generic parameters, as it should be a generic type's responsibility to specify whether its args are co- or contravariant.

straight-shoota commented 7 years ago

Sorry for the duplicate. I didn't find this issue and did not think of covariance as a reason, nor did Gitter point to it. Therefore I believe this behavior is not widely known.

While it is really nice and saves unnecessary conversions, it is also very counterintuitive that type restrictions do not mean the same thing for variables and method signatures, and that type differentiation through overloading and through is_a?/case works differently.

My 2 cents to move this forward: I think some sort of notation to mark expected types as co- or contravariant would be great. @spalladino, what do you mean by "But I'd apply that to generic functions, not the arguments in generic parameters"? And it would be nice to have this working with is_a?, too.

spalladino commented 7 years ago

what do you mean with But I'd apply that to generic functions, not the arguments in generic parameters?

@straight-shoota sorry for the confusion. I meant that I'd use that restriction when defining a type or function, not when using it. Using the Comparer example from C#, the Comparer itself is defined as public interface IComparer<in T> (emph on the in), so you specify in the definition of the class/function that a specific generic argument is covariant or contravariant. You do not specify it when using the Comparer: in the same example, you do not create a new Comparer<T < Foo>.

ghost commented 7 years ago

Maybe just go with "always invariant". Java made arrays covariant because it had no generics at the time, and I think that is generally considered a flaw (one that was not repeated in Java's generics).

One drawback to this approach is that it leaves the possibility of a run-time error that a stricter type system could have caught at compile-time. Also, it hurts performance because each write into an array requires an additional run-time check.

With the addition of generics, Java and C# now offer ways to write this kind of polymorphic function without relying on covariance. The array comparison and shuffling functions can be given the parameterized types

EDIT to say this is somewhat a duplicate of what is written in the Kotlin doc already

HertzDevil commented 3 years ago

IMO we should first proceed with supporting either bounded restrictions (https://github.com/crystal-lang/crystal/issues/6997#issuecomment-450353013) or bounded free vars (https://github.com/crystal-lang/crystal/issues/6997#issuecomment-450363123), so that a transition path exists from non-strict overload resolution to the strict mode. Then later we could issue a warning whenever an argument matches a generic type arg restriction in the non-strict manner. Something like the following:

def foo(x : Array(Int32 | String))
end

# suppose this is added in 1.1
def bar(x : Array(T)) forall T <= Int32 | String
end

# 1.0: okay
# 1.1: okay
# 1.2: warning: non-strict generic type argument matched
foo([1])

# 1.1: okay
# 1.2: okay
bar([1])

But how should we proceed afterwards in accordance with 1.x's backward compatibility guarantees? When do we drop the non-strict semantics completely? My main concern is that as long as non-strict matching stays, other semantic aspects like overload ordering must be consistent with non-strict matching, so we can't do anything to them unless there is some kind of compile-time option to completely disable non-strict matching and/or turn the above warning into an error.

straight-shoota commented 3 years ago

Yeah, figuring out a decent migration path is going to be trouble. But I think we first need to figure out where we want to go, rather than how. It's definitely good to keep the how in mind to ensure it's practical. But IMO we're currently still in an early design stage, and we shouldn't worry too much about the final integration process.

erdnaxeli commented 3 years ago

Personally I want type restrictions to be enforced by the compiler independently of the method code and in a type safe way.

Type restrictions

def foo(x : Enumerable(Int32))
  x << 4
end

This code is just plain wrong. The type restriction here should disable any duck typing. I want an Enumerable(Int32), and this type does not implement <<, so it should not compile.

The compiler should complain the same way it complains about a non-exhaustive case over an abstract type. The same argument that "someone else might add a subclass" applies here: someone might actually give this method an Enumerable(Int32) that is not an Array, and they will get a confusing error about a missing method while the docs tell them the type is ok.

Generics

It's like Crystal says "OK, I let you pass that Array(Bar) in this Array(Foo) restriction, as long as you use it well then I'm fine with that".

I don't like that, for the same reason as above. I want the compiler to tell me whether my code is ok with the given type restriction, not whether it is ok with the actual implementation. Same argument as the exhaustive case again.

Generic types should be invariant. If a method expects an Array(Foo), giving it an Array(Bar) should not compile. It is surprising the first time you see it, but then it is easy to understand why it is wrong.

Maybe we could add this in and out thing to the language, but I personally think it adds too much complexity.

Inheritance

When overriding a method in a subclass, parameters must be contravariant and the return type must be covariant. That too may come as a surprise, but it is not so hard to understand.

class Foo
  def something
    puts "foo"
  end
end

class Bar < Foo
  def something
    puts "bar"
  end
end

class Parent
  def something(a : Bar) : Foo
    puts "parent"
    Foo.new
  end
end

class Child < Parent
  def something(a : Foo) : Bar
    puts "child"
    Bar.new
  end
end

def f(x : Parent)
  x.something(Bar.new)
end

f expects a Parent but I give it a Child:

This is already supported (edited; I originally thought it did not work as expected):

f(Parent.new)  # => parent, ok
f(Child.new)  # => child, ok

With the original version of this code, the compiler dispatched something to Parent even when the variable was actually a Child; I suspect it was because Bar is more specialized than Foo.

If we make Parent an abstract class with an abstract method, we see even better what is wrong. The compiler complains that:

Error: abstract def Parent#something(a : Bar) must be implemented by Child

But our implementation is actually correct, as it accepts Bar objects and returns (Bar objects that are) Foo objects.

straight-shoota commented 3 years ago

@erdnaxeli I don't follow the significance of the inheritance example for variance. Foo and Bar are completely unrelated types, so #something(a : Foo) and #something(a : Bar) are just overloads sharing the same method name. But neither is a restriction of the other.

HertzDevil commented 3 years ago

Now that I think about it, due to #8973 the following two defs will not be equivalent:

def foo(x : Array(Array))
end

def bar(x : Array(T)) forall T <= Array
end

foo([[1]]) # okay
bar([[1]]) # okay, T = Array(Int32)

foo([[1 || "a"]]) # okay
bar([[1 || "a"]]) # okay, T = Array(Int32 | String)

foo([[1] || ["a"]]) # okay
bar([[1] || ["a"]]) # okay, T = Array(Int32) | Array(String)

foo([[1]] || [["a"]]) # okay
bar([[1]] || [["a"]]) # error, `Array(T)` is never a union

straight-shoota commented 3 years ago

It seems a bit disproportionate that Array(Array) matches a union but Array(T) does not.

Maybe this (already currently observed) behaviour needs some refinement? It's not very coherent:

def bar(x : Array(T)) forall T
end

bar([[1]] || [["a"]]) # Error: no overload matches 'bar' with type (Array(Array(Int32)) | Array(Array(String)))
                      # Overloads are:
                      #  - bar(x : Array(T))
                      # Couldn't find overloads for these types:
                      #  - bar(x : Array(Array(String)))

The error message claims there's no overload bar(x : Array(Array(String))). But bar([["a"]]) compiles. So the error is incorrect, or at least imprecise.

erdnaxeli commented 3 years ago

@erdnaxeli I don't follow the significance of the inheritance example for variance. Foo and Bar are completely unrelated types, so #something(a : Foo) and #something(a : Bar) are just overloads sharing the same method name. But neither is a restriction of the other.

Argh. Obviously in my example Bar should be a subclass of Foo. With the code fixed, the dispatching is ok :)

Silentdoer commented 1 year ago

Mark