crystal-lang / crystal

The Crystal Programming Language
https://crystal-lang.org
Apache License 2.0
19.42k stars 1.62k forks source link

Union types with Generics causes type-checking bugs #9797

Open jgaskins opened 4 years ago

jgaskins commented 4 years ago

This issue seems like it might overlap with #3803, but I'm not sure because this one isn't about inheritance. Although, I'm not sure if the type system considers union types to be a form of inheritance.

With the following code:

alias MyValue = Int32 | String | Array(Int32 | String)

def foo(bar : MyValue)
  bar.is_a? MyValue
end

foo(12) # => true
foo([12]) # => false

It returns true for the Int32 input but false for the Array(Int32) input. But if it were not that type, that line of code could not run. Through some experimentation, it seems to be the fact that it's a generic. Swapping Array with my own generic class does the same thing.

alias MyValue = Int32 | String | Thing(Int32 | String)

class Thing(T)
end

def foo(bar : MyValue)
  bar.is_a? MyValue
end

foo(12)
foo(Thing(Int32).new)

I run into this a lot and usually I get around it with arrays by using array.map(&.as(InnerType)).as(OuterType). I'm generally not too worried about copying them to satisfy the type checker because they're usually pretty small. But for the use case I'm looking at right now, these arrays can be arbitrarily large so I'd prefer to avoid that if possible.

I tried using unsafe_as, assuming that the ABI would align correctly since the value is passed in as the type it's expecting, and well … my compliments for that very accurate method name (it crashed the app with SIGILL). :joy:

asterite commented 4 years ago

An array of T is not an array of T | U. If that were the case you would be able to put a U inside the first array.

You need to do [1] of Int32 | String

I don't think this is a bug?

jgaskins commented 4 years ago

Maybe calling it a bug was a bit presumptuous but it is behavior I wouldn’t expect.

Your explanation makes perfect sense and I’m completely on board with it except that the value matches the method signature. I feel like if an Array(Int32) isn’t an Array(Int32 | String), then an Array(Int32) wouldn’t be a valid argument to the method. That’s the part that I feel like I’m misunderstanding the most.

I think I also assumed that matching the type signature of the method indicated that the compile-time type of bar in this example would be MyValue, but it seems to retain its original compile-time type from outside the method according to typeof. This seems like it’d facilitate duck typing, though, so I don’t think I’m quite as confused by this one, but both of these together made it a little difficult for me to grok, even as someone who’s used Crystal for 3-4 years.

On the bright side, this is only something I’ve ever run into when building libraries like protocol implementations that can take many different types of data, including collections of those types like arrays and hashes. I rarely (if ever?) need to deal with this in application code.

asterite commented 4 years ago

Ah, yeah, that makes sense. The reason is that type restrictions work differently than is_a?. I'm aware of this but I don't know how to fix it. The type system is a bit of a mess right now. I hope other core team members can formally define and change the semantics of the language

asterite commented 4 years ago

This is essentially defining how covariance and contravariance work.

So far, I think generics inheritance, covariance, and recursive aliases are the main things that are really messy in the language.

jgaskins commented 4 years ago

They seem like hard problems, to be sure. I never thought about types the way Crystal does them before I tried Crystal. My main statically typed languages before this were Java, C, and C++, which don't exactly have advanced type systems. 🙂 So even though they're hard problems to solve and the implementations are probably pretty wacky in some places, I love that you've built these ideas into the language because it has absolutely changed my perspective about static types. It's honestly the first time I've found a static language fun! 😃