Open gavinking opened 11 years ago
Didn't @FroMage already have a pretty well thought-out design? Or do you have some issues with it?
Eh? A well-thought-out design of what?
First, static
should be the default. Reification is a constraint, just like given
. So have a reify
annotation instead.
Second, I'm not really sure what your optimization is doing.
Third, I still haven't been given the various use cases y'all have in mind for reification, so it's hard to offer suggestions if I don't know why you need it.
A question I was wondering about is symmetry of reified types and equality. Suppose the following:
class Foo<out T>(T... values){
shared actual Boolean equals(Object other){
if(is Foo<T> other){
return values == other.values;
}
return false;
}
}
class Top(){}
class Bottom() extends Top(){}
value b = Bottom();
value foo1 = Foo<Top>(b);
value foo2 = Foo<Bottom>(b);
assert(foo1 == foo2); // true, because foo2 of type Foo<Bottom> is also a Foo<Top>
assert(foo2 == foo1); // false, because foo1 of type Foo<Top> is not a Foo<Bottom>
Am I missing something here or is that going to be problematic?
@FroMage Right. Note that my implementations of equals()
for List
, Map
, Set
carefully ignore the type arguments.
OK.
Further on the idea of "optimized" reification.
A critical issue we need to think through is whether there are soundness issues associated with letting a runtime type argument be a subtype the type argument you would get from purely static analysis. The cases I'm thinking of involve contravariance. Imagine we tried to optimize away T
in the following code:
class Consumer<in T>(T[] seq) {}
Object[] words = ["hello", "world"];
Consumer<Object> = Consumer(words); //oops, unsound!
Here, if T
is String
instead of Object
at runtime (since at runtime words
is a String[]
not just an Object[]
), an instance of Consumer<String>
would be assigned to Consumer<Object>
, which is unsound. The question is whether all cases like this can be detected via local static analysis, and the optimization disabled. I suspect that the answer to this is "yes", since we would only enable the optimization for type parameters annotated out
. (Yes, due to a recent change I'm now allowed to annotate a type parameter of a method in
or out
!) In this example, since Consumer<T>
is contravariant in T
the optimization would not occur. OTOH, in the following code, the optimization would be allowed, and would be sound:
class Producer<out T>(T[] seq) {}
Object[] words = ["hello", "world"];
Producer<Object> = Producer(words); //ok
@RossTate WDYT?
By the way, the more I think on this issue, the less cases I can think of where reification is really likely to have a very negative impact on performance.
Tuple
and Entry
, AFAIK. (We know we definitely need to optimize these two.) Any generic collection type generally has much more internal state than just one or two type arguments.map()
or sort()
) do something expensive like iterating a collection, or are subsequently used when iterating a collection (e.g. byIncreasing()
or plus()
), so it's hard to see how passing a type argument is likely to be very costly. But there are exceptions like curry()
and compose()
.Indeed the main case I'm concerned about (except for the abovementioned Tuple
and Entry
) is when a concrete non-generic type inherits a member of a generic type. For example, String
inherits map()
. We need to be sure that in cases like this we don't wind up reifying Element
to Character
every time map()
is called. @FroMage what is your feeling on this issue? Is it a major problem?
Finally, how do reified type impact comprehensions? Anything to worry about there?
Gavin, your example is not unsound. You're confusing static types and exact types. Reification does not mean these all line up. Reification just means you have a way of determining the type arguments used to allocate an instance of a generic class (and the derived type arguments for inherited classes and interfaces) and the type arguments used to invoke a generic method.
From what I understand, there are two main expenses for reification: the space needed per instance of a generic class to store the type arguments (only a problem for certain backends such as Java and JS but not C#) and the allocation done at run time to build the data structures representing type arguments (problematic for all backends).
Regarding your concerns with map
, it should be using the reified information inside this
to determine that type argument at run time.
The only concern I can see for comprehensions is that they use inference and we need to make sure inference doesn't affect the semantics; an issue we have mentioned elsewhere.
Gavin, your example is not unsound.
Ross, it's unsound if I do the proposed optimization on it.
You're confusing static types and exact types.
Nope, it's more the optimization that "confuses" them ;-)
Sorry, I misunderstood what you were saying and consequently what you were misunderstanding, heheh. Here's my new take.
Note that the following still type checks:
class Consumer<in T>(T[] seq) {}
String[] words = ["hello", "world"];
Consumer<Object> = Consumer(words);
Since Consumer
is contravariant, Consumer(words)
will choose the least precise type argument possible given the constraints. In this case the only constraint is String <: T
, so it will infer Anything
for T
. This is good, cuz Consumer<Anything>
is the principal type of Consumer(words)
, and in particular can be assigned to Consumer<Object>
due to contravariance and Object <: Anything
.
@RossTate In the following code:
class Consumer<in T>(T[] t) {}
String[] strings = ["hello", "world"];
value consumer = Consumer(strings); //we get a Consumer<String>
The inferred type of consumer
is Consumer<String>
, not Consumer<Object>
. It's kind of a silly example, since contravariant types don't usually accept covariant types in their constructors. So to see that this is the correct behavior, consider this example:
class Consumer<in T>(T[] t) {}
DelegateConsumer<String> delegate = DelegateConsumer<String>();
value consumer = Consumer(delegate); //we get a Consumer<String>
I'm only bringing this up to demonstrate that the type arg inference algorithm does the right thing here, because I do not want this thread to go off on a tangent about type arg inference and contravariant types. In M5 the algorithm does the right thing and I'm completely satisfied with it.
All that has nothing to do with what we're discussing here, which is a different example:
class Consumer<in T>(T[] t) {}
Object[] strings = ["hello", "world"]; //NOTE: static type Object[], runtime type String[]
value consumer = Consumer(strings);
Here, static analysis infers the type Consumer<Object>
, but if we were to optimize away the passing of Object
to T
, and try to infer the type argument T
at runtime, based on the runtime type of t
, we would actually get a Consumer<String>
, which is not a Consumer<Object>
!
So obviously we wouldn't do this optimization in this case., because it would result in unsoundness. What we need to understand is in which cases is it ok to optimize away the static argument to a type parameter, and infer it at runtime from the runtime type of an argument?
For example, I believe it is sound to do it for tuples, that is, if I have the following:
Object x = "hello";
Object y = 1;
value tup = [x,y];
Then the static type of tup
is [Object,Object]
, but its runtime type would be [String,Integer]
, if we make this optimization.
But I want to go further than that and do the same sort of optimization here, for example:
function prepend<out Head,out Tail>(Head head, Tail tail)
given Tail satisfies Object[]
=> Tuple(head, tail);
I want to optimize away Head
and Tail
, and "infer" them at runtime from the runtimes types of head
and tail
. So if I do:
Object head = "hello";
[Object,Object] tail = [1, 1.0];
value result = prepend(head, tail);
I will get a result
tuple with static type [Object,Object,Object]
but runtime type [String,Integer,Float]
.
It's only OK to do this because the function prepend()
is covariant in its parameters Head
and Tail
. We can't do this optimization for contravariant types of functions because it results in unsoundness as above. But is that the only case which would be unsound?
After chatting with Ross, instead of talking past each other, as above, it's now clear what the cause of the unsoundness is.
Ross points out that we should not:
at least not at runtime, since that is the cause of the unsoundness above.
Arguably - indeed, Ross argues this, though I'm not completely sold - we should not even do that at compile time.
Anyway, this completely explains what's going on here, and gives us what we need to be able to solve the problem.
We need to be sure that in cases like this we don't wind up reifying Element to Character every time map() is called. @FroMage what is your feeling on this issue? Is it a major problem?
We don't, because we optimise the type descriptor for Ceylon non-parameterised types into static final
constants, in this case Character.$TypeDescriptor
, but map
takes another type param Result
which may or may not be instantiated by the caller, depending on whether it's constant or not.
@FroMage Right, of course.
Let's delay full consideration of this issue until 1.1, which is when we're going to focus on performance.
There are three big problems to overcome in our implementation of fully-reified types:
I think we're going to need some way to address this via either optimized or partial reification of type arguments. Let me sketch out the options here:
Partial reification:
We could annotate a type parameter as
static
like this:Then there would be limitations on what you can do with
T
. For example, you would not be able to use it withis
, like this:Nor would you be allowed to pass it as an argument to a reified type parameter:
Of course, all type parameters of Java classes would be considered
static
.The problem with this approach is that I think it would just get in the way much too much. For example, if tuple types were unreified, you would not be able to put instances of parameterized Java classes in them. It's hard to be able to guess in advance just how much pain this approach would cause, because we simply don't have enough experience with this stuff.
"Optimized" reification
In this approach, we let you provide an optimized implementation of the operation which obtains the reified type of an instance.
What we would probably do is say that for every subclass of Object which does not define an attribute named
type
, an attribute with the following signature is generated for it:class C()
, the attributeshared actual Class<C> type
class C<T>()
, the attributeshared actual Class<C<T>> type
class C<U,V>()
, the attributeshared actual Class<C<U,V>> type
The implementation of the generated attribute would include generated fields holding the reified type arguments.
OTOH, a class would be free to define its own
type
attribute. For example:Thus, the instance is responsible for computing its own reified type, and is free to make use of its internal state to do that.
Unfortunately this solution doesn't address the issue of Java classes, which don't have
getType()
methods. I guess we would be left with some nasty runtime exception or some shit.