eclipse-archived / ceylon

The Ceylon compiler, language module, and command line tools
http://ceylon-lang.org
Apache License 2.0
396 stars 62 forks source link

Possible optimisation of switch cases #4410

Open CeylonMigrationBot opened 9 years ago

CeylonMigrationBot commented 9 years ago

[@renatoathaydes] The following switch statement seems to trigger an unnecessary analysis of the reified type parameter of Hi.

shared class Hi<out Arg>(shared Arg greeting) {}
shared class Bye(shared String word) {}
shared class NonFinal(shared String word) {}

shared void run() {
    // make items contain many his and a few byes to
   // stop the JVM from optimising away stuff at runtime
    value items = [ for (i in 1..100k) if (10.divides(i))
                    then Bye(i.string) else Hi(NonFinal(i.string)) ];

    value startTime = system.milliseconds;

    variable Integer byes = 0;
    variable Integer his = 0;
    for (item in items) {
        switch(item)
        case (is Bye) {
            byes++;
        }
        case (is Hi<NonFinal>) {
            his++;
        }
    }

    value totalTime = system.milliseconds - startTime;
    print("Found ``his`` his and ``byes`` byes in ``totalTime`` ms.");

}

In my machine, this prints (3 runs):

Found 90000 his and 10000 byes in 526 ms.
Found 90000 his and 10000 byes in 447 ms.
Found 90000 his and 10000 byes in 466 ms.

We can optimise the switch statement trivially to run much faster:

switch(item)
case (is Bye) {
    byes++;
}
//case (is Hi<NonFinal>) {
else {
    his++;
}

This prints:

Found 90000 his and 10000 byes in 34 ms.
Found 90000 his and 10000 byes in 34 ms.
Found 90000 his and 10000 byes in 35 ms.

This seems to be a trivial optimisation opportunity for the compiler to make? Semantically, the code remains the same but the speed improvement is enormous.

[Migrated from ceylon/ceylon-spec#1304]

CeylonMigrationBot commented 9 years ago

[@jvasileff] See also #2054#issuecomment-73943371

Basically, statically determine if an instanceof check would be sufficient before emitting isReified.

CeylonMigrationBot commented 9 years ago

[@tombentley] This issue was motivated by this discussion

CeylonMigrationBot commented 9 years ago

[@jvasileff] Worth noting: every case where isReified can be eliminated or optimized to instanceof also improves interop w/generic Java classes where isReified usually throws.

jvasileff commented 8 years ago

I just finished some related work on the Dart backend that would help here. For Dart, the idea is to warn when reified tests are necessary but not available. For Java & JS, it's just the opposite, don't perform a full reified type check if not necessary.

Using the following definitions:

the steps are:

  1. Calculate RawIsType. This is a recursive calculation, accounting for qualifying types, erased types like Identifiable, and type parameters, and preserving all unions and intersections. Covariant type parameters are assigned the union of the parameter's upper bound constraints. Contravariant type parameters are assigned Nothing. Invariant type parameters are assigned the same as covariant parameters, but with use-site covariance.
  2. Calculate ExpectedType and OptimizedType per their definitions.
  3. Determine if OptimizedType <: ExpectedType. If so, implement the is test using only instanceof and logical operators.

Note: types involving captured type parameters of functions (i.e. any class or interface with a generic function in its ancestry) will likely need to be excluded from the optimization.

with the results:

After compiling a few sdk modules on the Dart & JVM backends and looking the cases where the Dart compiler did not warn for is tests but the JVM compiler did produce isReified calls, the following inefficiencies were found:

Location is test in current code suitable "raw" replacement
ceylon.collection::ArrayList.copyTo is Array<Element?> Array<out Anything>
ceylon.collection::ArrayList.copyTo is ArrayList<Element> ArrayList<out Anything>
ceylon.html::StyledElement is String[] Anything[]
ceylon.json.stream::StreamingVisitor is String->JsonValue Anything->Anything
ceylon.logging::Logger.ceylon is String() Anything(*Nothing)
ceylon.time::DateRange.gap is [Date,Date] [Anything,Anything*]
ceylon.time::DateTimeRange.gap is [DateTime,DateTime] [Anything,Anything*]
ceylon.time::TimeRange.gap is [Time,Time] [Anything,Anything*]

And of course, similar for the original example in the issue description since Hi<> and Bye are disjoint.

jvasileff commented 8 years ago

p.s. I think this issue/optimization should be tagged for the backends, not the typechecker.

RossTate commented 8 years ago

Your RawIsType is broken when bounds are recursively defined. But if y'all have adopted material-shape separation (I don't know what the status of that is), then you can just use the material bounds, which are defined to be non-recursive. Note that those bounds, too, might be refinable this way. I think that should still terminate though. Regardless, in my survey I found that these are rarely used, at least in Java, so you may just want to skip this process entirely.

Lastly, you can simplify this check to just TestedType & RawIsType <: IsType.

Note that if that check fails, you can recursively do a similar process for the problematic type arguments. This way you're only checking the type arguments that actually matter.

jvasileff commented 8 years ago

Your RawIsType is broken when bounds are recursively defined

Yes, thanks, I should have made a note of that. In fact, I had hacked in an Anything and a TODO in my code.

Regardless, in my survey I found that these are rarely used

@RossTate I though you might be interested, I added a println to search for this in the language and sdk modules I'm compiling, and found 9 cases. They are all doing the same thing.

Enumerable has a self type, and a Range is an efficient range of Enumerables. Ranges are compared by making sure they are the same type (the 9 cases), and then comparing their endpoints or offsets.

Lastly, you can simplify this check to just TestedType & RawIsType <: IsType

Ah, yes, that's better. Thanks!

jvasileff commented 8 years ago

I should have noted, this doesn't apply to types that might capture a type parameter that is not represented by a qualifying type.

For example, class C:

void fun<T>() {
    class C() {}
}

or even class E:

class D<T>() {
    void fun() {
        class E() {}
    }
}
sadmac7000 commented 8 years ago

I would also add that an assert(is...) should always just use isinstance if it has to.