scala / scala-dev

Scala 2 team issues. Not for user-facing bugs or directly actionable user-facing improvements. For build/test/infra and for longer-term planning and idea tracking. Our bug tracker is at https://github.com/scala/bug/issues
Apache License 2.0
130 stars 14 forks source link

Restore 2.11 style desugaring of non-top-level module accessor and module lzycompute. #225

Closed retronym closed 8 years ago

retronym commented 8 years ago

My theory is that the shapeless slowdown is because of the change to inline the synchronized { if (moduleVar eq null) moduleVar = new NestedModule }; moduleVar into the module accessor method, rather than before when this was in a separate method.

Concretely, trait lazy vals are mixed into subclasses with the needed synchronization logic in place, as do lazy vals in classes and methods. Similarly, modules are initialized using double checked locking. Since the code to initialize a module is short, we do not emit compute methods for modules (anymore).

For simplicity, local lazy vals do not get a compute method either.

2.11.8:

scala> :javap -c -private scala.reflect.internal.SymbolTable#NoPrefix
  public scala.reflect.internal.Types$NoPrefix$ NoPrefix();
    Code:
       0: aload_0
       1: getfield      #3094               // Field NoPrefix$module:Lscala/reflect/internal/Types$NoPrefix$;
       4: ifnonnull     14
       7: aload_0
       8: invokespecial #3100               // Method NoPrefix$lzycompute:()Lscala/reflect/internal/Types$NoPrefix$;
      11: goto          18
      14: aload_0
      15: getfield      #3094               // Field NoPrefix$module:Lscala/reflect/internal/Types$NoPrefix$;
      18: areturn
  public scala.reflect.api.Types$TypeApi NoPrefix();
    Code:
       0: aload_0
       1: invokevirtual #5560               // Method NoPrefix:()Lscala/reflect/internal/Types$NoPrefix$;
       4: areturn

scala> :javap -c -private scala.reflect.internal.SymbolTable#NoPrefix$lzycompute
  private scala.reflect.internal.Types$NoPrefix$ NoPrefix$lzycompute();
    Code:
       0: aload_0
       1: dup
       2: astore_1
       3: monitorenter
       4: aload_0
       5: getfield      #3094               // Field NoPrefix$module:Lscala/reflect/internal/Types$NoPrefix$;
       8: ifnonnull     23
      11: aload_0
      12: new           #3096               // class scala/reflect/internal/Types$NoPrefix$
      15: dup
      16: aload_0
      17: invokespecial #3097               // Method scala/reflect/internal/Types$NoPrefix$."<init>":(Lscala/reflect/internal/SymbolTable;)V
      20: putfield      #3094               // Field NoPrefix$module:Lscala/reflect/internal/Types$NoPrefix$;
      23: getstatic     #737                // Field scala/runtime/BoxedUnit.UNIT:Lscala/runtime/BoxedUnit;
      26: pop
      27: aload_0
      28: monitorexit
      29: aload_0
      30: getfield      #3094               // Field NoPrefix$module:Lscala/reflect/internal/Types$NoPrefix$;
      33: areturn
      34: aload_1
      35: monitorexit
      36: athrow
    Exception table:
       from    to  target type
           4    29    34   any

2.12.0-RC1:

scala> :javap -c -private scala.reflect.internal.SymbolTable#NoPrefix
  public scala.reflect.internal.Types$NoPrefix$ NoPrefix();
    Code:
       0: aload_0
       1: getfield      #2615               // Field NoPrefix$module:Lscala/reflect/internal/Types$NoPrefix$;
       4: ifnonnull     36
       7: aload_0
       8: monitorenter
       9: aload_0
      10: getfield      #2615               // Field NoPrefix$module:Lscala/reflect/internal/Types$NoPrefix$;
      13: ifnonnull     28
      16: aload_0
      17: new           #867                // class scala/reflect/internal/Types$NoPrefix$
      20: dup
      21: aload_0
      22: invokespecial #2616               // Method scala/reflect/internal/Types$NoPrefix$."<init>":(Lscala/reflect/internal/SymbolTable;)V
      25: putfield      #2615               // Field NoPrefix$module:Lscala/reflect/internal/Types$NoPrefix$;
      28: aload_0
      29: monitorexit
      30: goto          36
      33: aload_0
      34: monitorexit
      35: athrow
      36: aload_0
      37: getfield      #2615               // Field NoPrefix$module:Lscala/reflect/internal/Types$NoPrefix$;
      40: areturn

This has probably interfered with HotSpot's inlining.

image

I was comparing 2.11.8 and 2.12.0-RC1++ performance on the shapeless test suite, file by file. This was the worst regression, probably due to the relatively high amount of subtype checks that are performed in the implicit search.

⚡ time ((scalac -J-XX:+UnlockCommercialFeatures -J-XX:+UnlockDiagnosticVMOptions -J-XX:+DebugNonSafepoints -J-XX:+FlightRecorder -J-XX:FlightRecorderOptions=defaultrecording=true,dumponexit=true,stackdepth=1024,loglevel=debug,settings=profile,dumponexitpath=/tmp/old.jfr -Xlog-implicits @args-small.txt /Users/jz/code/shapeless/core/src/test/scala/shapeless/hlist.scala 2>&1) > /tmp/hlist-old.log)

real    0m27.659s
user    1m33.629s
sys 0m2.012s

⚡ time ((/code/scala/build/pack/bin/scalac -J-XX:+UnlockCommercialFeatures -J-XX:+UnlockDiagnosticVMOptions -J-XX:+DebugNonSafepoints -J-XX:+FlightRecorder -J-XX:FlightRecorderOptions=defaultrecording=true,dumponexit=true,stackdepth=1024,loglevel=debug,settings=profile,dumponexitpath=/tmp/new.jfr -Xlog-implicits @args-2.12.0-RC-small.txt /Users/jz/code/shapeless/core/src/test/scala/shapeless/hlist.scala 2>&1) > /tmp/hlist-new.log)

real    1m27.506s
user    2m37.283s
sys 0m2.292s

The actual output of -Xlog-implicits was quite similar, so it seems a similar amount of work was being done.

retronym commented 8 years ago

/cc @milessabin

retronym commented 8 years ago

Here's a quick attempt to restore the lzycompute method for module init:

https://github.com/retronym/scala/tree/ticket/SD-225

Bootstrapping the compiler through this change scala-2.12.0-2f20ab6-SNAPSHOT.zip, we get much closer to the 2.11.8 cold performance for hlist.scala.

⚡ (cd /code/shapeless; time ((/code/scala/build/pack/bin/scalac -J-XX:+UnlockCommercialFeatures -J-XX:+UnlockDiagnosticVMOptions -J-XX:+DebugNonSafepoints -J-XX:+FlightRecorder -J-XX:FlightRecorderOptions=defaultrecording=true,dumponexit=true,stackdepth=1024,loglevel=debug,settings=profile,dumponexitpath=/tmp/newer.jfr -Xlog-implicits @args-2.12.0-RC-small.txt /Users/jz/code/shapeless/core/src/test/scala/shapeless/hlist.scala 2>&1) > /tmp/hlist-newer.log))

real    0m29.768s
user    1m35.219s
sys 0m1.946s
retronym commented 8 years ago

TODO

adriaanm commented 8 years ago

Wow! Excellent detective work. Was not expecting that one.. Will be more careful and benchmark this kind of stuff next time.

SethTisue commented 8 years ago

over at https://issues.scala-lang.org/browse/SI-9917 @cheeseng is reporting that between M5 and RC1, the time to compile ScalaTest increased more than 2x. so that's another codebase to test this fix on

cheeseng commented 8 years ago

@adriaanm @retronym @SethTisue @bvenners just want to report the latest finding, using Scala 2.12.0-RC2, it seems the performance issue is still there, here's the result of sbt test:compile using our 3.1.x branch:

2.11.8 - 602 seconds 2.12.0-RC2 - 1520 seconds

The difference probably is too much for us to swallow. :)

adriaanm commented 8 years ago

What happens when you compile under -Xsource:2.11? I'm thinking it may be val signature inference. On Wed, Oct 19, 2016 at 02:27 Chua Chee Seng notifications@github.com wrote:

@adriaanm https://github.com/adriaanm @retronym https://github.com/retronym @SethTisue https://github.com/SethTisue @bvenners https://github.com/bvenners just want to report the latest finding, using Scala 2.12.0-RC2, it seems the performance issue is still there, here's the result of sbt test:compile using our 3.1.x branch:

2.11.8 - 602 seconds 2.12.0-RC2 - 1520 seconds

The difference probably is too much for us to swallow. :)

— You are receiving this because you were mentioned.

Reply to this email directly, view it on GitHub https://github.com/scala/scala-dev/issues/225#issuecomment-254761474, or mute the thread https://github.com/notifications/unsubscribe-auth/AAFjy4CalHQhGNoM86vMGyCtzaq1GuZyks5q1eKCgaJpZM4J7mxb .

adriaanm commented 8 years ago

In any case, I'm investigating this today! Opened a new ticket: https://github.com/scala/scala-dev/issues/246