scala / scala3

The Scala 3 compiler, also known as Dotty.
https://dotty.epfl.ch
Apache License 2.0
5.83k stars 1.05k forks source link

~5x compilation slowdown compared to 2.13 on ~400 files #20217

Closed keynmol closed 1 week ago

keynmol commented 5 months ago

I have created this repository by removing most of the dependencies and plugins: https://github.com/keynmol/smithy4s-aws-only

It contains ~428 Scala files.

You can switch Scala version using environment variable:

$ SCALA_VERSION="2.13.13" sbt -Dsbt.task.timings=true 'clean;compile' 2>&1 | grep "root / Compile / compileIncremental"
[info]   root / Compile / compileIncremental                           : 28178 ms

$ SCALA_VERSION="3.3.3" sbt -Dsbt.task.timings=true 'clean;compile' 2>&1 | grep "root / Compile / compileIncremental"
 [info]   root / Compile / compileIncremental                           : 146483 ms

$ SCALA_VERSION="3.4.1" sbt -Dsbt.task.timings=true 'clean;compile' 2>&1 | grep "root / Compile / compileIncremental"
[info]   root / Compile / compileIncremental                           : 158857 ms

I'm observing consistent 5x slowdown compared to 2.13 - the code doesn't change (no codegenerators).

Minimising it further would be difficult, as this difference is most pronounced on this relatively large amount of files.

WojciechMazur commented 5 months ago

The slowdown seems to be similar to the one observed in https://github.com/scala/scala3/issues/20120, maybe these 2 are related

lrytz commented 5 months ago

Here are -Yprofile-trace files from 2 and 3 compilers (I did publishLocal on https://github.com/scala/scala3/pull/19897).

traces.zip

It seems about typing the implicit val schema (there are 500 of them), each takes around 240 ms on Scala 3 but only a 2 ms on Scala 2.

Gedochao commented 3 months ago

@noti0na1 let's check how bad it is at this point, since we did a number of improvements since this was raised (https://github.com/scala/scala3/pull/20399 and https://github.com/scala/scala3/pull/20523, and some others).

keynmol commented 3 months ago

I don't think it made a difference unfortunately:

~/p/k/smithy4s-aws-only (main) > SCALA_VERSION="2.13.13" sbt -Dsbt.task.timings=true 'clean;compile' 2>&1 | grep "root / Compile / compileIncremental"
[info]   root / Compile / compileIncremental                           : 14466 ms

~/p/k/smithy4s-aws-only (main) > SCALA_VERSION="3.3.3" sbt -Dsbt.task.timings=true 'clean;compile' 2>&1 | grep "root / Compile / compileIncremental"
[info]   root / Compile / compileIncremental                           : 65765 ms

~/p/k/smithy4s-aws-only (main) > SCALA_VERSION="3.5.0-RC2" sbt -Dsbt.task.timings=true 'clean;compile' 2>&1 | grep "root / Compile / compileIncremental"
[info]   root / Compile / compileIncremental                           : 63716 ms

~/p/k/smithy4s-aws-only (main) > SCALA_VERSION="3.6.0-RC1-bin-20240703-75a15c2-NIGHTLY" sbt -Dsbt.task.timings=true 'clean;compile' 2>&1 | grep "root / Compile / compileIncremental"
[info]   root / Compile / compileIncremental                           : 64124 ms
noti0na1 commented 2 months ago

Due to the differences in the implicit search approach, the compile time in Scala 3 is expected to differ from Scala 2. If we could create a minimal example with one implicit definition that clearly takes longer to compile, we can then identify potential improvements.

CC @Gedochao

Baccata commented 2 months ago

The fact that the repro is large is deceptive : the code shared by @keynmol is a repetition of the exact same implicit pattern across a great many files/classes. Running a profiler (like yourkit) on the compilation of that project should allow for precisely pinpointing possible areas of improvement, and profiling the compiler would likely be more difficult on a minified repro that would contain a single occurrence of the implicit.

noti0na1 commented 2 months ago

The core issue is: during resolveOverloaded, the alts are widened in many places and multiple times. However, the alts are provisional TermRefs in this case, so the result of denot is not cached. When an alt has denotation union with a large number of alternatives, then computation is expensive and producing the same result over and over again.

I tried to cache some result of alt.widen inside resolveOverloaded (which is probably not the correct way), and the compiling time of the example project can be reduced to Scala 2 level.

cc @odersky

odersky commented 2 months ago

If it helps, maybe we should do what you tried: cache in resolveOverloaded.

keynmol commented 1 week ago

Should I see an improvement in 3.6.0-RC1-bin-20240922-22ed2fb-NIGHTLY (from yesterday?) given PR was merged 5 days ago.

~/p/i/smithy4s-aws-only (main) > SCALA_VERSION="3.6.0-RC1-bin-20240922-22ed2fb-NIGHTLY" sbt -Dsbt.task.timings=true 'clean;compile' 2>&1 | grep "root / Compile / compileIncremental"
[info]   root / Compile / compileIncremental                           : 66566 ms

~/p/i/smithy4s-aws-only (main) > SCALA_VERSION="3.6.0-RC1-bin-20240922-22ed2fb-NIGHTLY" sbt -Dsbt.task.timings=true 'clean;compile' 2>&1 | grep "root / Compile / compileIncremental"
[info]   root / Compile / compileIncremental                           : 62118 ms
noti0na1 commented 1 week ago

Hi @keynmol , can you test the nightly versions before and after #21584 again? I do observe some improvement on my laptop.

21278 would improve more than #21584, but it is too complicated and may introduce some overhead in other cases.

We will think about other way to improve in the future.

keynmol commented 1 week ago

I don't see much difference on my laptop:

~/p/i/smithy4s-aws-only (main) > SCALA_VERSION="3.6.0-RC1-bin-20240922-22ed2fb-NIGHTLY" sbt -Dsbt.task.timings=true 'clean;compile' 2>&1 | grep "root / Compile / compileIncremental"
[info]   root / Compile / compileIncremental                           : 61573 ms
~/p/i/smithy4s-aws-only (main) > SCALA_VERSION="3.6.0-RC1-bin-20240919-ba9e59a-NIGHTLY" sbt -Dsbt.task.timings=true 'clean;compile' 2>&1 | grep "root / Compile / compileIncremental"
[info]   root / Compile / compileIncremental                           : 62749 ms

But here's something fascinating - on Github's ubuntu runners the difference between 2.13 and Scala 3 is "only" 2x!

https://github.com/keynmol/smithy4s-aws-only/actions/runs/10997520261

CleanShot 2024-09-23 at 16 33 27@2x

I have no idea why my Apple M2 laptop is affected by this so much...

soronpo commented 1 week ago

I have no idea why my Apple M2 laptop is affected by this so much...

Maybe the CI matrix should add at least M1 explicitly? image

I don't know if the self-hosted runners include M1 or M2.