bazelbuild / rules_kotlin

Bazel rules for Kotlin
Apache License 2.0

Cold start times for KotlinCompile/KotlinKapt workers #1179

Open smocherla-brex opened 3 weeks ago

smocherla-brex commented 3 weeks ago

I observed that cold starts of the persistent workers can contribute to slow builds. Here is an example with Kapt (using --define kt_timings=1):

|   * kapt (org.mapstruct.ap.MappingProcessor, io.micronaut.annotation.processing.TypeElementVisitorProcessor, io.micronaut.annotation.processing.BeanDefinitionInjectProcessor, io.micronaut.annotation.processing.AggregatingTypeElementVisitorProcessor, io.micronaut.annotation.processing.ServiceDescriptionProcessor, io.micronaut.annotation.processing.PackageConfigurationInjectProcessor): 10294 ms
|   * compile classes: 65 ms
|     * kotlinc: 0 ms
|     * creating KAPT generated Java source jar: 4 ms
|     * creating KAPT generated Kotlin stubs jar: 15 ms
|     * creating KAPT generated stub class jar: 45 ms

and with a warm persistent worker

|   * kapt (org.mapstruct.ap.MappingProcessor, io.micronaut.annotation.processing.TypeElementVisitorProcessor, io.micronaut.annotation.processing.BeanDefinitionInjectProcessor, io.micronaut.annotation.processing.AggregatingTypeElementVisitorProcessor, io.micronaut.annotation.processing.ServiceDescriptionProcessor, io.micronaut.annotation.processing.PackageConfigurationInjectProcessor): 4751 ms
|   * compile classes: 47 ms
|     * kotlinc: 0 ms
|     * creating KAPT generated Java source jar: 0 ms
|     * creating KAPT generated Kotlin stubs jar: 10 ms
|     * creating KAPT generated stub class jar: 37 ms

That is roughly half the time. Here is a similar example with KotlinCompile:

cold start

Task timings for //path/to/target:foo (total: 16958 ms):
|   * expand sources: 3 ms
|   * compile classes: 16940 ms
|     * kotlinc: 16831 ms

and warm start

Task timings for //path/to/target:foo (total: 8492 ms):
|   * expand sources: 1 ms
|   * compile classes: 8491 ms
|     * kotlinc: 8420 ms
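To put numbers on "around half the time", here is a minimal Python sketch that computes the cold-start overhead from the timings quoted above. The dictionary labels and the `cold_start_overhead` helper are illustrative names, not part of rules_kotlin or Bazel.

```python
# Timings copied from the kt_timings output quoted in this issue (milliseconds).
# The labels are illustrative; only the numbers come from the thread.
TIMINGS_MS = {
    "KotlinKapt (kapt step)": {"cold": 10294, "warm": 4751},
    "KotlinCompile (total)": {"cold": 16958, "warm": 8492},
}


def cold_start_overhead(cold_ms: int, warm_ms: int) -> tuple[int, float]:
    """Return (extra milliseconds, cold/warm ratio) for one action."""
    return cold_ms - warm_ms, cold_ms / warm_ms


for action, t in TIMINGS_MS.items():
    extra_ms, ratio = cold_start_overhead(t["cold"], t["warm"])
    print(f"{action}: +{extra_ms} ms on a cold worker ({ratio:.2f}x the warm time)")
```

For these two samples the script reports +5543 ms (2.17x) for the kapt step and +8466 ms (2.00x) for KotlinCompile. These are single measurements, so they only indicate the rough size of the effect.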

I guess multiplex workers would help with this problem (if I remember correctly, they were enabled but then reverted due to some corruption issues?). I was also wondering whether using GraalVM native image binaries for these workers instead of raw Java binaries would help. Bazel's rules_java recently adopted this in https://github.com/bazelbuild/rules_java/pull/151 with some benefits. We're planning to try GraalVM binaries for the compiler workers, and I was wondering whether that's a direction you'd be open to accepting (if it actually improves cold start).

Bencodes commented 3 weeks ago

@smocherla-brex Bazel is able to persist your workers between builds, which removes the overhead of spinning them up for each build or even each action. Are you by chance killing your workers after each build?

> I guess multiplex workers would help with this problem (If I remember it was enabled but reverted due to some corruption issues?).

rules_kotlin still supports multiplex workers, and you can enable them for your project to see if they work. They won't solve the slow startup times reported here, since Bazel still has to spin up a worker, but they may reduce the overall cost somewhat, since Bazel doesn't have to spin up as many workers.

Multiplex workers are gated by this flag: https://github.com/bazelbuild/rules_kotlin/blob/7dcb7f94f3f367110d75a3ea4464ae4e4cbbf8f0/kotlin/internal/toolchains.bzl#L215-L218
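As a rough sketch of what opting in could look like, the BUILD fragment below defines a custom toolchain with multiplex workers turned on. It assumes that the linked lines correspond to the `experimental_multiplex_workers` toolchain attribute and that the rules are available under the repository name `@rules_kotlin` (older WORKSPACE setups often use `@io_bazel_rules_kotlin`). The target name `kotlin_toolchain` is hypothetical.

```python
# BUILD.bazel (Starlark, which uses Python syntax). Sketch only.
# Assumption: the rules are loaded under the repository name "rules_kotlin".
load("@rules_kotlin//kotlin:core.bzl", "define_kt_toolchain")

define_kt_toolchain(
    name = "kotlin_toolchain",  # hypothetical target name
    experimental_multiplex_workers = True,  # opt in to multiplex workers
)
```

The toolchain would then need to be registered, for example with `register_toolchains("//:kotlin_toolchain")` in WORKSPACE or MODULE.bazel, before builds pick it up.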

> I was also wondering if using GraalVM native image binaries for these workers instead of raw Java binaries would help. Recently Bazel/rules_java has adopted it https://github.com/bazelbuild/rules_java/pull/151 with some benefits. We're looking to try with GraalVM binaries for the compiler workers but was wondering if that's a direction you'd be open to accepting (if it actually brings the benefits with cold start).

I haven't tried building rules_kotlin with GraalVM yet, but we did test it internally against some other internal rules. GraalVM isn't a free drop-in replacement for Java, and from what we found it does take some work to get things compiling with it. Getting the entire Kotlin compiler, along with KSP and Kapt, to compile with it seems like a pretty challenging task.

There are also some other features that we'd like to test out like incremental Kotlin compilation (similar to Gradle) which is something we can only realistically do if we have a persistent worker that can define the shared disk cache for the Kotlin compiler.

We'd definitely be open to evaluating a GraalVM solution if you are able to get things compiling and executing.

smocherla-brex commented 3 weeks ago

> Bazel is able to persist your workers between builds negating the performance overhead of having to spin them up for each build or even each action. Are you by chance killing your workers after each build?

Sadly, we do have to kill the workers in CI (with --worker_quit_after_build and --worker_max_instances) because of https://github.com/bazelbuild/bazel/issues/12165, and we've observed that we frequently run into OOMs because the workers use an enormous amount of memory. This has less of an effect in local development, where we don't set --worker_quit_after_build, but the memory usage problem persists. We haven't yet upgraded to Bazel 7, which has some flags for worker GC management, but if we do enable them, I would expect the cold start times to be more problematic there.

> rules_kotlin still supports multiplex workers and you can enable them for your project to see if they work. They won't solve the slow startup times being reported here since Bazel is still having to spin up a worker, but it may reduce them some since Bazel isn't having to spin up as many workers.

Thanks for the pointer - somehow missed it, I will try this out and check.

> I haven't tried building rules_kotlin with GraalVM yet but we did test it out internally against some other internal rules. GraalVM isn't a free drop in replacement for Java and from what we found it does take some work to get things compiling with it. Getting the entire Kotlin compiler along with KSP and Kapt seems like a pretty challenging task.

> There are also some other features that we'd like to test out like incremental Kotlin compilation (similar to Gradle) which is something we can only realistically do if we have a persistent worker that can define the shared disk cache for the Kotlin compiler.

I was looking at the Kotlin builder code to see how we could plug in incremental compilation, but managing a cache that Bazel is unaware of seemed like a bigger effort (I'm mostly concerned about the non-reproducibility issues we could run into, but I'm also not a Kotlin expert :)). Good to know it's on the radar, though.

> We'd definitely be open to evaluating a GraalVM solution if you are able to get things compiling and executing.

Good to know, and thanks for sharing your experience with rules_graalvm. I'll run some experiments on our end and will follow up with an update if I can get it compiling.

smocherla-brex commented 2 weeks ago

I do notice this even for local builds, where we don't set --worker_quit_after_build. After adding --worker_verbose, it seems that Bazel's own garbage collection of workers cleans up workers during a build with many actions, without us doing it ourselves:

INFO: Destroying KotlinCompile worker (id 11)
INFO: Destroying KotlinCompile worker (id 1)
INFO: Destroying KotlinKapt worker (id 6)
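To see how often this happens over a whole build, the log can be tallied per worker type. The Python sketch below assumes the `--worker_verbose` lines keep the exact shape quoted above. The regular expression and the `count_destroyed_workers` helper are illustrative, not part of Bazel.

```python
import re
from collections import Counter

# Sample lines copied from the --worker_verbose output quoted in this comment.
LOG = """\
INFO: Destroying KotlinCompile worker (id 11)
INFO: Destroying KotlinCompile worker (id 1)
INFO: Destroying KotlinKapt worker (id 6)
"""

# Assumption: destroy events always look like "Destroying <Mnemonic> worker (id <n>)".
DESTROY_RE = re.compile(r"Destroying (\w+) worker \(id (\d+)\)")


def count_destroyed_workers(log_text: str) -> Counter:
    """Count destroyed workers per mnemonic (e.g. KotlinCompile, KotlinKapt)."""
    return Counter(m.group(1) for m in DESTROY_RE.finditer(log_text))


counts = count_destroyed_workers(LOG)
for mnemonic, n in counts.items():
    print(f"{mnemonic}: {n} worker(s) destroyed")
```

Each destroyed worker that is needed again later in the same build pays the cold-start cost once more, so a high count for a mnemonic hints at where the time goes.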

corbinrsmith commented 2 weeks ago

kotlinc has been an interesting performer with remote execution -- as a persistent worker it's slightly slower than javac.

But, well, the first compilation is slooooooow until the first GC pass. I suspect it leans very heavily on the JIT. We've found that too much RAM can be as problematic as too little, and it's happiest with 2-3 CPUs. I haven't had as much time as I'd like to try and tune it, of course. I suggest tuning the GC/allocations for the worker until you see improvement.

K2 will, of course, change the whole game.