detekt / detekt

Static code analysis for Kotlin
https://detekt.dev
Apache License 2.0
6.09k stars 755 forks source link

Run detekt's Gradle tasks with Gradle's Worker API #4128

Closed 3flex closed 1 year ago

3flex commented 2 years ago

Run Detekt task, DetektGenerateConfigTask and DetektCreateBaselineTask with Gradle's Worker API. This uses process isolation mode which will allow using the toolchains feature in future (#4120). Performance in general seems acceptable though I haven't done proper benchmarks.

Two issues:

  1. When used on this project with an included build I see very poor performance on the first run. I don't see this when including the build in another project. Looking forward to others trying this on their projects to see the performance impact.
  2. On subsequent runs when used on this project with an included build I often see a CNFE which I think is due to a Gradle bug (I don't believe Gradle is restarting the Worker daemons when classpath file contents change). It's intermittent and can be resolved either by running the task a few times and having it complete successfully, or restarting the Gradle daemon.

Neither of these issues will impact users, but contributors will face this. I see issue 2 as being more annoying. It's similar to https://github.com/detekt/detekt/issues/2957 but can happen when changes are made in any subproject, not just core modules.

Edit: I've raised issue 2 at https://github.com/gradle/gradle/issues/13678#issuecomment-927495654. I'll open a new issue in a few days if there's no reply.

codecov[bot] commented 2 years ago

Codecov Report

Merging #4128 (daad0f0) into main (2290ec8) will decrease coverage by 0.01%. The diff coverage is n/a.

@@             Coverage Diff              @@
##               main    #4128      +/-   ##
============================================
- Coverage     84.61%   84.60%   -0.01%     
- Complexity     3787     3790       +3     
============================================
  Files           546      546              
  Lines         12918    12918              
  Branches       2268     2268              
============================================
- Hits          10930    10929       -1     
+ Misses          862      861       -1     
- Partials       1126     1128       +2     
Impacted Files Coverage Δ
.../detekt/rules/naming/ConstructorParameterNaming.kt 88.88% <0.00%> (-5.56%) :arrow_down:
...turbosch/detekt/rules/style/ForbiddenAnnotation.kt 90.24% <0.00%> (+2.43%) :arrow_up:

Help us with your feedback. Take ten seconds to tell us how you rate us. Have a feature suggestion? Share it here.

3flex commented 2 years ago

Java 8 on Windows kept failing as the build runner didn't have enough memory.

This is somewhat expected - Worker API starts more daemons than before, but moving the detekt classloader out of the Gradle daemon reduces the memory requirements for the Gradle daemon itself.

I've therefore removed the JVM args config that increased Gradle and TestKit memory to compensate which should keep the build within the runner's memory constraints (7GB total).

chao2zhang commented 2 years ago

With the current implementation in a multi-module project, we will have worker processes up to org.gradle.workers.max.

// Before
➜  kotlin-android-template git:(main) ✗ jps
86912 
89369 Jps
86365 
// After
➜  kotlin-android-template git:(main) ✗ jps
86912 
89472 KotlinCompileDaemon
89428 KotlinCompileDaemon
89415 GradleDaemon
89497 GradleWorkerMain
89499 GradleWorkerMain
89498 GradleWorkerMain
89533 Jps
86365 

It appears to me that these additional number of gradle workers processes will impact developers using detekt.

3flex commented 2 years ago

Is the concern about memory usage?

From https://docs.gradle.org/current/userguide/worker_api.html#creating_a_worker_daemon:

These worker daemon processes will persist across builds and can be reused during subsequent builds. If system resources get low, however, Gradle will stop any unused worker daemons.

So we rely on Gradle managing its resources appropriately.

BraisGabin commented 2 years ago

I runned this scenarios using gradle-profiler gradle-profiler --benchmark --scenario-file ../detekt.scenarios:

detekt-parallel {
  tasks = ["detektMain", "detektTest"]
  gradle-args = ["--no-build-cache", "--parallel"]
  cleanup-tasks = ["cleanDetekt"]
}

detekt-no-parallel {
  tasks = ["detektMain", "detektTest"]
  gradle-args = ["--no-build-cache", "--no-parallel"]
  cleanup-tasks = ["cleanDetekt"]
}

plain-detekt-parallel {
  tasks = ["detekt"]
  gradle-args = ["--no-build-cache", "--parallel"]
  cleanup-tasks = ["cleanDetekt"]
}

plain-detekt-no-parallel {
  tasks = ["detekt"]
  gradle-args = ["--no-build-cache", "--no-parallel"]
  cleanup-tasks = ["cleanDetekt"]
}

I needed to add this task to invalidate the output of all the detekt tasks and avoid the UP-TO-DATE.

task<Exec>("cleanDetekt") {
    commandLine("bash", "-c", "rm -r **/build/reports/detekt || true")
}

The raw data is here:

A summary:

detekt-no-parallel detekt-parallel plain-detekt-no-parallel plain-detekt-parallel
main 69161.3 37724.3 24227.2 12441.4
worker 74114.2 43063.4 24426 13669.8
comparation greater (7%) greater (14%) same greater (10%)

No idea why plain-detekt-no-parallel maintains the same but it seems that this change makes detekt slower.

This doesn't mean "no-merge". I just wanted to know the impact of this change and show it.

3flex commented 2 years ago

Thanks for doing some benchmarking!

it seems that this change makes detekt slower

As expected. The question is, is the impact acceptable, given we can now use toolchains, as well as reduce memory pressure in the Gradle daemon since work is now done in separate processes? It's a tradeoff. In my opinion yes, it's still worth it, but let's wait for further feedback before merging.

3flex commented 2 years ago

Given the small size of this diff, could it be possible to ship this as an opt-in (or an opt-out)?

That's reasonable. What do you think about a Gradle property called detekt.use.worker.api? This follows the convention for the same setting for kapt (kapt.use.worker.api).

It would be an opt in first, then can upgrade to opt out later, then potentially have it as the only execution mode in future.

cortinico commented 2 years ago

It would be an opt in first, then can upgrade to opt out later, then potentially have it as the only execution mode in future.

Agree on this approach 👍

eygraber commented 2 years ago

I've had...not the most fun experience with the worker API for things like this (as a consumer). Specifically Android Lint started using it, and each worker was allowed to grow to 6gb (which was the Xmx I specified for the daemon). So Gradle might clean them when memory is needed, but until then all of my RAM was allocated (which also slowed down the machine so that in a lot of cases Gradle couldn't run fast enough to actually kill the processes to free the memory).

I think it's a good idea to do this in theory, but in the real world there are a lot of implications.

3flex commented 2 years ago

Just wondering if you tried tweaking org.gradle.workers.max at all?

There are also options for tweaking max memory only for the detekt tasks run using worker API which should help mitigate.

A permanent option for opt in/out might be better than removing the existing setup wholesale.

eygraber commented 2 years ago

I did, but that affected parallelism overall.

Lint had options to limit the worker memory, which worked, but was trial and error to get right on a per project basis. They eventually switched to run in process by default.

github-actions[bot] commented 2 years ago

This PR is stale because it has been open 90 days with no activity. Please comment or this will be closed in 7 days.

3flex commented 2 years ago

I still plan to complete this.

3flex commented 1 year ago

Thoughts on merging, as an experimental feature? Perhaps for 1.22. It's behind a flag, so it won't impact anyone's build.

This PR does require a min Gradle version bump to 6.2, though Kotlin 1.7.0 requires min Gradle 6.7.1 now, so we might consider increasing to that version as the minimum anyway. Gradle 6.7 also introduces Java toolchain support which we can support once this PR is merged.

BraisGabin commented 1 year ago

Which is our current min gradle version? But, if this is behind a flag AND it only requires the bump if you active it I vote to merge it inside 1.21.0 to get feedback from the users.

If that's not the case I vote to move it to 1.22.0.

cortinico commented 1 year ago

Which is our current min gradle version?

It's 6.1. I'm fine with bumping the min Gradle version to anything on 6.x

3flex commented 1 year ago

Let's wait for 1.22.0 when the min Gradle version is bumped in #4964

BraisGabin commented 1 year ago

We can merge this now, right?

3flex commented 1 year ago

I'd still wait for #4964 but otherwise yes, this should be ready to go.

schalkms commented 1 year ago

4964 is done. Is this PR still planned to be merged?

I'd still wait for https://github.com/detekt/detekt/pull/4964 but otherwise yes, this should be ready to go.

3flex commented 1 year ago

I'll fix the merge conflict soon, then anyone is welcome to have a last look & merge :)

arturbosch commented 1 year ago

Hi, I was triggered by Gradle closing two linked issues regarding the worker api which I linked in my worker api experiment from 2020 https://github.com/detekt/detekt/pull/2896. Looking over the changes in this PR, I do not quite understand how the classloader caching problem is solved? Wouldn't the code create a new classloader on each detekt invocation - meaning 1-2 seconds overhead for each gradle module/worker thread?

3flex commented 1 year ago

I do not quite understand how the classloader caching problem is solved?

It's not.

Wouldn't the code create a new classloader on each detekt invocation

It should only do so if there's no available Gradle worker, and number of current workers < org.gradle.workers.max.

Workers API performance should fall somewhere between the current (fastest) setup, and the JavaExec (slowest) version. Workers can be reused, so yes, each worker suffers the initial performance hit as it's spun up, but that only happens for new instances and not ones that are reused.

The big advantage of workers is increased parallelism in the build, as tasks using workers can run in parallel within a Gradle project (so e.g. detektMain and detektTest can run in parallel within the same project). Tasks that don't use workers can only run serially within a project. This can speed up the build if the machine has enough resources available.

The other important note is that detekt's use of Worker API is opt in and behind a flag so it can be treated as an experiment for now so this will not change the existing behaviour or use of the existing classloader cache.

BraisGabin commented 1 year ago

I moved this PR to 1.23.0. It's a big change to introduce it after two RC

vercel[bot] commented 1 year ago

The latest updates on your projects. Learn more about Vercel for Git ↗︎

Name Status Preview Updated
detekt 🔄 Building (Inspect) Jan 6, 2023 at 8:52AM (UTC)
3flex commented 1 year ago

LGTM?