CodeQL execution is very slow

dbrezhniev commented 1 month ago

Hi! We've recently adopted CodeQL into our system and noticed very slow analysis for one of our codebases, which consists of Java and Kotlin. For comparison:

Regular build takes 20-30 minutes.
CodeQL analysis with autobuild mode takes 4 hours on average.

To be frank, our codebase is quite large, but I didn't expect this action to take 8x longer than the build itself. Can it be sped up somehow? Let me know if you need more info.

Workflow file for reference:

name: "CodeQL"
on:
...
jobs:
...
  analyze-java:
    name: Analyze java-kotlin
    container:
      image: XXXX
      credentials:
        username: XXXX
        password: XXXX
    steps:
    - name: Checkout repository
      uses: actions/checkout@v4
    - name: Initialize CodeQL
      uses: github/codeql-action/init@v3
      with:
        languages: java-kotlin
        build-mode: autobuild
    - name: Perform CodeQL Analysis
      uses: github/codeql-action/analyze@v3
      with:
        category: "/language:java-kotlin"

aeisenberg commented 1 month ago

Yes. We have recently rolled out buildless analysis for Java. It is likely that your analysis will run significantly faster with this enabled.

There are caveats, though. For example, if your project does a lot of code generation, then results will be worse (since we can't analyze the generated code). If buildless isn't the right choice for you, I would recommend using a larger runner.

aibaars commented 1 month ago

@dbrezhniev Apart from the normal build, CodeQL "compiles" all source files into a special-purpose database and runs queries against that database. As a result, the code is "compiled" twice during a CodeQL run with autobuild. In addition, running the queries takes time too, and overall a CodeQL run typically takes about 3 times as long as a normal build. A 4 hour analysis time is indeed much higher than what I would have expected.

It would be good to know which phase of the CodeQL run is taking so long. Is it the "autobuild" step, the "database import/finalize" step, or the "analyze/query run" step? If the "autobuild" step is taking very long, then @aeisenberg's suggestion would be a good thing to try. If the other steps are slow, then most likely increasing CPU and RAM could help. If you are using GitHub Actions, then switching to a large runner should do the trick. Otherwise, you can also try setting the CODEQL_RAM and CODEQL_THREADS environment variables or the --ram and --threads CLI flags. It can also be that only a few queries in the "analyze" step are slow while the others run fast. In that case let us know; perhaps we can work with you to improve the performance of those queries.
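For runs outside GitHub Actions, the environment variables mentioned above could be set along these lines. This is only a minimal sketch: the 2 GB headroom rule and the fallback values are illustrative assumptions rather than CodeQL recommendations, and it assumes CODEQL_RAM takes a value in MB, like the --ram flag.

```shell
#!/usr/bin/env bash
# Sketch: size CodeQL's thread and memory budget from the machine it runs on.
# The headroom rule and fallback values below are assumptions for illustration.

# CPU cores visible to this process (fall back to 1 if nproc is unavailable).
cores=$(nproc 2>/dev/null || echo 1)

# Total memory in MB from /proc/meminfo (Linux); fall back to 4096 elsewhere.
total_mb=$(awk '/^MemTotal:/ { printf "%d", $2 / 1024 }' /proc/meminfo 2>/dev/null)
total_mb=${total_mb:-4096}

# Leave 2 GB for the OS on larger machines; use half the memory on small ones.
if [ "$total_mb" -gt 4096 ]; then
  ram_mb=$(( total_mb - 2048 ))
else
  ram_mb=$(( total_mb / 2 ))
fi

# Export the variables so a CodeQL process started from this shell inherits them.
export CODEQL_THREADS="$cores"
export CODEQL_RAM="$ram_mb"
echo "CODEQL_THREADS=$CODEQL_THREADS CODEQL_RAM=$CODEQL_RAM"
```

The same numbers could instead be passed as --threads and --ram when invoking the CLI directly. Inside a container, /proc/meminfo may report the host's memory rather than the container's limit, so a fixed value may be the safer choice there.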

dbrezhniev commented 1 month ago

Thanks for the fast response! Running CodeQL in build mode none does indeed speed up the execution, but as the majority of our code is in Kotlin, this method does not suit us. Difference in the summaries of none and autobuild modes:

None: CodeQL scanned 629 out of 630 Java files in this invocation.
Autobuild: CodeQL scanned 4727 out of 4875 Kotlin files and 629 out of 630 Java files in this invocation.

If buildless isn't the right choice for you, I would recommend using a larger runner.

This is something we tried already, but it didn't help. There is no time reduction with a larger size. In an attempt to debug it, we noticed that the runner is underutilized: only 1 of 8 cores is used and more than half of the memory is free. What is also interesting is that 3 of 4 hours are spent on a single compileKotlin task during the analysis step:

  [2024-07-19 06:25:02] [build-stdout] [2024-07-19 06:25:02] [autobuild] > Task :aa:aa:testClasses UP-TO-DATE
  [2024-07-19 09:34:14] [build-stdout] [2024-07-19 09:34:14] [autobuild] > Task :bb:bb:compileKotlin
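The gap between these two log lines can be checked with a few lines of shell. This is a sketch that assumes GNU date (the -d option); elapsed_between is a hypothetical helper name, and both timestamps are treated as UTC so that only their difference matters.

```shell
#!/usr/bin/env bash
# Sketch: elapsed time between two timestamps copied from the build log.
# Assumes GNU date; BSD/macOS date uses different flags for parsing.
elapsed_between() {
  local start end secs
  start=$(date -u -d "$1" +%s)   # first timestamp as seconds since the epoch
  end=$(date -u -d "$2" +%s)     # second timestamp as seconds since the epoch
  secs=$(( end - start ))
  printf '%dh %02dm %02ds\n' $(( secs / 3600 )) $(( secs % 3600 / 60 )) $(( secs % 60 ))
}

# The two timestamps from the log excerpt above.
elapsed_between "2024-07-19 06:25:02" "2024-07-19 09:34:14"
# prints: 3h 09m 12s
```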

Let me know if you need more info!

aibaars commented 1 month ago

You're right that build-mode: none is not good for your case, since it does not support Kotlin at the moment and your code base is mostly Kotlin.

What is also interesting is that 3 of 4 hours are spent on a single compileKotlin task during the analysis step:

That is very interesting indeed. I'll ask the Kotlin team to have a look; there may be a performance issue in CodeQL's extractor for Kotlin code. Would you be able to make a performance profile of the compileKotlin task? I don't have much experience with Java profilers, so I can't give you much help with that. Another option would be some simple stack trace dumps using jstack, to get an idea of what the compileKotlin task is doing or stuck at.
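One way to collect those jstack dumps is sketched below. The take_dumps function, its arguments, and the output file names are hypothetical; it assumes a JDK with jstack on the PATH and that you have already found the PID of the JVM doing the Kotlin compilation (for example with jps -l, which lists running JVMs and their main classes).

```shell
#!/usr/bin/env bash
# Sketch: take several thread dumps of one JVM, a fixed interval apart.
# Usage: take_dumps PID COUNT INTERVAL_SECONDS OUT_DIR
take_dumps() {
  local pid=$1 count=$2 interval=$3 out_dir=$4 i
  mkdir -p "$out_dir"
  for i in $(seq 1 "$count"); do
    # jstack prints the stack trace of every thread in the target JVM.
    jstack "$pid" > "$out_dir/dump-$i.txt" 2>&1 \
      || echo "jstack failed for pid $pid (dump $i)" >&2
    sleep "$interval"
  done
}

# Example (not run here): 5 dumps, 60 seconds apart, for PID 12345.
#   take_dumps 12345 5 60 ./jstack-dumps
```

If the same stack frames show up in every dump, that is usually a good hint about where the time is going.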

corneliusroemer commented 4 days ago

CodeQL failed completely due to OOM until I used this build command with extra JVM memory settings (don't ask me which of the 2 memory settings works; I sprinkled them in out of desperation, but this eventually worked):

export JAVA_OPTS="-Xmx4096m"
./gradlew --no-daemon -Dorg.gradle.jvmargs=-Xmx1g build --info --stacktrace -x ktlintCheck -x test

https://github.com/loculus-project/loculus/pull/2705

github / codeql-action

CodeQL execution is very slow #2378