Open futureviperowner opened 1 year ago
Hi
Thanks for the report. Does it reproduce if you re-run it with debug logs enabled?
I have re-run the job numerous times today while debug logs are enabled without success in reproducing. I've enabled debug logs in the repo for now and will monitor for similar failures in the future.
I ran into this again today; this time it happened during a Java analysis. Debug logging was enabled for this execution, but I did not see anything useful about why execution hung until the timeout was reached. It logged that the zip file was being created, and then nothing else was logged until the eventual timeout 9 minutes later.
2023-03-27T18:20:56.2764900Z [command]/__t/CodeQL/2.12.5-20230317/x64/codeql/codeql database bundle /__w/_temp/codeql_databases/java --output=/__w/_temp/codeql_databases/java.zip --name=java
2023-03-27T18:20:57.2633395Z Creating bundle metadata for /__w/_temp/codeql_databases/java...
2023-03-27T18:20:57.4642192Z Creating zip file at /__w/_temp/codeql_databases/java.zip.
2023-03-27T18:29:55.4573827Z ##[debug]CODEQL_ACTION_VERSION='2.2.9'
2023-03-27T18:29:55.4574254Z ##[debug]CODEQL_ACTION_FEATURE_SARIF_COMBINE='true'
2023-03-27T18:29:55.4574646Z ##[debug]CODEQL_ACTION_FEATURE_WILL_UPLOAD='true'
2023-03-27T18:29:55.4575047Z ##[debug]CODEQL_ACTION_FEATURE_MULTI_LANGUAGE='false'
2023-03-27T18:29:55.4575426Z ##[debug]CODEQL_ACTION_FEATURE_SANDWICH='false'
2023-03-27T18:29:55.4575891Z ##[debug]CODEQL_UPLOAD_SARIF__LANGUAGE_JAVA__CODEQL='CODEQL_UPLOAD_SARIF__LANGUAGE_JAVA__CODEQL'
2023-03-27T18:29:55.4577417Z ##[debug]Set output db-locations = {"java":"/__w/_temp/codeql_databases/java"}
2023-03-27T18:29:55.4577883Z ##[debug]Set output sarif-id = 1da297f2-cccc-11ed-9d41-2afcb5d1b5c6
2023-03-27T18:29:55.4588334Z ##[error]The action has timed out.
There's still not much to go on here. It looks like the results were uploaded successfully in this latest run, but zipping of the docker container failed due to the timeout. Is the problem repeatable? Do you know if this is a large database? If this is not happening on all runs, how long does it take to zip the database normally?
It's likely that if you're running a Docker container inside of a standard GitHub runner, there aren't many resources left to actually do the work. Maybe it always takes a long time to zip, but most of the time you're just under the threshold for the timeout.
The issue seems to occur at random. The re-run of the job yesterday succeeded with the CodeQL analysis step taking 1m 2s. The overall job execution was 3m 9s. Is there a way to find the size of the database? All I can find in the logs is the size of the results upload (which is under 2 MB unzipped). Logs indicate that the database contains 24,707 lines of Java code. According to the timestamps, a successful ZIP file creation and upload took 3 seconds:
Mon, 27 Mar 2023 21:36:08 GMT ::group::Uploading results
Mon, 27 Mar 2023 21:36:08 GMT Uploading results
Mon, 27 Mar 2023 21:36:08 GMT Processing sarif files: ["/__w/exec-wms-platform-beacon-integrator/results/java.sarif"]
Mon, 27 Mar 2023 21:36:09 GMT ##[debug]Raw upload size: 1721090 bytes
Mon, 27 Mar 2023 21:36:09 GMT ##[debug]Base64 zipped upload size: 196496 bytes
Mon, 27 Mar 2023 21:36:09 GMT ##[debug]Number of results in upload: 3
Mon, 27 Mar 2023 21:36:09 GMT Uploading results
Mon, 27 Mar 2023 21:36:09 GMT ##[debug]response status: 202
Mon, 27 Mar 2023 21:36:09 GMT Successfully uploaded results
Mon, 27 Mar 2023 21:36:09 GMT ::endgroup::
Mon, 27 Mar 2023 21:36:09 GMT /__t/CodeQL/2.12.5-20230317/x64/codeql/codeql database bundle /__w/_temp/codeql_databases/java --output=/__w/_temp/codeql_databases/java.zip --name=java
Mon, 27 Mar 2023 21:36:10 GMT Creating bundle metadata for /__w/_temp/codeql_databases/java...
Mon, 27 Mar 2023 21:36:10 GMT Creating zip file at /__w/_temp/codeql_databases/java.zip.
Mon, 27 Mar 2023 21:36:13 GMT ##[debug]Successfully uploaded database for java
Mon, 27 Mar 2023 21:36:13 GMT ::group::Waiting for processing to finish
Mon, 27 Mar 2023 21:36:13 GMT Waiting for processing to finish
Mon, 27 Mar 2023 21:36:13 GMT Analysis upload status is complete.
Mon, 27 Mar 2023 21:36:13 GMT ::endgroup::
It is running on a standard GitHub runner.
If you run in debug mode, then the database will be uploaded as an artifact at the end of the analysis job (from the log messages, it looks like you're already doing that?). You can then download it later. However, 24,707 lines of Java is not large, so I'd be surprised if the database is large. The fact that you are running the analysis in a Docker container does add a layer of complexity. Is this a requirement for your Java code? It might help to build without Docker if you can.
Debug logging has been enabled in this repo since my initial report in hopes of catching more info around the timeout. This particular project contains both Java and C, which is why the build is being done inside of a docker container. The image is based on the deployment image with additional tooling added to support building the native code.
While I've only run into this timeout with this particular project, other members of my team have reported codeql timeouts in other projects that don't build inside of a docker container. However, I just found out about this today so I haven't been able to compare those failures to this one to see if it looks like the same issue.
You could try explicitly setting the timeout to a higher number. Based on the logs, however, it looks like zip time is either negligible or takes 5+ minutes, so perhaps something is getting stuck.
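For reference, a minimal sketch of where such a timeout would be set in a workflow. The job name, step name, runner label, and minute values are illustrative assumptions, not taken from the reporter's workflow; `timeout-minutes` itself is standard GitHub Actions syntax and can be set on a job or on an individual step.

```yaml
jobs:
  analyze:                      # hypothetical job name
    runs-on: ubuntu-latest      # assumed runner label
    timeout-minutes: 60         # job-level limit (illustrative value)
    steps:
      - name: Perform CodeQL Analysis   # hypothetical step name
        uses: github/codeql-action/analyze@v2
        timeout-minutes: 15     # step-level limit (illustrative value)
```

A step-level limit lower than the job-level one would make a hung step fail sooner instead of running until the job limit is reached.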
I'm not inclined to increase the job timeout as it doesn't appear as though there's a reasonable expectation that will resolve the issue (other than chew up more GH runner minutes).
Is there some additional logging that can be enabled to debug further? It looks like the action is executing a codeql binary at the time it gets hung up so I wasn't able to follow the code well enough to find this out on my own.
Yes, the action is calling codeql database bundle, which is a light wrapper around a command to zip the database directory. There's no additional logging available here.
I've asked the rest of the team for ideas. It is odd that a call to write to a zip file is causing a process to hang.
Some suggestions:
Add upload-database: false to the analyze step. This will prevent the database from being zipped and uploaded.
Thanks for sticking with me on this. 😃
Will skipping the database upload affect results or build checks in any way? I can't tell from the docs if this just affects the availability of a build artifact or if there's a functional impact.
Yes. I've attached an anonymized version.
codeql-scan-workflow.zip
No, I don't technically need to do this. However, I'm attempting to ensure that all builds, tests, and static analysis tools are using the same environment and build artifacts so that we don't introduce (or miss) issues that might be dependent on these things.
The GH runner downloads the image and prepares a fresh container each time it's executed. There is no state being persisted with each execution.
Thanks for sticking with me on this. 😃
No worries!
1. Will skipping the database upload affect results or build checks in any way? I can't tell from the docs if this just affects the availability of a build artifact or if there's a functional impact.
There is no functional impact. Database upload only happens to help you with some extra analysis later.
2. Yes. I've attached an anonymized version. [codeql-scan-workflow.zip](https://github.com/github/codeql-action/files/11094210/codeql-scan-workflow.zip)
Thanks. I see it.
I have to admit that I'm stumped about this. I am not sure why writing to a zip file may occasionally hang. There are some things I can think of:
However, I think your best bet for now is to avoid uploading the database.
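A minimal sketch of what disabling the database upload might look like in the workflow. The step name and the `@v2` tag are assumptions (the logs above show action version 2.2.9); `upload-database` is the input named earlier in this thread.

```yaml
- name: Perform CodeQL Analysis        # hypothetical step name
  uses: github/codeql-action/analyze@v2
  with:
    # Skip bundling (zipping) and uploading the CodeQL database.
    upload-database: false
```

As noted above, this has no functional impact on the analysis results; it only skips the database bundle and upload.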
I'll try disabling the database upload and see how it goes. There are no large binaries involved. I'm not sure what you meant by soft/hard links in this context. Are we talking filesystem links? If so, none of that is applicable to this project.
I have CodeQL analysis enabled on a project that performs scans of C code using the cpp configuration. Normally, the scans work just fine. However, we had a scheduled scan fail today due to a timeout while preparing the zip file of the analysis results. The scan was performed against a commit that was successfully scanned previously, so it shouldn't be caused by anything specific to changes in the source being scanned.
A log file of the scan is attached.
codeql-cpp-analysis-timeout.zip
Unfortunately, there isn't much to go on since debug logs aren't enabled. See line 1656 for the line that hangs before it hits the 1 hour timeout configured for the job. The previously successful scan of this commit completed in just under 10 minutes.
One thing of note is that this job executes within the context of a custom docker image configured on the workflow. However, I suspect this is fairly common when working with C code due to the platform dependencies when building and executing it.