github / codeql

CodeQL: the libraries and queries that power security researchers around the world, as well as code scanning in GitHub Advanced Security
https://codeql.github.com
MIT License

CodeQL emits error when binlog does not contain compilation #17981

Open DmitriyShepelev opened 2 days ago

DmitriyShepelev commented 2 days ago

Description of the issue

An error is emitted when creating a database with the -Obinlog option using a binlog file that does not contain any compilation.

Repro steps:

  1. Download and extract codeql-binlog-min-repro.zip (contains two files: global.json and NoTargetsProject.csproj) to some directory.
  2. In said directory, run msbuild.exe /t:restore;build /bl.
  3. Run codeql.exe database create <database-directory> --language=csharp --build-mode=none -Obinlog=msbuild.binlog
  4. Observe the following error (note that I've generalized some personal file paths):
    Initializing database at C:\Users\DSHEPE~1\AppData\Local\Temp\CodeQL_CB_DB2.
    Running build command: []
    Running command in D:\repos\test: [<path-to-codeql>\codeql\csharp\tools\autobuild.cmd]
    [2024-11-13 12:00:24] [build-stdout] CodeQL C# autobuilder
    [2024-11-13 12:00:24] [build-stdout] Working directory: D:\repos\test
    [2024-11-13 12:00:25] [ERROR] Spawned process exited abnormally (code 2; tried to run: [<path-to-codeql>codeql\tools\win64\runner.exe, cmd.exe, /C, type, NUL, &&, <path-to-codeql>\codeql\csharp\tools\autobuild.cmd])
    A fatal error occurred: Exit status 2 from command: [<path-to-codeql>\codeql\tools\win64\runner.exe, cmd.exe, /C, type, NUL, &&, <path-to-codeql>\codeql\csharp\tools\autobuild.cmd]

Can this scenario be treated as a no-op, with just a message emitted that no compilations were found in the binlog?

cc: @tamasvajk

tamasvajk commented 1 day ago

The other modes (traced, build-mode: none) of the extraction also fail if no source code was seen during the extraction, so I'm somewhat hesitant to do this. What is the reason that you'd like this to succeed and to get an empty database?

DmitriyShepelev commented 1 day ago

I'm using the -Obinlog option to automate CodeQL database creation for arbitrary csproj files, the builds of which aren't guaranteed to invoke Csc.exe. Otherwise I'd have to add logic to load and parse the binlog file myself to verify that Csc.exe was in fact invoked and, if it was, have codeql load and parse the binlog file a second time. A much better approach would be to have codeql just create an empty database and (maybe) emit an appropriate message.

tamasvajk commented 18 hours ago

@DmitriyShepelev I checked what happens if we treat empty binlog extraction as a success in the extractor: https://github.com/github/codeql/pull/17992. In this case, the extractor returns a success code, but the parent process, the CodeQL CLI, does some extra validation on the produced DB and finds that the DB is empty, which is not considered a success overall. So from your point of view not much changes: the exit code is now 32 instead of the previous 2.

Could you tell me more about your use case? I have the feeling that you need to handle error cases anyway.
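For anyone scripting around this, here is a minimal sketch of a wrapper that tells the two exit codes mentioned above apart. It assumes `codeql` is on the PATH and reuses the command line from the repro steps. The function names and the labels returned by `classify_exit_code` are hypothetical, not part of the CodeQL CLI.

```python
import subprocess
from typing import List

# Exit codes as reported in this thread; everything else is treated as a failure.
EXIT_OK = 0
EXIT_GENERIC_FAILURE = 2   # seen in the original report
EXIT_EMPTY_DATABASE = 32   # seen once the extractor treats an empty binlog as success


def build_create_command(db_dir: str, binlog: str, codeql: str = "codeql") -> List[str]:
    """Build the same `codeql database create` invocation used in the repro steps."""
    return [
        codeql, "database", "create", db_dir,
        "--language=csharp",
        "--build-mode=none",
        f"-Obinlog={binlog}",
    ]


def classify_exit_code(code: int) -> str:
    """Map an exit code to a hypothetical label for the calling script."""
    if code == EXIT_OK:
        return "created"
    if code == EXIT_EMPTY_DATABASE:
        return "empty"  # no code was found; the caller may choose to skip this project
    return "failed"


def create_database(db_dir: str, binlog: str) -> str:
    """Run the CLI and report the outcome instead of raising on a non-zero exit."""
    result = subprocess.run(build_create_command(db_dir, binlog))
    return classify_exit_code(result.returncode)
```

A caller could then skip projects for which `create_database` returns `"empty"` and stop only on `"failed"`.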

DmitriyShepelev commented 11 hours ago

> @DmitriyShepelev I checked what happens if we treat empty binlog extraction as a success in the extractor: #17992. In this case, the extractor returns a success code, but the parent process, the CodeQL CLI, does some extra validation on the produced DB and finds that the DB is empty, which is not considered a success overall. So from your point of view not much changes: the exit code is now 32 instead of the previous 2.
>
> Could you tell me more about your use case? I have the feeling that you need to handle error cases anyway.

My use case involves running a separate CodeQL database creation process per csproj, in a way that is oblivious to whether Csc is invoked, and then, at the very end, running one more CodeQL database creation process that imports the resulting databases' TRAP and source files to create a final, complete database. Separately, https://github.com/github/codeql/pull/17955#event-15260048974 would simplify this design.
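The per-csproj build step of such a pipeline might look roughly like the sketch below. It reuses the msbuild targets from the repro steps and the `/bl:<file>` form of the binary logger switch so that each project gets its own binlog. The helper names, the `out_dir` layout, and the one-binlog-per-project-name scheme are assumptions made for illustration.

```python
from pathlib import Path
from typing import List


def binlog_path_for(csproj: Path, out_dir: Path) -> Path:
    """Derive a binlog file name from the project name (hypothetical naming scheme).

    Note: two projects with the same file name in different directories would
    collide under this scheme; a real pipeline would need unique names.
    """
    return out_dir / (csproj.stem + ".binlog")


def msbuild_command(csproj: Path, out_dir: Path, msbuild: str = "msbuild.exe") -> List[str]:
    """Build one project with restore + build and write a dedicated binlog."""
    return [
        msbuild,
        str(csproj),
        "/t:restore;build",
        f"/bl:{binlog_path_for(csproj, out_dir)}",
    ]


def plan_builds(root: Path, out_dir: Path) -> List[List[str]]:
    """Return one msbuild command per .csproj found under root, in sorted order."""
    return [msbuild_command(p, out_dir) for p in sorted(Path(root).rglob("*.csproj"))]
```

Each command in the plan can be handed to `subprocess.run`, and the resulting binlog can then be passed to `codeql database create` with `-Obinlog=`.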

tamasvajk commented 11 hours ago

I think https://github.com/github/codeql/pull/17992 would help you in combination with https://github.com/github/codeql/pull/17955. The latter allows you to pass all the binlog files into a single codeql database create, while the former would help in not failing immediately when the extractor is processing an empty binlog. So, you could create a binlog for each csproj, then pass all binlog files to a single codeql database create, and if there's at least one binlog that's not empty, then the DB creation would succeed. What do you think? Would this work for you?
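As a rough sketch of that combined workflow: collect the per-project binlogs and hand them all to one `codeql database create`. The thread does not say how multiple binlogs are passed on the command line, so repeating `-Obinlog=` once per file is an assumption here; check pull request 17955 for the actual syntax. The function name is hypothetical as well.

```python
from typing import List, Sequence


def single_create_command(db_dir: str, binlogs: Sequence[str], codeql: str = "codeql") -> List[str]:
    """Build one `codeql database create` call covering every binlog.

    ASSUMPTION: multiple binlogs are passed by repeating -Obinlog=<file>;
    this syntax is not confirmed anywhere in the thread.
    """
    if not binlogs:
        raise ValueError("at least one binlog is required")
    cmd = [
        codeql, "database", "create", db_dir,
        "--language=csharp",
        "--build-mode=none",
    ]
    cmd += [f"-Obinlog={b}" for b in binlogs]
    return cmd
```

With this shape, a binlog without compilations no longer needs special handling by the caller: according to the comment above, the run is expected to succeed as long as at least one of the binlogs is not empty.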

DmitriyShepelev commented 11 hours ago

> I think #17992 would help you in combination with #17955. The latter allows you to pass all the binlog files into a single codeql database create, while the former would help in not failing immediately when the extractor is processing an empty binlog. So, you could create a binlog for each csproj, then pass all binlog files to a single codeql database create, and if there's at least one binlog that's not empty, then the DB creation would succeed. What do you think? Would this work for you?

Yep, should work. Thank you very much for your help - I really appreciate it. :)