github / codeql

CodeQL: the libraries and queries that power security researchers around the world, as well as code scanning in GitHub Advanced Security
https://codeql.github.com
MIT License

Build Error with C# Unity3D Project #5753

Closed ekolve closed 3 years ago

ekolve commented 3 years ago

I get the following error when lgtm.com attempts to build the C# code in this project: https://github.com/allenai/ai2thor

The C# code resides under the path unity/ and can only be built with the Unity Editor, so I don't believe it's an option to get the build to succeed. Is it possible to still receive analysis of a C# project if it can't be built?

build.log

hmakholm commented 3 years ago

(Note for future readers: the C# autobuilder complains that it cannot auto-detect a suitable build method.)

hmakholm commented 3 years ago

Unfortunately it looks like you're out of luck, unless you can find a way to compile your .cs files with a command-line invocation that CodeQL can snoop on.

I'm assuming that Unity provides some class libraries that your source compiles against, so simply writing (or generating) a script that invokes csc on the input files won't work.

The underlying problem is that CodeQL needs to know at least the signatures of any libraries you use, so that it can typecheck your C# source; otherwise analysis will be impossible. So if those libraries will only be provided by a tool that cannot run unattended as part of a build process, there's not really a way forward.
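
For readers who want to see what such a script would have to supply, here is a minimal Python sketch that assembles a single csc command line from the project's .cs files plus one -reference: flag per library assembly. The helper name build_csc_command, the default output name, and the idea that the needed .dll files sit together in one directory are all assumptions made for illustration; where Unity's assemblies actually live depends on the installation, and nothing here is confirmed to work for a real Unity project.

```python
from pathlib import Path


def build_csc_command(source_root, library_dir, output="Project.dll"):
    """Assemble one csc invocation (hypothetical helper, for illustration).

    source_root: directory searched recursively for .cs files.
    library_dir: directory holding the .dll files the sources compile
                 against (assumed location; adjust for a real setup).
    """
    sources = sorted(str(p) for p in Path(source_root).rglob("*.cs"))
    references = sorted(str(p) for p in Path(library_dir).glob("*.dll"))

    command = ["csc", "-target:library", f"-out:{output}"]
    # The compiler needs the signatures in these assemblies to typecheck
    # the sources; without them the invocation cannot succeed.
    command += [f"-reference:{dll}" for dll in references]
    command += sources
    return command
```

The function only returns the argument list; actually running it (for example with subprocess.run) would still require the referenced assemblies to be available on the build machine, which is exactly the obstacle described above.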

ekolve commented 3 years ago

I am able to get a SemanticModel of the project using Roslyn with code that resembles the following:

https://github.com/neuecc/MessagePack-CSharp/blob/b652df89bae5e707b7a5184e5464cd2fb2a2d06c/src/MessagePack.Generator/MessagepackCompiler.cs#L86

This has a few dependencies from NuGet, but it should be possible to run on Linux. It would require writing C# code that traverses the semantic model to extract method calls.

hvitved commented 3 years ago

We do, in fact, use Roslyn for our extractor as well. The issue is that in order to get an accurate CodeQL database, one needs to perform a build of the codebase. During the build, we will detect all calls to the C# compiler (csc.exe/dll), and populate the database with only those files/assemblies (including e.g. generated files) passed to compiler calls.

However, from what I can tell, it sounds like building Unity projects requires proprietary tooling that we do not have access to on the LGTM workers. Therefore, you could try out our buildless extraction mode instead, which simply creates a database by simulating one giant call to the C# compiler with all .cs files in the repo. But do note that buildless mode can be much less accurate.
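
To illustrate the difference between the two modes described above, here is a small Python sketch. It is a conceptual model only, not how the CodeQL extractor is implemented: the function names are hypothetical, and the "observed commands" stand in for whatever compiler calls are detected during a real build.

```python
from pathlib import Path

# Executable names treated as the C# compiler (csc.exe/dll, as above).
COMPILER_NAMES = {"csc", "csc.exe", "csc.dll"}


def is_compiler_call(argv):
    """True if a command line invokes the C# compiler.

    Covers both `csc ...` and `dotnet path/to/csc.dll ...`.
    """
    return any(Path(arg).name.lower() in COMPILER_NAMES for arg in argv[:2])


def traced_sources(observed_commands):
    """Build-based view: only .cs files passed to a compiler call."""
    files = set()
    for argv in observed_commands:
        if is_compiler_call(argv):
            files.update(a for a in argv if a.lower().endswith(".cs"))
    return files


def buildless_sources(repo_root):
    """Buildless view: every .cs file found in the repository."""
    return {str(p) for p in Path(repo_root).rglob("*.cs")}
```

In this model, traced_sources picks up generated files that a build hands to the compiler and ignores sources that are never compiled, while buildless_sources takes whatever .cs files happen to be on disk. That mismatch is one plausible reason the buildless result can be less accurate.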

ekolve commented 3 years ago

Thank you! Buildless extraction is working for me.