github / codeql-action

Actions for running CodeQL analysis
MIT License

CodeQL PowerShell Support #2366

Open StartAutomating opened 3 months ago

StartAutomating commented 3 months ago

CodeQL does not currently support PowerShell. Given that PowerShell is quite a potent language that has been used to great effect by red team and blue team alike, this lack of functionality hurts both CodeQL and PowerShell.

I am deeply familiar with the PowerShell AST and would likely be able to make CodeQL PowerShell language support work, if the team can help provide the right guidance on integration.

Please provide more information about how one can write new CodeQL bindings, so that I might turn this issue into a more useful pull request.

dilanbhalla commented 3 months ago

Awesome to hear this! We have actually developed an open source PowerShell extractor, which converts PowerShell source code into a CodeQL database. The extractor can be found here: https://github.com/microsoft/codeql/releases. It doesn't matter if the release version is a bit behind; you can just grab the latest powershell.zip and unzip that folder next to your codeql executable. Once you do this, you should be able to run any commands related to extraction/db creation with "powershell" as the language.
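A minimal sketch of that last step, for illustration only. It assumes the unzipped folder is named `powershell` and that `codeql` is on your PATH. The helper names, the database directory `ps-db`, and the source directory `./scripts` are hypothetical placeholders. `codeql database create` with `--language` and `--source-root` is the standard CodeQL CLI invocation.

```python
import shutil
import subprocess
from pathlib import Path
from typing import List, Optional


def build_create_command(db_dir: str, source_root: str, codeql: str = "codeql") -> List[str]:
    """Build the argument list for `codeql database create` with powershell as the language."""
    return [
        codeql,
        "database",
        "create",
        db_dir,
        "--language=powershell",
        f"--source-root={source_root}",
    ]


def extractor_installed(codeql_path: Optional[str]) -> bool:
    """Return True if a `powershell` folder sits next to the codeql executable.

    The folder name is an assumption based on unzipping powershell.zip in place.
    """
    if codeql_path is None:
        return False
    return (Path(codeql_path).resolve().parent / "powershell").is_dir()


def create_powershell_database(db_dir: str, source_root: str) -> None:
    """Locate codeql, check for the extractor folder, then create the database."""
    codeql_path = shutil.which("codeql")
    if codeql_path is None or not extractor_installed(codeql_path):
        raise RuntimeError("codeql or the powershell extractor folder was not found")
    # check=True raises CalledProcessError if the CLI exits with a non-zero status.
    subprocess.run(build_create_command(db_dir, source_root, codeql_path), check=True)
```

Calling `create_powershell_database("ps-db", "./scripts")` would then run the equivalent of `codeql database create ps-db --language=powershell --source-root=./scripts`.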

What we really need now to enable analysis is the core CodeQL libraries for PowerShell. We have built out a bunch of these already (which we can open source as well), but core libraries related to the AST such as AST.qll, Cfg.qll, and Dataflow.qll need to be populated so that we can start building out qlls on top of them. Here is a repository that was set up recently and shows how this is done for a simple demo language, kaleidoscope: https://github.com/aibaars/codeql-kaleidoscope/tree/main (go to ql/lib/codeql/kaleidoscope). Some of these core libraries can probably be shared libraries now (and we can update the kaleidoscope repo to reflect that); @aibaars can explain further there.

StartAutomating commented 3 months ago

@dilanbhalla @aibaars Thanks for providing some context. Please provide a bit more :-)

Additionally, if I'm reading the kaleidoscope example correctly, what you're doing is closely aligned with a metaprogramming language I built, Pipescript. A major component of that language is AST manipulation. Another major component is an open-ended definition of languages. Here are a couple of items I believe we should also look at:

Please let me know what you think of these scenarios, and if you'd expect either of them to be "natively" handled by CodeQL in the near/mid future.

Forgive me if these are foolish questions; I'm a PowerShell expert, not a CodeQL expert.