mandiant / capa

The FLARE team's open-source tool to identify capabilities in executable files.
https://mandiant.github.io/capa/
Apache License 2.0

ghidra: Integrate the FLIRT matching engine into the Ghidra Feature Extractor #1981

Open colton-gabertan opened 8 months ago

colton-gabertan commented 8 months ago

Summary

The Ghidra Feature Extractor uses Ghidra's FunctionID Analyzer to identify library functions. capa as a standalone tool defaults to the Vivisect backend, which uses the FLARE team's custom FLIRT matching engine alongside an open-source set of FLIRT signatures that cover many functions present in binaries that are compiled with Visual Studio. The FLIRT matching engine is implemented in Rust with Python bindings.

Motivation

Integrating the FLIRT matching engine into the Ghidra Feature Extractor would complement FunctionID and serve users who wish to use their own set of FLIRT signatures. In turn, capa would be able to identify more library functions to skip during analysis.

Additional context

FunctionID already allows Ghidra users to develop and use their own sets of signatures; however, new users will only have access to the default ones. Our FLIRT signature set would automatically supplement those defaults and enhance capa's analysis.

williballenthin commented 8 months ago

In order to use the python-flirt library, we'll want to reimplement this function from viv-utils to work on Ghidra: https://github.com/williballenthin/viv-utils/blob/35b7f7403b0befcb11bf2f66fc4ff28d6f87aada/viv_utils/flirt.py#L102

Basically, use FLIRT to scan each function prologue, and then if there are any recursive references, ensure those match, too. It's a little annoying but not impossible.
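
The procedure described above could be sketched roughly as follows. This is a backend-agnostic illustration, not the viv-utils implementation and not the python-flirt API: `Candidate`, `identify_function`, and the `match`, `read`, and `resolve` callbacks are hypothetical names standing in for whatever the FLIRT matcher and a Ghidra-based backend would actually provide. It also assumes a reference can be modeled as an offset plus the name the referenced function must have.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional, Set, Tuple


@dataclass(frozen=True)
class Candidate:
    """Hypothetical stand-in for one FLIRT signature hit."""

    name: str
    # Each entry: (offset from function start, name the referenced function must have).
    references: Tuple[Tuple[int, str], ...] = ()


# Hypothetical callbacks that a backend would supply.
MatchFn = Callable[[bytes], List[Candidate]]  # function bytes -> candidate hits
ReadFn = Callable[[int, int], bytes]          # (address, size) -> bytes
ResolveFn = Callable[[int], Optional[int]]    # reference address -> target function address


def identify_function(
    va: int,
    match: MatchFn,
    read: ReadFn,
    resolve: ResolveFn,
    known: Dict[int, str],
    size: int = 0x100,
    _active: Optional[Set[int]] = None,
) -> Optional[str]:
    """Return a name for the function at `va`, or None if no candidate is confirmed."""
    if va in known:
        return known[va]

    active = set() if _active is None else _active
    if va in active:
        # Already being resolved further up the call chain: give up instead of looping.
        return None
    active.add(va)

    try:
        for cand in match(read(va, size)):
            confirmed = True
            for offset, expected in cand.references:
                target = resolve(va + offset)
                if target is None:
                    confirmed = False
                    break
                # Recursive step: the referenced function must itself match by name.
                actual = identify_function(target, match, read, resolve, known, size, active)
                if actual != expected:
                    confirmed = False
                    break
            if confirmed:
                known[va] = cand.name
                return cand.name
        return None
    finally:
        active.discard(va)
```

The `known` dictionary doubles as a cache, so a helper that many functions reference is only matched once. A real integration would also need to decide what to do when several candidates survive the reference check; this sketch simply takes the first.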

williballenthin commented 8 months ago

I wonder if this is better done outside of capa, as a standalone analysis enhancement provided by another Ghidra plugin that capa then benefits from, rather than doing the FLIRT matching within capa.