umnsec / mlta

TypeDive: Multi-Layer Type Analysis (MLTA) for Refining Indirect-Call Targets
MIT License

This project includes a prototype implementation (TypeDive) of MLTA. MLTA relies on the observation that function pointers are commonly stored into objects whose types form a multi-layer type hierarchy; before an indirect call, the function pointer is loaded from objects with the same type hierarchy, layer by layer. By matching the multi-layer types of function pointers and functions, MLTA can dramatically refine indirect-call targets. The approach is highly scalable (e.g., it finishes analyzing the Linux kernel within minutes) and, in principle, does not have false negatives.
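
To make the idea concrete, here is a toy model in Python. It is only a sketch, not TypeDive's implementation (which works on LLVM bitcode), and it models a single struct layer on top of the function signature. The program facts and all names in it (`FUNC_SIGS`, `STORES`, `first_layer_targets`, `multi_layer_targets`, `struct.file_ops`, `ext4_read`, and so on) are hypothetical and chosen only for illustration.

```python
from collections import defaultdict

# Hypothetical program facts: each function and its signature
# (the function type is the first layer).
FUNC_SIGS = {
    "ext4_read": "long (char *, int)",
    "usb_read": "long (char *, int)",
    "log_write": "void (char *)",
}

# Hypothetical stores: (function, (struct type, field index)) records where
# the address of each function is written.
STORES = [
    ("ext4_read", ("struct.file_ops", 0)),
    ("usb_read", ("struct.usb_ops", 0)),
    ("log_write", ("struct.file_ops", 1)),
]


def first_layer_targets(call_sig):
    """Baseline: every function whose signature matches the call site."""
    return {f for f, sig in FUNC_SIGS.items() if sig == call_sig}


def multi_layer_targets(call_sig, loaded_from):
    """Refinement: keep only the signature matches that were stored into the
    same (struct type, field) the function pointer is loaded from."""
    stored = defaultdict(set)
    for func, layer in STORES:
        stored[layer].add(func)
    return first_layer_targets(call_sig) & stored[loaded_from]


# Indirect call `fops->read(buf, n)`, where `fops` points to a struct.file_ops
# and `read` is field 0.
baseline = first_layer_targets("long (char *, int)")
refined = multi_layer_targets("long (char *, int)", ("struct.file_ops", 0))
print(sorted(baseline))  # ['ext4_read', 'usb_read']
print(sorted(refined))   # ['ext4_read']
```

Signature matching alone reports both `ext4_read` and `usb_read`; adding the struct layer drops `usb_read`, because it was never stored into `struct.file_ops`. A real analysis also has to handle deeper nesting, casts, and other cases that this sketch ignores.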

TypeDive has been tested with LLVM 15.0, at the O0 and O2 optimization levels, on the Linux kernel. The final results of TypeDive may have a few false negatives. Observed causes include hacky code in Linux (mainly out-of-bounds accesses from container_of), compiler bugs, and false negatives from the baseline (function-type matching).

How to use TypeDive

Build LLVM

    $ ./build-llvm.sh 
    # The tested LLVM commit is e758b77161a7

Build TypeDive

    # Build the analysis pass 
    # First update Makefile to make sure the path to the built LLVM is correct
    $ make 
    # Now, you can find the executable, `kanalyzer`, in `build/lib/`

Prepare LLVM bitcode files of OS kernels

Run TypeDive

    # To analyze a list of bitcode files, put the absolute paths of the bitcode files in a file, say "bc.list", then run:
    $ ./build/lib/kanalyzer @bc.list
    # Results will be printed out, or you can get the results from the map `Ctx->Callees`.
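
The list file can be written by hand, but for a large kernel build it may be easier to generate it. The following is a minimal sketch. The helper name `write_bc_list` is hypothetical, and the code assumes that the bitcode files end in `.bc`, that one path per line is an acceptable layout for the list, and that no path contains whitespace.

```python
from pathlib import Path


def write_bc_list(root, out_path="bc.list"):
    """Write the absolute path of every *.bc file under `root` to `out_path`,
    one per line, and return the paths in sorted order."""
    paths = sorted(
        str(p.resolve()) for p in Path(root).rglob("*.bc") if p.is_file()
    )
    # One absolute path per line (assumed layout, see the note above).
    Path(out_path).write_text("".join(f"{p}\n" for p in paths))
    return paths
```

For example, calling `write_bc_list("/path/to/kernel/build")` (a placeholder directory) writes `bc.list` in the current directory, which can then be passed to the analyzer with the `@bc.list` argument shown above.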

Configurations

More details