ZSShen / MeltingPot

A tool to cluster similar executables (PEs, DEXs, and etc), extract common signature, and generate Yara patterns for malware detection.
MIT License
24 stars 5 forks source link
c malware-detection malware-research malware-samples ssdeep yara

Build Status

MeltingPot

MeltingPot is an automated common binary signature extractor and pattern generator. For the given sample set with the same file format, it slices each file into small pieces of binary sequences and correlates the files sharing the similar sequences. To show the result, MeltingPot generates a set of YARA formatted patterns each of which represents the common signature of a certain file cluster. Such patterns can be directly applied by YARA scan engine.

Introduction

MeltingPot is composed of the main engine and the supporting plugins. The relation is briefly illustrated here:

The example pattern for the Windows Notepad and its packed version using several kinds of software protectors.

As mentioned above, we have three kinds of plugins:

If analysts intend for custom research purpose, they can craft the custom plugins in the plugin source directory and use the MeltingPot build script to create the libraries.

Installation

Basic

First of all, we need to prepare the following utilities:

For Ubuntu 12.04 and above, it should be easy:

$ sudo apt-get install -qq cmake
$ sudo apt-get install -qq valgrind
$ sudo apt-get install -qq libfuzzy-dev
$ sudo apt-get install -qq libglib2.0-dev
$ sudo apt-get install -qq libconfig-dev

Now we can build the entire source tree under the project root folder:

$ ./clean.py --rebuild
$ cd build
$ cmake ..
$ make

Then the engine should be under:

And the relevant plugins should be under:

Advanced

If we modify the main engine or the plugins, we can move to the corresponding subdirectory to rebuild the binary.
To build the engine independently:

$ cd engine
$ ./clean.py --rebuild
$ cd build
$ cmake .. -DCMAKE_BUILD_TYPE=Debug|Release
$ make

Note that we have two build types.
For debug build, the compiler debug flags are turned on, and the binary locates at ./engine/bin/debug/cluster.
For optimized build, the binary locates at ./engine/bin/release/cluster.

To build the plugin independently (using File Slicing plugin as example):

$ cd plugin/slice
$ ./clean.py --rebuild
$ cd build
$ cmake .. --DCMAKE_BUILD_TYPE=Debug|Release
$ make

Again, we must specify the build type for compiliation.
For debug build, the binary locates at ./plugin/slice/debug/libslc_*.so.
For optimized build, the binary locates at ./plugin/slice/release/libslc_*.so.
For the other two kinds of plugins, the build rule is the same.

Usage

To run the engine, we should specify some relevant configurations. The example is shown in ./engine/cluster.conf.
We discuss these parameters below:

Parameter Description
SIZE_SLICE The size of the sliced file binary
SIZE_HEX_BLOCK The length of the signature extracted from a slice cluster
COUNT_HEX_BLOCK The number of to be extracted signatures from a cluster
THRESHOLD_SIMILARITY The threshold to group similar slices
RATIO_NOISE The ratio of dummy bytes (00 or ff) in a signature
RATIO_WILDCARD The ratio of wildcard characters in a signature
TRUNCATE_GROUP_SIZE_LESS_THAN The threshold to ignore trivial clusters
FLAG_COMMENT The knob for pattern comments
PATH_ROOT_INPUT The pathname of input sample set
PATH_ROOT_OUTPUT The pathname of output pattern folder
PATH_PLUGIN_SLICE The pathname of the file slicing plugin
PATH_PLUGIN_SIMILARITY The pathname of similarity comparison plugin
PATH_PLUGIN_FORMAT The pathname of pattern formation plugin

In addition, we have the following advanced parameters:

Parameter Description
COUNT_THREAD The number of running threads
IO_BANDWIDTH The maximum number of files a thread can simultaneously open

With the configuration file prepared, we can launch the MeltPot engine:

Contact

Please contact me via the mail andy.zsshen@gmail.com.