PolyGlot is a coverage-guided language fuzzer. It allows you to generate semantically correct test cases easily.
Features:
The original version that uses bison can be found here.
Install dependencies:
apt install -y cmake ninja-build build-essential
pip install conan
Clone the repo and build aflpp:
git clone https://github.com/OMH4ck/PolyGlot && cd PolyGlot && git submodule update --init && cd AFLplusplus && make -j
Build PolyGlot:
Put the following code in your grammar.g4:
options {
contextSuperClass = PolyGlotRuleContext;
}
@parser::header {
#include "polyglot_rule_context.h"
}
If you use separate lexer and parser files, you should include `polyglot_rule_context.h` in both files.
Build with cmake:
If you have a single g4 file, set `-DGRAMMAR_FILE=path/to/grammar.g4`. For example:
cmake -DCMAKE_BUILD_TYPE=Release -Bbuild -G Ninja -DBUILD_TESTING=OFF -DGRAMMAR_FILE=path/to/grammar.g4
If you have separate grammar files for the parser and the lexer, specify `PARSER_FILE` and `LEXER_FILE` instead. For example:
cmake -DCMAKE_BUILD_TYPE=Release -Bbuild -G Ninja -DBUILD_TESTING=OFF -DPARSER_FILE=path/to/parser.g4 -DLEXER_FILE=path/to/lexer.g4
If you have additional helper cpp files for antlr, put them all in one directory and specify `-DGRAMMAR_HELPER_DIR=/path/to/dir`. PolyGlot will use the `cpp`/`cc` files when it compiles a parser for the language.
Then build with Ninja:
ninja -C build
Mutation corpus: PolyGlot uses a corpus of seeds as the source of mutation. The corpus should cover every rule in the grammar. You can run `build/corpus_evaluate --corpus_dir your_corpus` to see which rules your corpus does not cover. The larger the corpus, the better.
If you want to filter out unparsable inputs, run `build/corpus_evaluate --corpus_dir your_corpus --output_dir sanitized_output`, which will copy all the parsable test cases to `sanitized_output`.
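After sanitizing, it can help to know how many seeds were rejected. The sketch below is a hypothetical helper, not part of PolyGlot: `count_files` and `report_dropped` are names made up for illustration, and they only compare file counts between the two directories, assuming every seed is a regular file.

```shell
# Hypothetical helpers (not shipped with PolyGlot).

# Count regular files under a directory, recursively.
count_files() {
    find "$1" -type f | wc -l | tr -d ' '
}

# Compare the original corpus with the sanitized output directory.
report_dropped() {
    before=$(count_files "$1")
    after=$(count_files "$2")
    echo "kept $after of $before seeds; dropped $((before - after))"
}
```

A possible use is to run `report_dropped your_corpus sanitized_output` right after the `corpus_evaluate` command above. A large drop suggests the grammar and the corpus disagree about the language.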
Semantic configuration: a YAML file that specifies the semantics of the language. This part is still being migrated. For now, you can copy the following content into a file named `semantic.yml`:
---
InitFileDir: abs_path/to/mutation_corpus
IsWeakType: true
BasicTypes:
- X
You need to set `InitFileDir` to the absolute path of your mutation corpus.
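Because `InitFileDir` has to be an absolute path, one option is to generate the file instead of editing it by hand. The following is a minimal sketch: `write_semantic_config` is a hypothetical helper name, and the remaining keys are copied unchanged from the sample configuration above.

```shell
# Hypothetical helper (not shipped with PolyGlot): write a semantic
# configuration whose InitFileDir is the absolute path of the corpus.
write_semantic_config() {
    corpus_dir=$1
    out_file=$2

    # Resolve the corpus directory to an absolute path in a subshell,
    # so the caller's working directory is left untouched.
    abs_corpus=$(cd -- "$corpus_dir" 2>/dev/null && pwd) || {
        echo "corpus directory not found: $corpus_dir" >&2
        return 1
    }

    # Same content as the sample configuration, with the path filled in.
    cat > "$out_file" <<EOF
---
InitFileDir: $abs_corpus
IsWeakType: true
BasicTypes:
- X
EOF
}
```

For example, `write_semantic_config your_corpus semantic.yml` would write `semantic.yml` in the current directory.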
Set the environment variables:
export POLYGLOT_CONFIG=abs_path/to/semantic.yml
export AFL_CUSTOM_MUTATOR_LIBRARY=abs_path/to/build/libpolyglot_mutator.so
export AFL_CUSTOM_MUTATOR_ONLY=1
export AFL_DISABLE_TRIM=1 # We haven't implemented trimming yet.
You can now run it with aflpp: `afl-fuzz -i seed_corpus -o out -- ./your_target @@`. The `seed_corpus` should be fully parsable by your grammar.
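A relative or mistyped path in the environment variables is an easy mistake to make, so a pre-flight check before launching the fuzzer may save some time. This is a sketch only: `check_polyglot_env` is a hypothetical function, and it checks nothing beyond the two path variables being absolute and pointing to existing files.

```shell
# Hypothetical pre-flight check (not shipped with PolyGlot or AFL++).
# Returns 0 only if both variables hold absolute paths to existing files.
check_polyglot_env() {
    status=0
    for name in POLYGLOT_CONFIG AFL_CUSTOM_MUTATOR_LIBRARY; do
        # Read the variable whose name is stored in $name.
        eval "value=\${$name:-}"
        case $value in
            /*) ;;
            *)
                echo "$name must be an absolute path (got '$value')" >&2
                status=1
                continue
                ;;
        esac
        if [ ! -f "$value" ]; then
            echo "$name points to a missing file: $value" >&2
            status=1
        fi
    done
    return $status
}
```

It could be chained with the command above, for example `check_polyglot_env && afl-fuzz -i seed_corpus -o out -- ./your_target @@`.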
Language | Syntax Supported | Semantic Supported | Grammar Provided | Corpus Provided |
---|---|---|---|---|
lua | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: | |
php | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: | |