Neural-Fuzzer is an experimental fuzzer designed to use state-of-the-art Machine Learning to learn from a set of initial files. It works in two phases: training and generation.
In training mode, it uses a long short-term memory (LSTM) network to learn how sequences of bytes are structured.
In generation mode, it automatically generates corrupted or unexpected files and uses them to try to crash a given program.
Neural-Fuzzer is open source (GPLv3) and powered by Keras. It is similar to rnn-char and other sequence-prediction techniques.
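To make the training idea more concrete, here is a minimal sketch of how a byte sequence could be turned into (context, next byte) training pairs for a sequence model. This is not neural-fuzzer's actual code: the function names, the window size and the step are assumptions chosen for illustration only.

```python
def build_vocab(data):
    """Map each distinct byte value to a small integer index."""
    return {b: i for i, b in enumerate(sorted(set(data)))}


def make_training_pairs(data, window=4, step=1):
    """Slide a window over the bytes; the byte right after each window is the target.

    `window` and `step` are hypothetical parameters, not neural-fuzzer options.
    """
    vocab = build_vocab(data)
    inputs, targets = [], []
    for start in range(0, len(data) - window, step):
        chunk = data[start:start + window]
        inputs.append([vocab[b] for b in chunk])
        targets.append(vocab[data[start + window]])
    return vocab, inputs, targets


# Example: a tiny XML-like seed (iterating over bytes yields integers in Python 3).
vocab, inputs, targets = make_training_pairs(b"<a><b/></a>", window=4)
print(len(vocab), "distinct bytes,", len(inputs), "training pairs")
```

A model trained on such pairs learns to predict the next byte from the preceding ones, which is what makes byte-by-byte generation from a seed possible.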
First, we need to install the required libraries. For instance, in Debian/Ubuntu:
# apt-get install python-numpy libhdf5-dev gdb
After that, we can start installing neural-fuzzer:
$ git clone https://github.com/CIFASIS/neural-fuzzer/
$ cd neural-fuzzer
$ python setup.py install --user
In order to generate XML, we can use one of the pre-trained XML generators:
$ wget "https://github.com/CIFASIS/neural-fuzzer/releases/download/0.0/0-gen-xml.lstm"
$ wget "https://github.com/CIFASIS/neural-fuzzer/releases/download/0.0/0-gen-xml.lstm.map"
(more generators are available here)
Then, we need a seed to start the generation. For instance, to use '>' as the seed:
$ mkdir seeds
$ printf ">" > seeds/input.xml
Finally, we can start producing some random XML files using the generator:
$ ./neural-fuzzer.py --max-gen-size 64 0-gen-xml.lstm seeds/
Using Theano backend.
Using ./gen-449983086021 to store the generated files
Generating a batch of 8 file(s) of size 35 (temp: 0.5 )...................................
The resulting files are stored in a randomly named directory (e.g., gen-449983086021). It is faster to generate files in a batch than one by one (you can experiment with different batch sizes). In this case, one of the files we obtained is this one:
></p>
<p><termdef id='dt-encoding'>
An interesting parameter is the maximum size of the generated file (--max-gen-size in the example above). Another important parameter is the temperature, which takes a value in the range (0, 1] (default: 0.5). As Karpathy explains, the predicted log probabilities are divided by the temperature before the softmax, so a lower temperature makes the model produce more likely, but also more boring and conservative, predictions. Higher temperatures make the model take more chances and increase the diversity of results, at the cost of more mistakes.
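The effect of the temperature can be illustrated with a small, self-contained sketch. It is not taken from neural-fuzzer's source: the function names and the example distribution are hypothetical, and it only shows the general technique of dividing log probabilities by the temperature before applying the softmax.

```python
import math
import random


def apply_temperature(probs, temperature):
    """Rescale a probability distribution with a temperature in (0, 1]."""
    if not 0.0 < temperature <= 1.0:
        raise ValueError("temperature must be in (0, 1]")
    # Divide the log probabilities by the temperature
    # (the small epsilon avoids log(0)).
    scaled = [math.log(p + 1e-12) / temperature for p in probs]
    # Softmax, shifted by the maximum for numerical stability.
    top = max(scaled)
    exps = [math.exp(s - top) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def sample_index(probs, temperature=0.5, rng=random):
    """Draw one index from the temperature-adjusted distribution."""
    adjusted = apply_temperature(probs, temperature)
    return rng.choices(range(len(adjusted)), weights=adjusted, k=1)[0]


# Hypothetical model output over three possible next bytes.
probs = [0.6, 0.3, 0.1]
cold = apply_temperature(probs, 0.2)  # sharper: the top choice dominates
warm = apply_temperature(probs, 1.0)  # unchanged up to rounding
print(cold, warm)
```

With a temperature of 1.0 the distribution is left essentially as predicted, while lower values concentrate the probability mass on the most likely byte.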
Neural-fuzzer can also test programs and triage the resulting crashes using GDB. For instance, to test two XML parsing implementations, libxml2 and expat:
$ ./neural-fuzzer.py --max-gen-size 64 0-gen-xml.lstm seeds/ --cmd "/usr/bin/xmllint @@" "/usr/bin/xmlwf @@"
TODO
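As a rough illustration of what the @@ placeholder in the --cmd arguments stands for, the sketch below substitutes a generated file's path into a command template and runs it. This is an assumption about the general approach, not neural-fuzzer's implementation: the helper names and the timeout are hypothetical, GDB-based triage is not shown, and the crash check relies on the POSIX convention that a negative return code means the process was killed by a signal.

```python
import shlex
import subprocess


def build_argv(cmd_template, path):
    """Split a command template and replace each '@@' token with the file path."""
    return [path if tok == "@@" else tok for tok in shlex.split(cmd_template)]


def run_target(cmd_template, path, timeout=5):
    """Run one target on one file and classify the outcome.

    Returns "crash" if the process was killed by a signal (POSIX only),
    "timeout" if it ran longer than `timeout` seconds, and "ok" otherwise.
    """
    argv = build_argv(cmd_template, path)
    try:
        proc = subprocess.run(
            argv,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            timeout=timeout,
        )
    except subprocess.TimeoutExpired:
        return "timeout"
    return "crash" if proc.returncode < 0 else "ok"


# The path below is only an example of a generated file name.
print(build_argv("/usr/bin/xmllint @@", "gen-449983086021/0.xml"))
```

Running each generated file through every command in turn, and keeping the files classified as "crash", gives a simple starting point for later inspection in a debugger.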