Go-fuzz is a coverage-guided fuzzing solution for testing of Go packages.
Fuzzing is mainly applicable to packages that parse complex inputs (both text
and binary), and is especially useful for hardening of systems that parse inputs
from potentially malicious users (e.g. anything accepted over a network).
Note: go-fuzz has recently added preliminary support for fuzzing Go Modules. See the section below for more details.
If you encounter a problem with modules, please file an issue with details. A workaround might be to disable modules via export GO111MODULE=off.
Usage
First, you need to write a test function of the form:
func Fuzz(data []byte) int
Data is a random input generated by go-fuzz; note that in most cases it is
invalid. The function must return 1 if the fuzzer should increase the priority
of the given input during subsequent fuzzing (for example, the input is
lexically correct and was parsed successfully); -1 if the input must not be
added to the corpus even if it gives new coverage; and 0 otherwise. Other values are
reserved for future use.
The Fuzz function must be in a package that go-fuzz can import. This means
the code you want to test can't be in package main. Fuzzing internal
packages is supported, however.
In its basic form the Fuzz function just parses the input, and
go-fuzz ensures that it does not panic, crash the program, allocate an excessive
amount of memory, or hang. The Fuzz function can also do application-level checks,
which make testing more effective (they discover more bugs). For example, the
Fuzz function can serialize all inputs that were successfully deserialized,
thus ensuring that serialization can handle everything deserialization can
produce. Or the Fuzz function can deserialize-serialize-deserialize-serialize
and check that the results of the first and second serialization are equal. Or it
can feed the input into two different implementations (e.g. a simple one and an
optimized one) and check that the outputs are equal. To report application-level
bugs, the Fuzz function should panic (os.Exit(1) works too, but a panic message
contains more information). Note that the Fuzz function should not write to stdout/stderr:
doing so slows down fuzzing, and nobody will see the output anyway. The exception
is printing information about a bug just before panicking.
Here is an example of a simple Fuzz function for image/png package:
func Fuzz(data []byte) int {
	img, err := png.Decode(bytes.NewReader(data))
	if err != nil {
		if img != nil {
			panic("img != nil on error")
		}
		return 0
	}
	var w bytes.Buffer
	err = png.Encode(&w, img)
	if err != nil {
		panic(err)
	}
	return 1
}
The second step is collecting an initial input corpus. Ideally, files in the
corpus are as small as possible and as diverse as possible. You can use inputs
used by unit tests and/or generate them. For example, for an image decoding
package you can encode several small bitmaps (black, random noise, white with
a few non-white pixels) with different compression levels and use them as the
initial corpus. Go-fuzz will deduplicate and minimize the inputs, so throwing in
a thousand inputs is fine; diversity is more important.
Put the initial corpus into the workdir/corpus directory (in our case
examples/png/corpus). Go-fuzz will add its own inputs to the corpus directory.
Consider committing the generated inputs to your source control system; this
will allow you to restart go-fuzz without losing previous work.
The go-fuzz-corpus repository contains
a number of examples of test functions and initial input corpora for various packages.
The next step is to get go-fuzz:
$ go install github.com/dvyukov/go-fuzz/go-fuzz@latest github.com/dvyukov/go-fuzz/go-fuzz-build@latest
Then, download the corpus and build the test program with necessary instrumentation:
$ git clone https://github.com/dvyukov/go-fuzz-corpus.git
$ cd go-fuzz-corpus
$ cd png
$ go-fuzz-build
This will produce png-fuzz.zip archive.
Now we are ready to go:
$ go-fuzz
Go-fuzz will generate and test various inputs in an infinite loop. The workdir is
used to store persistent data such as the current corpus and crashers; it allows the fuzzer
to continue after a restart. Discovered bad inputs are stored in the workdir/crashers
dir: a file without a suffix contains the binary input, a file with the .quoted suffix
contains the quoted input that can be copied directly into a reproducer program or a
test, and a file with the .output suffix contains the output of the test on this input. Every
few seconds go-fuzz prints logs to stderr of the form:
Here workers is the number of tests running in parallel (set with the -procs
flag). corpus is the current number of interesting inputs the fuzzer has
discovered; the time in brackets says when the last interesting input was
discovered. crashers is the number of discovered bugs (check out the
workdir/crashers dir). restarts is the rate at which the fuzzer restarts
test processes. The rate should be close to 1/10000 (the planned
restart rate); if it is considerably higher than 1/10000, consider fixing the already
discovered bugs that lead to frequent restarts. execs is the total number of
test executions, and the number in brackets is the average speed of test
executions. cover is the number of bits set in a hashed coverage bitmap; if this number
grows, the fuzzer is uncovering new lines of code. The size of the bitmap is 64K; ideally the cover
value should be less than ~5000, otherwise the fuzzer can miss new interesting inputs
due to hash collisions. And finally, uptime is the uptime of the process. The same
information is also served via HTTP (see the -http flag).
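To show how the contents of a .quoted file might be used, here is a rough sketch of a reproducer. The Fuzz function below is a stand-in with a deliberately planted bug, the crasher constant is made up for illustration (in practice you would paste the contents of the .quoted file there), and reproduces is a hypothetical helper name.

```go
package main

import "fmt"

// Fuzz stands in for the real fuzz function. The planted bug (a panic on
// inputs starting with "BUG") exists only to make the sketch self-contained.
func Fuzz(data []byte) int {
	if len(data) >= 3 && string(data[:3]) == "BUG" {
		panic("parser bug")
	}
	return 0
}

// crasher is a made-up input; paste the contents of a .quoted file here.
const crasher = "BUG\x00\xff"

// reproduces runs Fuzz on input and reports whether it panicked, together
// with the panic message.
func reproduces(input string) (panicked bool, msg string) {
	defer func() {
		if r := recover(); r != nil {
			panicked = true
			msg = fmt.Sprint(r)
		}
	}()
	Fuzz([]byte(input))
	return false, ""
}

func main() {
	panicked, msg := reproduces(crasher)
	fmt.Printf("panicked=%v msg=%q\n", panicked, msg)
}
```

The same helper can be called from a regular unit test so that a fixed crasher stays covered as a regression test.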
Modules support
go-fuzz has preliminary support for fuzzing Go Modules.
go-fuzz respects the standard GO111MODULE environment variable, which can be set to on, off, or auto.
go-fuzz-build will add a require for github.com/dvyukov/go-fuzz to your go.mod. If desired, you may remove this once the build is complete.
Vendoring with modules is not yet supported. A vendor directory will be ignored, and go-fuzz will report an error if GOFLAGS=-mod=vendor is set.
Note that while modules are used to prepare the build, the final instrumented build is still done in GOPATH mode.
For most modules, this should not matter.
libFuzzer support
go-fuzz-build can also generate an archive file
that can be used with libFuzzer
instead of go-fuzz (requires Linux).
go-fuzz-build builds the program with the gofuzz build tag. This lets you put the
Fuzz function implementation directly into the tested package but exclude it
from normal builds with the // +build gofuzz directive (written as //go:build gofuzz in Go 1.17 and later).
If your inputs contain a checksum, it can make sense to append/update the checksum
in the Fuzz function. The chances that go-fuzz will generate the correct
checksum are very low, so most work will be in vain otherwise.
Go-fuzz can utilize several machines. To do this, start the coordinator process
separately:
The go-fuzz repository history was recently rewritten to exclude the examples directory
in order to reduce total repository size and download time (see
#88,
#114 and
https://github.com/dvyukov/go-fuzz-corpus). Unfortunately, that means that the
go get -u command will fail if you had a previous version installed.
Please remove $GOPATH/src/github.com/dvyukov/go-fuzz before running go get again.
Credits and technical details
Go-fuzz's fuzzing logic is heavily based on american fuzzy lop,
so refer to the AFL readme if you are
interested in technical details. AFL is written and maintained by
Michal Zalewski. Some of the mutations employed
by go-fuzz are inspired by work done by Mateusz Jurczyk, Gynvael Coldwind and
Felix Gröbert.
If you find some bugs with go-fuzz and are comfortable with sharing them, I would like to add them to this list. Please either send a pull request for README.md (preferable) or file an issue. If the source code is closed, you can say just "found N bugs in project X". Thank you.