rohanpadhye / JQF

JQF + Zest: Coverage-guided semantic fuzzing for Java.
BSD 2-Clause "Simplified" License

JQF + Zest: Semantic Fuzzing for Java

JQF is a feedback-directed fuzz testing platform for Java (think: AFL/libFuzzer but for JVM bytecode). JQF uses the abstraction of property-based testing, which makes it easy to write fuzz drivers as parametric JUnit test methods. JQF is built on top of junit-quickcheck and enables running junit-quickcheck style parameterized unit tests with the power of coverage-guided fuzzing algorithms such as Zest.

Zest is an algorithm that biases coverage-guided fuzzing towards producing semantically valid inputs; that is, inputs that satisfy structural and semantic properties while maximizing code coverage. Zest's goal is to find deep semantic bugs that cannot be found by conventional fuzzing tools, which mostly stress error-handling logic only. By default, JQF runs Zest via the simple command: mvn jqf:fuzz.

JQF is a modular framework that supports pluggable fuzzing front-ends called guidances.

JQF has been successful in discovering a number of bugs in widely used open-source software such as OpenJDK, Apache Maven and the Google Closure Compiler.

Zest Research Paper

To reference Zest in your research, please cite our ISSTA'19 paper:

Rohan Padhye, Caroline Lemieux, Koushik Sen, Mike Papadakis, and Yves Le Traon. 2019. Semantic Fuzzing with Zest. In Proceedings of the 28th ACM SIGSOFT International Symposium on Software Testing and Analysis (ISSTA’19), July 15–19, 2019, Beijing, China. ACM, New York, NY, USA, 12 pages. https://doi.org/10.1145/3293882.3330576

JQF Tool Paper

If you are using the JQF framework to build new fuzzers, please cite our ISSTA'19 tool paper as follows:

Rohan Padhye, Caroline Lemieux, and Koushik Sen. 2019. JQF: Coverage-Guided Property-Based Testing in Java. In Proceedings of the 28th ACM SIGSOFT International Symposium on Software Testing and Analysis (ISSTA ’19), July 15–19, 2019, Beijing, China. ACM, New York, NY, USA, 4 pages. https://doi.org/10.1145/3293882.3339002

Overview

What is structure-aware fuzzing?

Binary fuzzing tools like AFL and libFuzzer treat the input as a sequence of bytes. If the test program expects highly structured inputs, such as XML documents or JavaScript programs, then mutating byte-arrays often results in syntactically invalid inputs; the core of the test program remains untested.
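The effect described above can be observed with a small experiment. The following is a minimal sketch, not part of JQF: the class name ByteMutationDemo and its methods are made up for illustration, and it uses only the JDK's built-in SAX parser. It flips one bit at a time in a tiny XML document and counts how many mutants are still well-formed.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

/** Hypothetical demo class; not part of JQF. */
class ByteMutationDemo {

    /** Returns true if the bytes form a well-formed XML document. */
    static boolean parses(byte[] doc) {
        try {
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new ByteArrayInputStream(doc), new DefaultHandler());
            return true;
        } catch (SAXException | IOException | ParserConfigurationException e) {
            return false; // malformed markup or malformed byte sequence
        }
    }

    /** Flips the lowest bit of each byte in turn and counts the mutants that still parse. */
    static int countParsableMutants(byte[] doc) {
        int ok = 0;
        for (int i = 0; i < doc.length; i++) {
            byte[] mutant = doc.clone();
            mutant[i] ^= 0x01;
            if (parses(mutant)) {
                ok++;
            }
        }
        return ok;
    }

    public static void main(String[] args) {
        byte[] doc = "<a><b>hi</b></a>".getBytes(StandardCharsets.UTF_8);
        System.out.println("original parses: " + parses(doc));
        System.out.println(countParsableMutants(doc) + " of " + doc.length
                + " single-bit mutants still parse");
    }
}
```

Mutants that damage a tag character are rejected by the parser before any application logic runs, which is the problem structure-aware fuzzing tries to avoid.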

Structure-aware fuzzing tools leverage domain-specific knowledge of the input format to produce inputs that are syntactically valid by construction. There are some nice articles on structure-aware fuzzing of C++ and Rust programs using libFuzzer.

What is generator-based fuzzing (QuickCheck)?

Structure-aware fuzzing tools need a way to understand the input structure. Some other tools use declarative specifications of the input format such as context-free grammars or protocol buffers. JQF uses QuickCheck's imperative approach for specifying the space of inputs: arbitrary generator programs whose job is to generate a single random input.

A Generator<T> provides a method for producing random instances of type T. For example, a generator for type Calendar returns randomly-generated Calendar objects. One can easily write generators for more complex types, such as XML documents, JavaScript programs, JVM class files, SQL queries, HTTP requests, and many more -- this is generator-based fuzzing. However, simply sampling random inputs of type T is not usually very effective, since the generator does not know if the inputs that it produces are any good.
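To make the idea concrete, here is a minimal, self-contained sketch of a generator for Calendar. The SimpleGenerator interface below is a hypothetical stand-in defined only for this example; it is not the junit-quickcheck Generator API, which has a different signature. The fixed year range and UTC time zone are also assumptions chosen to keep the sketch simple.

```java
import java.util.Calendar;
import java.util.GregorianCalendar;
import java.util.Random;
import java.util.TimeZone;

/** Hypothetical stand-in for a QuickCheck-style generator (not the junit-quickcheck interface). */
interface SimpleGenerator<T> {
    T generate(Random random);
}

/** Produces random Calendar objects whose fields are always in range. */
class CalendarGenerator implements SimpleGenerator<Calendar> {
    @Override
    public Calendar generate(Random random) {
        // UTC avoids daylight-saving gaps, so every hour/minute combination exists.
        Calendar cal = new GregorianCalendar(TimeZone.getTimeZone("UTC"));
        cal.clear();
        cal.set(Calendar.YEAR, 1900 + random.nextInt(200)); // 1900..2099 (arbitrary range)
        cal.set(Calendar.MONTH, random.nextInt(12));        // Calendar months are 0-based
        cal.set(Calendar.DAY_OF_MONTH, 1);
        // Ask the calendar how long the chosen month is, so the day is valid by construction.
        int daysInMonth = cal.getActualMaximum(Calendar.DAY_OF_MONTH);
        cal.set(Calendar.DAY_OF_MONTH, 1 + random.nextInt(daysInMonth));
        cal.set(Calendar.HOUR_OF_DAY, random.nextInt(24));
        cal.set(Calendar.MINUTE, random.nextInt(60));
        cal.set(Calendar.SECOND, random.nextInt(60));
        return cal;
    }
}
```

Calling new CalendarGenerator().generate(new Random(42)) yields one Calendar; the same seed always yields the same object, because all randomness comes from the Random argument. That property is what lets a fuzzer steer a generator by controlling its source of randomness.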

What is semantic fuzzing (Zest)?

JQF supports the Zest algorithm, which uses code-coverage and input-validity feedback to bias a QuickCheck-style generator towards generating structured inputs that can reveal deep semantic bugs. JQF extracts code coverage using bytecode instrumentation, and input validity using JUnit's Assume API. An input is valid if no assumptions are violated.
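The feedback loop can be sketched in a few lines. This is a toy illustration under stated assumptions, not JQF's implementation: every name below (ToyZestLoop, generate, runTest, and so on) is hypothetical, coverage is recorded by hand instead of through bytecode instrumentation, and the mutation and scheduling strategy is far simpler than what Zest uses. It only shows the core idea: mutate the generator's parameters, and save a mutant if it adds coverage or adds coverage among valid inputs.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Random;
import java.util.Set;

/** Toy coverage- and validity-guided loop; all names are hypothetical. */
class ToyZestLoop {

    /** Outcome of one run: branch labels that executed and whether the assumption held. */
    static final class Result {
        final Set<String> coverage = new HashSet<>();
        boolean valid;
    }

    /** Generator: decodes a 5-byte parameter sequence into a list of 1 to 4 small ints. */
    static List<Integer> generate(byte[] params) {
        int length = 1 + (params[0] & 0x03);
        List<Integer> xs = new ArrayList<>();
        for (int i = 0; i < length; i++) {
            xs.add(params[1 + i] & 0x0F); // each element is in 0..15
        }
        return xs;
    }

    /** Stand-in for an instrumented test: an unsorted list plays the role of a failed assumption. */
    static Result runTest(List<Integer> xs) {
        Result r = new Result();
        r.coverage.add("entry");
        for (int i = 1; i < xs.size(); i++) {
            if (xs.get(i - 1) > xs.get(i)) {
                r.coverage.add("unsorted");
                r.valid = false;
                return r;
            }
        }
        r.valid = true;
        r.coverage.add("sorted:len=" + xs.size());
        if (xs.size() > 1 && xs.get(0).equals(xs.get(xs.size() - 1))) {
            r.coverage.add("all-equal");
        }
        return r;
    }

    /** Mutates a copy of the parent by overwriting one randomly chosen parameter byte. */
    static byte[] mutate(byte[] parent, Random random) {
        byte[] child = parent.clone();
        child[random.nextInt(child.length)] = (byte) random.nextInt(256);
        return child;
    }

    final List<byte[]> saved = new ArrayList<>();
    final Set<String> totalCoverage = new HashSet<>();
    final Set<String> validCoverage = new HashSet<>();

    void fuzz(int trials, long seed) {
        Random random = new Random(seed);
        byte[] initial = new byte[5];
        random.nextBytes(initial);
        saved.add(initial);
        for (int t = 0; t < trials; t++) {
            byte[] child = mutate(saved.get(random.nextInt(saved.size())), random);
            Result r = runTest(generate(child));
            // addAll returns true only if the set grew, i.e. something new was covered.
            boolean newCoverage = totalCoverage.addAll(r.coverage);
            boolean newValidCoverage = r.valid && validCoverage.addAll(r.coverage);
            if (newCoverage || newValidCoverage) {
                saved.add(child); // keep this parameter sequence as a future parent
            }
        }
    }

    public static void main(String[] args) {
        ToyZestLoop loop = new ToyZestLoop();
        loop.fuzz(2000, 1L);
        System.out.println("saved inputs: " + loop.saved.size());
        System.out.println("total coverage: " + loop.totalCoverage);
        System.out.println("valid coverage: " + loop.validCoverage);
    }
}
```

Because mutation happens on the parameter bytes rather than on the generated list, every mutant still decodes to a well-formed list; the validity feedback then favors parameter sequences that also satisfy the test's assumption.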

Example

Here is a junit-quickcheck test that checks a property of the PatriciaTrie class from Apache Commons Collections. The property states that if a PatriciaTrie is initialized with an input JDK Map, and the input map already contains a key, then that key should also exist in the newly constructed PatriciaTrie.

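import static org.junit.Assert.assertTrue;
import static org.junit.Assume.assumeTrue;

import java.util.Map;

import edu.berkeley.cs.jqf.fuzz.Fuzz;
import edu.berkeley.cs.jqf.fuzz.JQF;
import org.apache.commons.collections4.Trie;
import org.apache.commons.collections4.trie.PatriciaTrie;
import org.junit.runner.RunWith;

@RunWith(JQF.class)
public class PatriciaTrieTest {

    @Fuzz  /* The args to this method will be generated automatically by JQF */
    public void testMap2Trie(Map<String, Integer> map, String key) {
        // Key should exist in map
        assumeTrue(map.containsKey(key));   // the test is invalid if this predicate is not true

        // Create new trie with input `map`
        Trie<String, Integer> trie = new PatriciaTrie<>(map);

        // The key should exist in the trie as well
        assertTrue(trie.containsKey(key));  // fails when map = {"x": 1, "x\0": 2} and key = "x"
    }
}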

Running mvn jqf:fuzz causes JQF to invoke the testMap2Trie() method repeatedly with automatically generated values for map and key. After about 5 seconds on average (~5,000 inputs), JQF will report an assertion violation. It finds a bug in the implementation of PatriciaTrie that is unresolved as of v4.4. Random sampling of map and key values is unlikely to find the failing test case, which is a very special corner case (see the comment next to the assertion in the code above). JQF finds this violation easily using a coverage-guided algorithm called Zest. To run this example as a standalone Maven project, check out the jqf-zest-example repository.

In the above example, the generators for Map and String were synthesized automatically by junit-quickcheck. It is also possible to specify generators for structured inputs manually. See the tutorials below.

Documentation

Tutorials

Continuous Fuzzing

GitLab supports running JQF in CI/CD (tutorial), though they have recently rolled out their own custom Java fuzzer for this purpose.

Research and Tools based on JQF

🍝 = Involves at least one of the original JQF authors.

Contact the developers

If you've found a bug in JQF or are having trouble getting JQF to work, please open an issue on the issue tracker. You can also use this platform to post feature requests.

If it's some sort of fuzzing emergency, you can always send an email to the main developer: Rohan Padhye.

Trophies

If you find bugs with JQF and you are comfortable with sharing, we would be happy to add them to this list. Please send a PR for README.md with a link to the bug/CVE you found.