rohanpadhye / JQF

JQF + Zest: Coverage-guided semantic fuzzing for Java.
BSD 2-Clause "Simplified" License

Tutorial without generator (Zest for the masses 1) #61

Closed floyd-fuh closed 4 years ago

floyd-fuh commented 5 years ago

I think for JQF to be successful, it needs to reach as many developers as possible. Using JUnit is probably a very good starting point. However, I think JQF should make it as easy as possible to create a fuzzing run. There are several aspects to this, so I am opening a new issue for each of them.

First of all, I think Zest needs a tutorial that does not require writing a generator and just uses the built-in generators. Many developers probably only have trivial types in the things they want to fuzz (such as parsers that take an InputStream or a String).

This would turn the current 101 tutorial into 102 and make this the new 101. It works exactly like the current 101 tutorial, but does not require writing a generator class. When run with plain JUnit it prints "Fuzzing stopped due to guidance exception: Assumption is too strong; too many inputs discarded", while Zest will eventually find a failure: java.lang.AssertionError: yZiou"8no should be a YES

This tutorial introduces a trivial bug in a program and shows how to find it with Zest. We require a method that always returns true if a String starts with "y". The method contains an artificial bug, however: if the String starts with "y" but also contains the characters "no", it returns false.

public class TrivialLogic {
    public static boolean isYes(String in) {
        if (in.contains("no"))
            return false;
        else if (in.startsWith("y"))
            return true;
        else
            return false;
    }
}

import static org.junit.Assert.*;
import static org.junit.Assume.*;

public class TrivialTest {

    public void testTrivial(String in) {
        assumeFalse(in.contains("may")); // special case we don't want our logic to test
        assumeTrue(in.startsWith("y"));
        assertTrue(in + " should be a YES", TrivialLogic.isYes(in));
    }
}

import static org.junit.Assert.*;
import static org.junit.Assume.*;
import org.junit.runner.RunWith;
import com.pholser.junit.quickcheck.*;
import com.pholser.junit.quickcheck.generator.*;
import edu.berkeley.cs.jqf.fuzz.*;

@RunWith(JQF.class)
public class TrivialTest {
    @Fuzz
    public void testTrivial(String in) {
        assumeFalse(in.contains("may"));
        assumeTrue(in.startsWith("y"));
        assertTrue(in + " should be a YES", TrivialLogic.isYes(in));
    }
}
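
Before fuzzing, the example can be sanity-checked by hand. The following is a minimal stand-alone sketch, not part of JQF: the class name TrivialLogicCheck and the helper satisfiesAssumptions are made up for illustration. It copies the isYes logic and the two assumptions from the test, then evaluates them on a few inputs, including the failing input quoted above.

```java
// Stand-alone check of the tutorial logic; needs neither JQF nor JUnit.
// TrivialLogicCheck and satisfiesAssumptions are hypothetical names.
class TrivialLogicCheck {
    // Same logic as TrivialLogic.isYes, including the artificial bug.
    static boolean isYes(String in) {
        if (in.contains("no")) return false;
        return in.startsWith("y");
    }

    // Mirrors the two assume* calls in testTrivial.
    static boolean satisfiesAssumptions(String in) {
        return !in.contains("may") && in.startsWith("y");
    }

    public static void main(String[] args) {
        String[] inputs = {"yes", "maybe", "yZiou\"8no"};
        for (String in : inputs) {
            System.out.println(in + " -> valid=" + satisfiesAssumptions(in)
                    + ", isYes=" + isYes(in));
        }
    }
}
```

An input that satisfies the assumptions but makes isYes return false is exactly what the assertTrue in the test reports as a failure.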
yevgenypats commented 5 years ago

Hi @floyd-fuh! Would this kind of tutorial work? https://github.com/fuzzitdev/example-java (I mean only the Zest part, without the Fuzzit part)

yevgenypats commented 5 years ago

Regarding broader adoption: I also wrote zest-cli, for which more documentation would probably be helpful. Currently its documentation is only available in the Fuzzit example, as it is experimental.

My plan for both broader adoption for JQF as well as Fuzzit (Continuous JQF) was to help OSS projects integrate JQF and fuzzit just like I did here for some of the projects. This also helps understand how easy it is to integrate JQF as well as to fix more bugs and spread JQF with "real-world" examples.

@floyd-fuh Currently I'm a bit swamped, but I was planning to get to this in a few weeks. Anyway, we had a successful reward program for Go/Rust and I was planning to do something similar for Java, so if you are available we can discuss. Feel free to ping me at yp [at] fuzzit.dev

floyd-fuh commented 5 years ago

> Hi @floyd-fuh! Would this kind of tutorial work? https://github.com/fuzzitdev/example-java (I mean only the Zest part, without the Fuzzit part)

Well, that tutorial is OK, but it is not as good as the one suggested above, as your tutorial could also be solved easily by using Quickcheck, whereas the above will need Zest because a lot of inputs are rejected.
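
To give a feel for how many inputs get rejected, here is a rough simulation sketch. It assumes uniformly random printable-ASCII strings of length 0 to 10, which is NOT how junit-quickcheck actually generates strings; the class name DiscardRateSketch, the seed, and the trial count are arbitrary choices for illustration. It counts how many random inputs pass both assumptions and how many of those would also trip the assertion.

```java
import java.util.Random;

// Rough simulation of blind random generation against the tutorial's assumptions.
// Assumption: uniform printable ASCII, not junit-quickcheck's real String generator.
class DiscardRateSketch {
    // Generates a random printable-ASCII string of length 0..maxLen.
    static String randomString(Random rnd, int maxLen) {
        int len = rnd.nextInt(maxLen + 1);
        StringBuilder sb = new StringBuilder(len);
        for (int i = 0; i < len; i++) {
            sb.append((char) (' ' + rnd.nextInt(95))); // ' ' (32) through '~' (126)
        }
        return sb.toString();
    }

    // Returns {valid, failing} counts over the given number of random inputs.
    static int[] run(long seed, int trials) {
        Random rnd = new Random(seed);
        int valid = 0;
        int failing = 0;
        for (int i = 0; i < trials; i++) {
            String in = randomString(rnd, 10);
            if (in.contains("may") || !in.startsWith("y")) {
                continue; // would be discarded by the assume* calls
            }
            valid++;
            if (in.contains("no")) {
                failing++; // would trip the assertTrue
            }
        }
        return new int[] {valid, failing};
    }

    public static void main(String[] args) {
        int trials = 100_000;
        int[] r = run(42L, trials);
        System.out.println("valid=" + r[0] + " failing=" + r[1] + " of " + trials);
    }
}
```

Under these assumptions only about one in a hundred inputs starts with "y", and a valid input that also contains "no" is far rarer still, which is the situation where feedback-guided generation should help most.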

floyd-fuh commented 5 years ago

> My plan for both broader adoption for JQF as well as Fuzzit (Continuous JQF) was to help OSS projects integrate JQF and fuzzit just like I did here for some of the projects. This also helps understand how easy it is to integrate JQF as well as to fix more bugs and spread JQF with "real-world" examples.

> @floyd-fuh Currently I'm a bit swamped, but I was planning to get to this in a few weeks. Anyway, we had a successful reward program for Go/Rust and I was planning to do something similar for Java, so if you are available we can discuss. Feel free to ping me at yp [at] fuzzit.dev

I don't know exactly what fuzzit.dev is, but from here it looks like a for-profit organization.

yevgenypats commented 5 years ago

@floyd-fuh Yes, it is totally for-profit (it's free for OSS). We help integrate coverage-guided fuzzers into CI/CD workflows (i.e., we run them asynchronously and take care of the corpus, crashes, etc.). Anyway, this is why I was asking whether you are available for a paid freelance project.

rohanpadhye commented 5 years ago

@floyd-fuh I agree that a tutorial without generators would be useful. But I am also not a fan of string and character based tutorials, since they reduce to something that is usually solvable by AFL, etc. The advantage of using junit-quickcheck over simply fuzzing main() is that we can generate complex input types, such as data structures -- things that are awkward to fuzz via AFL unless you find a good (de-)serialization for them.

I have a pretty decent example in the JQF tool paper which does not require using custom generators, though it requires a dependency on Apache Commons:

https://github.com/rohanpadhye/jqf/blob/d19e89e8b2e5062f7ec414bac8a182ff2b9ecc2d/examples/src/test/java/edu/berkeley/cs/jqf/examples/commons/PatriciaTrieTest.java#L64-L71

Zest generates an input that satisfies the assume but fails the assert in about 5-10 seconds on my laptop.

This is currently an open bug in Apache Commons that was found using Zest: COLLECTIONS-714

Plain quickcheck cannot find this even after several hours / millions of inputs. Of course, it does complain that the "assumption is too strong", which is sort of the point. You can also relax the assumption if you want quickcheck to say "Passed" and want Zest to say "Failed".

If you want to avoid the dependency on Apache Commons, I'm sure it will be possible to craft an example based on Maps/Sets/Trees that requires only in-app logic.

floyd-fuh commented 5 years ago

I see your point. Yes, let's create something simpler without dependencies, but more complex than strings/chars. I also think that at the beginning it would be fine to just say there is an obvious bug in a function that takes, for example, an ArrayList. Although that would be fuzzable with JQF-AFL, it would need additional code to turn an InputStream into an ArrayList of strings (with some kind of syntax we would need to make up). So it is already clear that JQF-Zest is the preferred way.
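
To illustrate the extra glue code meant here, a minimal sketch of such a conversion follows. The "syntax" is an assumption invented for this example (one list element per line, UTF-8 encoded), and the class name StreamToListSketch is hypothetical; none of it comes from JQF.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;

// Hypothetical glue code a byte-level fuzzer would need to reach a list-taking method.
class StreamToListSketch {
    // Made-up syntax (an assumption): one list element per line, UTF-8 encoded.
    static ArrayList<String> parse(InputStream in) throws IOException {
        ArrayList<String> out = new ArrayList<>();
        BufferedReader reader =
                new BufferedReader(new InputStreamReader(in, StandardCharsets.UTF_8));
        String line;
        while ((line = reader.readLine()) != null) {
            out.add(line);
        }
        return out;
    }

    public static void main(String[] args) throws IOException {
        byte[] raw = "yes\nno\nmaybe\n".getBytes(StandardCharsets.UTF_8);
        System.out.println(parse(new ByteArrayInputStream(raw)));
    }
}
```

Even this tiny format has a drawback: elements containing a line break cannot be represented, so the fuzzer can never produce them. A typed ArrayList parameter on the test method avoids having to design and maintain such an encoding.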

rohanpadhye commented 5 years ago

Sure, an example without generators and without dependencies seems useful. I would say that it should have at least one interesting assume(), as the validity feedback is really what sets Zest apart from AFL's algorithm.

floyd-fuh commented 5 years ago

Ok, so I thought about it for a while today and tried different things, but I didn't come up with anything really useful. I think a test like the following would be a really good example (it is still missing exception handling), but it doesn't compile anyway:

import static org.junit.Assert.*;
import static org.junit.Assume.*;
import org.junit.runner.RunWith;
import com.pholser.junit.quickcheck.*;
import com.pholser.junit.quickcheck.generator.*;
import edu.berkeley.cs.jqf.fuzz.*;
import java.lang.ClassLoader;

@RunWith(JQF.class)
public class TrivialTest {
    @Fuzz
    public void testTrivial(String name, byte[] b, int off, int len) {
        // For reasons for the assumes see https://docs.oracle.com/javase/8/docs/api/java/lang/ClassLoader.html#defineClass-java.lang.String-byte:A-int-int-
        assumeFalse(name.startsWith("java."));
        assumeFalse(off < 0);
        assumeFalse(len < 0);
        assumeTrue(off+len <= b.length);
        (new ClassLoader()).defineClass(name, b, off, len);
    }
}

It doesn't compile on Java 11, but I think it would have been interesting to see what happens. On the other hand, having no input seeds and no generator for byte[] b would probably have led nowhere...
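
The likely reason for the compile error is that java.lang.ClassLoader is an abstract class and defineClass is a protected final method, so neither new ClassLoader() nor an outside call to defineClass is allowed. A minimal sketch of one way around this, assuming a small subclass is acceptable (ExposingClassLoader and define are made-up names):

```java
// Hypothetical helper: a subclass can call the protected defineClass
// and expose it through a public method.
class ExposingClassLoader extends ClassLoader {
    public Class<?> define(String name, byte[] b, int off, int len) {
        return defineClass(name, b, off, len);
    }

    public static void main(String[] args) {
        byte[] junk = {1, 2, 3, 4}; // not a valid class file
        try {
            new ExposingClassLoader().define("Junk", junk, 0, junk.length);
        } catch (ClassFormatError e) {
            System.out.println("rejected: " + e);
        }
    }
}
```

A fuzz test could then call new ExposingClassLoader().define(name, b, off, len) in place of the last line of the test above. It would most likely need to treat ClassFormatError as an expected outcome, since arbitrary byte arrays are almost never valid class files.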

So I think maybe a dependency is not too bad if we use maven. It's only a couple of lines in the pom.xml then. Also I think maybe having an example with multiple String/byte[] as arguments is also good, because that is not easily fuzzed with JQF-AFL...

I tried to think of something that would also have a higher impact (potentially finding a security issue), but too often I ended up with more complicated examples such as differential fuzzing for cryptography bugs. So if you don't have a better idea, I would say let's go with the PatriciaTrieTest.java.

rohanpadhye commented 5 years ago

I also prefer the PatriciaTrie example myself. I find data structures such as collections easier to reason about than Java internals such as class objects, strings (which have 2-byte characters), image/IO streams, etc.

I would guess that things like maps, sets, lists, are commonly understood even by developers who are not deeply familiar with the quirks of Java.

vincent-tian commented 5 years ago

Hi, I'm a developer. JQF is very beautiful, but there are two points I can't understand: how does ZestGuidance collect coverage, and does there have to be a file filled with seeds as the "seedsinputfile"?