Support coverage-guided generation

vlsi commented 4 years ago

Hi,

There's https://github.com/rohanpadhye/jqf by @rohanpadhye The idea here is that a randomized-input generation can be guided based on the feedback from the test execution.

For instance, guidance can observe code coverage and it could attempt to produce inputs that explore uncovered branches. Another example might be to measure the performance (e.g. performance counters or just heap allocation) and attempt to produce an input that triggers performance issues.

Relevant information

https://www.fuzzingbook.org/html/MutationFuzzer.html#Guiding-by-Coverage (and other parts of the book)

Suggested Solution

If we take Zest (which is coverage-guided fuzzer), then ZestGuidance provides an input stream that can be considered as the source of randomness: https://github.com/rohanpadhye/jqf/blob/48d3b663ad68a7b615c2a8b9716da0ca8b6ef4e6/fuzz/src/main/java/edu/berkeley/cs/jqf/fuzz/junit/quickcheck/FuzzStatement.java#L136

As far as I can see, the current RandomizedShrinkablesGenerator never exposes Random instance, so there's no way to take a random stream input from the guidance and pass it to RandomizedShrinkablesGenerator (see https://github.com/jlink/jqwik/blob/2517d6d8cd612137fcff730a9114169260fad4bf/engine/src/main/java/net/jqwik/engine/execution/CheckedProperty.java#L155 )

As far as I understand, jqwik does not provide a way to plug guided fuzzing: https://github.com/jlink/jqwik/blob/5632ef8ca3d51ff083380257fb0b2b9bd7383920/engine/src/main/java/net/jqwik/engine/properties/GenericProperty.java#L38

So I would like to hear your opinion on the way to integrate guided fuzzing. It looks like it would be nice to be able to use pluggable guidances, so what do you think if the random in question was taken from a Guidance instance and the guidance get notified on the test outcomes?

PS. I wonder if jqwilk engine can be integrated with JUnit5's default one.

I know JUnit5's default engine does not suit very well for property-based tests, however, can you please clarify what are the major issues that prevent the use of JUnit5 engine for jqwik?

For instance, JUnit5 (e.g. TestTemplate) keeps all the test outcomes (which results in OOM), however, in property-based tests we don't need that, and we need to just count the number of passed tests. On the other hand, it looks like that can be improved in JUnit (e.g. by adding @AggregateSuccessfulExecutionsAsCount or something like that).

I just thought that the default JUnit5 engine would make the tests easier to write as the documentation would be consolidated. For instance, there's net.jqwik.api.Disabled, and there's org.junit.jupiter.api.Disabled which are more or less the same thing. It is unfortunate that regular annotations do not work with jqwik. Of course, it would be illegal to mix @jupiter.api.Test with @Property, however, that looks like a corner case that can be validated.

jlink commented 4 years ago

@vlsi Thanks for the suggestion. Guided generation looks interesting. Can you give a concrete example how you imagine the integration of ZEST with jqwik could look like for the user?

As for the integration, it would most probably require a new kind of generation mode and a way to inject the random stream and the guidance. My intuition says it should be possible but it will require deeper understanding on my side how Zest and JQF work. If you can convince me with good examples from user perspective I'm absolutely willing to tackle it :-)

vlsi commented 4 years ago

Can you give a concrete example how you imagine the integration of ZEST with jqwik could look like for the user?

Let us take Apache Calcite as an example. It is SQL optimization and execution engine.

SQL has expressions (e.g. foo is null and foo >0), and, as you might guess, Calcite uses those expressions all over the place.

Of course, there's an implementation of the expression optimizer. Here's a test for that: https://github.com/apache/calcite/blob/3acb30875525e029be96726a357ee9950cce3310/core/src/test/java/org/apache/calcite/rex/RexProgramTest.java#L1734-L1738 It verifies that NOT COALESCE(null, true) is simplified to false.

So the optimizer has properties to be verified. For instance, the optimizer must not throw exceptions while doing its optimization.

The current jqwik / junit-quickcheck approach would be to generate expression trees at random, and pass them to the optimizer. However, ZEST allows to pick input tress better, so it skews input generation in such a way, so it increases the probability to trigger a bug.

This case for the end user could look as follows:

public class RexFuzzerTest {
  private RexSimplify rexOptimizer;

  public JqfFuzzerTest() {
    rexOptimizer = null; // setup the optimizer appropriately
  }

  @Property
  // or @Property(coverageGuidance=true)
  // or @Property @Guidance(type=ZEST) @ZestParams(include="org.apache.calcite.*")
  public void hello(@ForAll RexNode value) {
    rexOptimizer.simplify(value); // we check that optimizer does not throw exceptions 
  }
}

Note: I am not sure regarding the way guidance should be configured. At the end of the day, I don't think you can (easily) add a javaagent via @Annotation :)

The user would have to enable JQF's bytecode instrumentation somehow. It is available as a custom classloader or as a javaagent. It would be fine to leave javaagent configuration to the build systems. I think it would be more-or-less manageable.

Note: currently JQF is integrated with junit-quickcheck, and JQF itself has only one annotation property

Does that answer your question?

In other words, from an end-user perspective, the very basic case is the same as with simple jqwik.

What ZEST does, it analyzes the execution outcomes, and it provides you with a random bytes, so the next randomized inputs would explore uncovered branches better.

If you ask for more advanced cases then it would be nice to be able to see extra statistics from ZEST itself. For instance, it would be nice to see coverage% in the real-time, so the user can see if the fuzzer is healthy or not.

it will require deeper understanding on my side how Zest and JQF work

You might want to check Guidance interface. It is quite small.

https://github.com/rohanpadhye/jqf/blob/06376192315a80309d6bd298b7517ef50f5bdb16/fuzz/src/main/java/edu/berkeley/cs/jqf/fuzz/guidance/Guidance.java#L53-L72

jlink commented 4 years ago

@vlsi As for your second question:

I wonder if jqwilk engine can be integrated with JUnit5's default one.

I'm probably never going to do that. Here's my argument:

Jupiter, the default engine, is a jack-of-all-trades, which in turn means that it's not well suited for special needs. Jupiter exentsion points have been designed to be composable, that's e.g. the reason why JUnit4's model of wrapping the execution of tests and suites was given up and replaced by before and after hooks. Nevertheless, generic composability is difficult to get right, and it requires to stick with some limitations. Jupiter's lifecycle is not really well suited for running an indeterminate number of tests scheduled by the feedback coming from previous test runs. Dynamic test generation allows that to some degree but then you will miss out on shrinking, statistics reporting and a few other subtleties that make developer experience of jqwik the way it is now - and as I like it.

That said, it would probably be possible to come up with a jqwik extension for Jupiter that could inject parameters, run the individual tries as dynamic tests and even do standard shrinking. But it would

Not offer the same developer experience as for the lifecycle (rerun with previous failure, run failing props first, switch platform reporting on/off, etc.)
Be (much) harder to integrate in Jupiter's lifecycle and make it usable/composable with most other extensions. Just think of the effect that other injected parameters can have on the predictability for shrinking results. And if you take composability away then a Jupiter integration has IMO no additional benefit.

Junit platform test engines - as jqwik is an example - have been introduced to fill the hole that Jupiter extensions leave empty: Different syntax to specify tests, different test lifecycle, difficulties with composability and side effects. BTW, there's a reason that junit-quickcheck has its own JUnit4 runner and is not just a set of JUnit 4 rules: Runners in JUnit 4 are (to some degree) what engines are in JUnit 5; they are not composable.

As you mention with @Disabled the engine approach has one major drawback: some functionality that could be considered cross-engine must be implemented multiple times and might function inconsistently/differently from engine to engine. Since I am still - despite my exit from the JUnit 5 team - in good contact with those folks we've always been looking for opportunities to extract common functionality from individual engines into the platform. Sometimes we find a reasonable solution (e.g. testing kit, annotation discovery), sometimes we don't.

jlink commented 4 years ago

The user would have to enable JQF's bytecode instrumentation somehow. It is available as a custom classloader or as a javaagent. It would be fine to leave javaagent configuration to the build systems. I think it would be more-or-less manageable.

javaagent configuration is mostly done by the IDE or the build system, so this should be doable.

I still don't get how providing random bytes can direct the generation of certain values since the values generated from certain random bytes cannot really be predicted without knowing the implementation of all generators involved - and that can be many. What do I miss or fail to understand?

vlsi commented 4 years ago

I still don't get how providing random bytes can direct the generation of certain values since the values generated from certain random bytes cannot really be predicted without knowing the implementation of all generators involved - and that can be many

I guess it is described in zest paper (see 3.1 Parametric Generators)

direct

AFAIU, the idea is pretty much the same as the mutation and selection in a typical genetic optimizer directs the population to the desired goal. In the same way, Zest mutates the input sequence, and it measures the resulting coverage.

Genetic optimizers do not need to know the implementation of the underlying systems.

cannot really be predicted without knowing the implementation of all generators involved

It does not need to predict the output. For instance, it might learn that mutating bits 0..32 is very bad for coverage (==the first 32bits are magic sequence), so it might focus on mutating other bits. It does not require to know the implementation of the generators as long as they are predictable given the source of randomness.

luvarqpp commented 4 years ago

I would like to see annotation in line with existing approach. For example:

@Property(generation = GenerationMode.ZEST_DRIVEN, shrinking = ShrinkingMode.FULL)
@Report(Reporting.COVERAGE)
void arbitraryUserIsAlwaysValidForSerDes(@ForAll("arbitraryUser") UserForTests arbitraryUser) {
}

PS: I like proposed idea. Does it mean that tuning properties to coverage various cases would not be needed generally? (genetics programming can be good in many cases)

vlsi commented 4 years ago

Does it mean that tuning properties to coverage various cases would not be needed generally?

Frankly speaking, I have no idea at this point. In general, it looks to be useful even without tuning knobs. ZestGuidance itself has multiple hard-coded parameters that might need to be configurable in the future. I want to try the thing with real code, and then we could see what needs to be tuned.

The use cases I have in mind are Apache Calcite (RexNode fuzzing or even full SQL fuzzing), and PostgreSQL JDBC driver (e.g. JDBC API fuzzing).

PS. It would probably be fun to have Arbitrary implementation based on JavaCC grammars.

luvarqpp commented 4 years ago

Does it mean that tuning properties to coverage various cases would not be needed generally?

Frankly speaking, I have no idea at this point. In general, it looks to be useful even without tuning knobs. ZestGuidance itself has multiple hard-coded parameters that might need to be configurable in the future. I want to try the thing with real code, and then we could see what needs to be tuned.

Thanks for info. I can use it (when integrated with jqwik) as soon as there is something to test. I have spring-data-rest real world project (in alpha version through) which will reach its customers this year and testing is one of our priority.

The use cases I have in mind are Apache Calcite (RexNode fuzzing or even full SQL fuzzing), and PostgreSQL JDBC driver (e.g. JDBC API fuzzing).

To find bugs more easily, you can try also "reactive jdbc driver". See its postgresql client code for example: https://github.com/r2dbc/r2dbc-postgresql

vlsi commented 4 years ago

I have spring-data-rest real world project (in alpha version through) which will reach its customers this year and testing is one of our priority.

I'm not sure the current Zest/JQF supports multi-threaded applications :-/

luvarqpp commented 4 years ago

I have spring-data-rest real world project (in alpha version through) which will reach its customers this year and testing is one of our priority.

I'm not sure the current Zest/JQF supports multi-threaded applications :-/

Jop, you are right, through using it when business logic is in main thread only, would be OK. See https://github.com/rohanpadhye/jqf/issues/41. On the other side, this is not my case. In spring-data-rest there is some tomcat in main thread (perhaps), or main thread is my web test client (http client, which is sending http requests from jqwik test).

jlink commented 4 years ago

@vlsi I looked at the examples coming with JQF. They don't seem to define any special case guidance which confuses me. Take this simple example:

@Fuzz
public void insert(int @Size(min=100, max=100)[] elements) {
     BinaryTree b = new BinaryTree();
     for (int e : elements) {
         b.insert(e);
     }
}

What difference in generated data would you expect - if any - compared to

@Property
public void insert(int @Size(min=100, max=100)[] elements) {
     BinaryTree b = new BinaryTree();
     for (int e : elements) {
         b.insert(e);
     }
}

running on Vanilla junit-quickcheck?

vlsi commented 4 years ago

I looked at the examples coming with JQF. They don't seem to define any special case guidance which confuses me

PS. We can have a small conversation (e.g. Zoom, or https://meet.jit.si/) so screen sharing+voice might be more productive.

JQF implements its own JUnit runner: https://github.com/rohanpadhye/jqf/blob/f0979677984bf61e505104fcbda27b442cc561a1/fuzz/src/main/java/edu/berkeley/cs/jqf/fuzz/JQF.java#L50

which uses FuzzStatement for test execution: https://github.com/rohanpadhye/jqf/blob/f0979677984bf61e505104fcbda27b442cc561a1/fuzz/src/main/java/edu/berkeley/cs/jqf/fuzz/JQF.java#L87

FuzzStatement itself asks Guidance for the next input data to try (==next random bytes): https://github.com/rohanpadhye/jqf/blob/48d3b663ad68a7b615c2a8b9716da0ca8b6ef4e6/fuzz/src/main/java/edu/berkeley/cs/jqf/fuzz/junit/quickcheck/FuzzStatement.java#L136-L137

And it calls generators with the given random (it is the place where usual generators are made to be guided ones): https://github.com/rohanpadhye/jqf/blob/48d3b663ad68a7b615c2a8b9716da0ca8b6ef4e6/fuzz/src/main/java/edu/berkeley/cs/jqf/fuzz/junit/quickcheck/FuzzStatement.java#L139-L141

Then it informs Guidance on the actual test outcome (pass/fail/assumption violation/etc): https://github.com/rohanpadhye/jqf/blob/48d3b663ad68a7b615c2a8b9716da0ca8b6ef4e6/fuzz/src/main/java/edu/berkeley/cs/jqf/fuzz/junit/quickcheck/FuzzStatement.java#L194

vlsi commented 4 years ago

Note: ZestGuidance is instantiated and passed via static field in GuidedFuzzing I don't quite like that approach, but it works (tm).

Frankly speaking, the current Zest / JQF code seems to be focused to be used as public static void main application rather than to be used as a library. For instance, ZestGuidance is instantiated in ZestCli

Is it what you are looking for?

jlink commented 4 years ago

PS. We can have a small conversation (e.g. Zoom, or https://meet.jit.si/) so screen sharing+voice might be more productive.

Would be cool. When do you have time this week?

vlsi commented 4 years ago

Would be cool. When do you have time this week?

Feel free to select a suitable timeslot (15-20-30 min?) at https://doodle.com/vlsi

luvarqpp commented 4 years ago

Would be cool. When do you have time this week?

Feel free to select a suitable timeslot (15-20-30 min?) at https://doodle.com/vlsi

Please, post some "key points" as result of talk if possible. It is nice reading and interesting approach getting alive :)

rohanpadhye commented 4 years ago

Hi! I am the main developer of JQF.

@jlink

What difference in generated data would you expect

I added an example in the README to demonstrate a good use case. I can copy that here:

    @Fuzz  /* The args to this method will be generated automatically by JQF */
    public void testMap2Trie(Map<String, Integer> map, String key) {
        // Key should exist in map
        assumeTrue(map.containsKey(key));   // the test is invalid if this predicate is not true

        // Create new trie with input `map`
        Trie trie = new PatriciaTrie(map);

        // The key should exist in the trie as well
        assertTrue(trie.containsKey(key));  // fails when map = {"x": 1, "x\0": 2} and key = "x"
    }

Running only random sampling reveals no assertion violations even after hours of input generation. In fact, most inputs do not even satisfy the assume() statement. With JQF we can find an assertion violation (and thus the bug COLLECTIONS-714) in just a few seconds. JQF uses feedback from the result of assume as well as from the code coverage of classes such asPatriciaTrie.java.

(You might argue that this property test can be written in a better way that might help find the bug via random sampling alone, but this is just an extreme example showing the utility of coverage guidance)

@vlsi

Note: ZestGuidance is instantiated and passed via static field in GuidedFuzzing I don't quite like that approach, but it works (tm).

Just to clarify, this is so that instrumented class files can report coverage events to the unique guidance when executing branches and method calls. This is similar to coverage tools such as JaCoCo, which use static fields to collect code coverage. Having multiple guidances live simultaneously would require multiple versions of an instrumented class. Although it could be possible to bind guidances to class-loader instances, there hasn't been a need for such a setup as of yet.

@vlsi

Frankly speaking, the current Zest / JQF code seems to be focused to be used as public static void main application rather than to be used as a library. For instance, ZestGuidance is instantiated in ZestCli

JQF has a Maven plugin, which lets you run (Zest by default) via mvn jqf:fuzz, though it can also be run via CLI. It may be possible to make guided fuzzing run as part of the standard test suite (e.g. mvn test in Maven), as long as it is possible to specify a classloader that instruments application classes to report coverage information.

vlsi commented 4 years ago

Rohan: Just to clarify, this is so that instrumented class files can report coverage events to the unique guidance when executing branches and method calls. This is similar to coverage tools such as JaCoCo, which use static fields to collect code coverage. Having multiple guidances live simultaneously would require multiple versions of an instrumented class

Thanks, that is useful, however, it looks like you answer to a slightly different question.

I agree you need to talk to the instrumentation engine. And you do that by SingleSnoop.setCallbackGenerator(guidance::generateCallBack) in https://github.com/rohanpadhye/jqf/blob/47d20a9e16decc0800227247f029c0f1696afd02/fuzz/src/main/java/edu/berkeley/cs/jqf/fuzz/junit/GuidedFuzzing.java#L163

That is fine, and you call instrumentation API (which is SingleSnoop) and ask it to provide events.

However, GuidedFuzzing#setGuidance and GuidedFuzzing#getCurrentGuidance are never used in the instrumentation. The guidance is passed as a static field GuidedFuzzing#guidance which makes it harder to understand the way instantiation of zest guidance is connected with its use in the fuzz junit4 runner.

I agree JUnit4 runner does not have room for passing guiding instance, however, I guess it would make code easier to follow if you called GuidedFuzzing#getCurrentGuidance in JQF.java and passed it to new FuzzStatement(...). In other words, FuzzStatement obviously requires knowledge of the coverage information. However, FuzzStatement constructor receives nothing related to coverage. That implies FuzzStatement is doing some magic in the implementation, which makes the thing hard to understand.

Of course, JQF.java and FuzzStatement can't really be reused in jqwik, so the discussion here is for understanding purposes only.

Rohan: JQF has a Maven plugin, which lets you run (Zest by default) via mvn jqf:fuzz, though it can also be run via CLI. It may be possible

@rohanpadhye , What I mean by library is as follows: https://www.programcreek.com/2011/09/what-is-the-difference-between-a-java-library-and-a-framework/ Currently, JQF/Zest is inclined towards the framework rather than a library, which makes it harder to integrate it into other libraries and applications.

The only use of mvn jqf:fuzz for jqwik is it can be used as a source of inspiration. It does not mean jqf-maven-plugin is bad. It means just that zest/jqf as a library use case is not documented/tested.

jlink commented 4 years ago

@rohanpadhye

Hi! I am the main developer of JQF.

Thanks for chiming in. Much appreciated.

The example is a motivating one that I could strive for to get running with jqwik as a first step. When I try to run it with mvn jqf:fuzz but my Maven won't find the jqf plugin. Any hint on that?

As for the instrumenting classloader, would it suffice to load the test container class through this classloader? I'm wondering which kind of lifecycle hook might be needed.

Last question: Would you be willing to move all junit4/junit-quickcheck related code in a separate artefact - or the other way round to create a jqf-core module?

rohanpadhye commented 4 years ago

@vlsi

I guess it would make code easier to follow if you called GuidedFuzzing#getCurrentGuidance in JQF.java and passed it to new FuzzStatement(...).

That's a reasonable suggestion and I'll open an issue to make this change.

Currently, JQF/Zest is inclined towards the framework rather than a library, which makes it harder to integrate it into other libraries and applications.

Yes, that is correct. There wasn't a need for exposing the fuzzing engine as a library as of now. However, if separating components into reusable library-like packages helps projects like jqwik, I am fine with refactoring.

@jlink

When I try to run it with mvn jqf:fuzz but my Maven won't find the jqf plugin. Any hint on that?

I created a standalone example here: https://github.com/rohanpadhye/jqf-zest-example. Let me know if following the README works.

As for the instrumenting classloader, would it suffice to load the test container class through this classloader? I'm wondering which kind of lifecycle hook might be needed.

I am not sure of this. JQF currently uses a separate plugin because of the control of specifying a custom classloader. I am not sure how to change the classloader used by the default test runner.

Would you be willing to move all junit4/junit-quickcheck related code in a separate artefact - or the other way round to create a jqf-core module?

Handling the dependency on junit-quickcheck should be straightforward, since all the dependent classes are already isolated into one directory. Regarding dependency on JUnit4, I will have to investigate further as to whether there is anything outside of this package that depends on JUnit classes. @vlsi already opened an issue in rohanpadhye/jqf#80 for this, which can be used if there is a need for this refactoring.

rohanpadhye commented 4 years ago

I am not sure of this. JQF currently uses a separate plugin because of the control of specifying a custom classloader. I am not sure how to change the classloader used by the default test runner.

To clarify, this is what the JQF Maven plugin currently does:

https://github.com/rohanpadhye/jqf/blob/143e42f3e070492de8ee34f2a94d4fdac0cb2eeb/maven-plugin/src/main/java/edu/berkeley/cs/jqf/plugin/FuzzGoal.java#L250-L254

I am not sure how to get this effect without using a custom plugin (i.e., how to change the classloader used by mvn test).

jlink commented 4 years ago

@rohanpadhye I could get the isolated example to run (many thanks!). It does not find a unique failure, though. Using mvn:repro fails with

Cannot find or open file target/fuzz-results/examples.PatriciaTrieTest/testMap2Trie/failures/id_000000

BTW, the following jqwik property is able to detect the PatriciaTrie bug:

@Property(tries = 10000, maxDiscardRatio = 100)
void testMap2Trie(
        @ForAll Map<@StringLength(min = 1, max = 10) String, Integer> map,
        @ForAll @StringLength(min = 1, max = 10) String key
) {

    // Key should exist in map
    Assume.that(map.containsKey(key));   // the test is invalid if this predicate is not true

    // Create new trie with input `map`
    Trie trie = new PatriciaTrie(map);

    // The key should exist in the trie as well
    assertTrue(trie.containsKey(key));  // fails when map = {"x": 1, "x\0": 2} and key = "x"
}

You have to use the latest snapshot version "1.2.3-SNAPSHOT" though, because I had forgotten to add \u0000 as default edge case for character generation.

I am not sure how to get this effect without using a custom plugin (i.e., how to change the classloader used by mvn test).

This will require some experimentation to find out. In the best case an annotation will be enough to swap out classloaders for parts of the system. In a worse case a javaagent corresponding to your class loader might be required. In the very worst case both a Maven and Gradle plugin will be necessary.

jlink commented 4 years ago

@rohanpadhye Tried the example a few more times. Eventually I got a unique failure.

After some code reading I guess that integrating JQF with jqwik will require an agent similar to what QuickTheories is doing. Here's an excerpt from https://github.com/quicktheories/QuickTheories#coverage-guidance:

There are some disadvantages to coverage guidance. In order to measure coverage QuickTheories must attach an agent to the JVM. The agent will be active from the moment it is installed until the JVM exits - this means it may be active while non QuickTheory tests are running. This will result in an ~10% reduction in performance, and may interfere with JaCoCo and other coverage systems if they are also active.

This probably means that any kind of integration should be in an optional artefact.

vlsi commented 4 years ago

@jlink

This probably means that any kind of integration should be in an optional artefact

There might be other guidances. For instance, one might want to find an input that triggers pathological memory allocation.

Luckily, there's ThreadMXBean.getThreadAllocatedBytes that can provide that information, and it does not really need javaagent or other instrumentation. It is already present in JVM.

So I would expect that jqwik should provide pluggable points for implementation of guided fuzzing, and implementation of zest-jqwik might indeed be located in a different artifact (or even in a different repository).

jlink commented 4 years ago

@vlsi Even then. The dependency on JQF (without Zest) would be too strong IMO to put it into jqwik's core. Alternatively jqwik had to duplicate a lot of what JQF is bringing to the table. What jqwik has to offer in its core is the ability to change the stream of random bytes and to trigger some behaviour after each execution of a property. Given that I'd hope that the rest can be done as a module on top.

vlsi commented 4 years ago

The dependency on JQF (without Zest) would be too strong IMO to put it into jqwik's core

No-one suggested adding a dependency on JQF I guess :)

jlink commented 4 years ago

@vlsi To summarize our talk:

We'll together work out a first attempt at a generic GenerationGuidance interface
I'll implement it straightforwardly in a dedicated branch
We'll try it out and iterate through implementations that do not require instrumentation, e.g.
- Performance guidance: Search for long lasting program runs
- Memory allocation guidance: Search for program runs that consume more memory
- Instrument some code manually to provide coverage data, e.g. FizzBuzz

luvarqpp commented 4 years ago

Sidenote, if program does use intensive logging (at least in business part of code), it can be 1:1 match to instrumentation of branch coverage. I.e. using logger and source code+line from log events can be used in FizzBuzz example and perhaps also in real project with some level of usefulness.

jlink commented 4 years ago

@vlsi

PS. I wonder if jqwilk engine can be integrated with JUnit5's default one.

There is one easier solution to the problem of using jqwik with Jupiter: A functional interface similar to what QuickTheories is doing, e.g.

@Test
void myJupiterTest(Jqwik jqwik) {
    Arbitrary<String> strings = Arbitraries.strings();
    jqwik.property(strings).check(aString -> aString.length() >= 0);
}

If such a thing is interesting to you, please create a new issue. I will consider it for a new major version, i.e. jqwik 2.. }

vlsi commented 4 years ago

If such a thing is interesting to you, please create a new issue.

Oh. I don't think it need that yet. The question was more like it would be great if Jupiter engine could execute property-based tests as well (==so existing Jupiter integration points could be reused). I know the lifecycle is a little bit different, however, I have never really implemented a property-based testing engine, so I don't know what are the quirks there.

It is not like I can't plug a separate test engine. However, having multiple @Disabled items in the autocomplete menu is really annoying.

For instance, one can put a wrong @Disabled annotation, and it won't be considered by the engine.

jlink commented 4 years ago

For instance, one can put a wrong @Disabled annotation, and it won't be considered by the engine.

Jqwik will log a warning if you use Jupiter annotations on a property method.

jlink commented 4 years ago

here is a first draft of a potential interface for guidance implementations:

interface GenerationGuidance {

    /**
     * Returns a reference to an iterator that will deliver
     * return from the pseudo-random number generator.
     */
    Iterator<Long> guidedStream();

    /**
     * Decide if more values can be generated.
     *
     * Method could potentially block to wait for guiding algorithm to finish.
     *
     * If it returns false generation will be finished.
     */
    boolean continueGeneration();

    /**
     * Callback for observing actual generated sample passed to the property method.
     */
    void observeGeneratedSample(List<Object> sample);

    /**
     * Callback for observing the minimal shrunk sample.
     */
    void observeShrunkSample(List<Object> sample);

    /**
     * Handles the result of a property try.
     */
    void handleResult(TryResult result);

}

interface TryResult {
    enum Status {
        SATISFIED,
        FALSIFIED,
        INVALID
    }

    Optional<Throwable> throwable();
}

vlsi commented 4 years ago

    /**
     * Returns a reference to an iterator that will deliver
     * return from the pseudo-random number generator.
     */
    Iterator<Long> guidedStream();

^^ can you please elaborate a bit? I don't quite get what the method should return.

Does continueGeneration() duplicate guidedStream().hasNext()?

jlink commented 4 years ago

Does continueGeneration() duplicate guidedStream().hasNext()?

I don't think so.

guidedStream() should return an iterator that will continually deliver long values to feed sample generation. That might be only 1 (or even none) per try or many - depending on how much generation is being done under the hood. In other words, this iterator replaces the Random object.

continueGeneration() should be called once per try to determine if generation should continue at all.

jlink commented 4 years ago

Maybe both methods could be combined into one, e.g.

Optional<Stream<Long>> nextTry();

which should return Optional.empty() if generation is done.

vlsi commented 4 years ago

In other words, this iterator replaces the Random object

I do not see how Iterator<Long> replaces Random. Regular java.util.Random is int-based (see protected int java.util.Random#next(int bits))

Just in case, I have some progress with automatic fuzzer generation from JavaCC grammar. However, it does not leave much room for Arbitrary as JavaCC's grammar expects to operate with user-provided classes.

jlink commented 4 years ago

I do not see how Iterator<Long> replaces Random. Regular java.util.Random is int-based (see protected int java.util.Random#next(int bits))

Let's make it

Iterator<Integer> guidedStream();

then.

Just in case, I have some progress with automatic fuzzer generation from JavaCC grammar. However, it does not leave much room for Arbitrary as JavaCC's grammar expects to operate with user-provided classes.

Which probably means that there must be another layer of transformation.

vlsi commented 4 years ago

Let's make it Iterator<Integer> guidedStream();

Does that mean jqwik would turn the stream into source of randomness (e.g. into java.util.Random or something like that) somehow behind the scenes?

Which probably means that there must be another layer of transformation.

That is true. Currently it is like new SqlGenerator(new Random()).SelectStatement() => generates and returns a select statement.

jlink commented 4 years ago

Does that mean jqwik would turn the stream into source of randomness (e.g. into java.util.Random or something like that) somehow behind the scenes?

That was the idea. Would free implementations from doing it themselves.

vlsi commented 4 years ago

Would free implementations from doing it themselves

Then it is not clear why do you suggest to confine it to Iterator<Integer>? Who will call and use the result of guidedStream?

jlink commented 4 years ago

Who will call and use the result of guidedStream?

jqwik will call it, make a Random from it and generate samples the same way it does now.

vlsi commented 4 years ago

jqwik will call it, make a Random from it

So the test code would see Random or something like that. That looks good.

vlsi commented 4 years ago

Iterator<Long> guidedStream();

Would sourceOfRandomness() sound better?

boolean continueGeneration();

Would hasNext() be better? continueGeneration() sounds like dear generator, please continue generation for me.

void observeGeneratedSample(List<Object> sample);

Is it useful for the Guidance? The guidance produces Integer values, and I don't see how it could use the generated sample. The same goes for observeShrunkSample.

interface TryResult {

I guess you intended to have Status status() method there.

jlink commented 4 years ago

void observeGeneratedSample(List<Object> sample);

Is it useful for the Guidance? The guidance produces Integer values, and I don't see how it could use the generated sample.

A guidance might want to check if a different sequence leads to a change in generated values at all. A similiar method exists in JQF's Guidance interface so I deemed it useful.

jlink commented 4 years ago

Here's my current take considering all your input:

interface GenerationGuidance {

    /**
     * Returns a reference to an iterator that will deliver
     * integer values to feed the pseudo-random number generator for the next try.
     *
     * @throws IllegalStateException if there is no next try available
     */
    Iterator<Integer> nextTry();

    /**
     * Decide if another sample can be tried.
     *
     * Method could potentially block to wait for guiding algorithm to finish.
     *
     * If it returns false generation will be finished.
     */
    boolean hasNextTry();

    /**
     * Callback for observing actual generated sample passed to the property method.
     */
    void observeGeneratedSample(List<Object> sample);

    /**
     * Handles the result of a property try.
     */
    void handleResult(TryResult result);

}

interface TryResult {
    enum Status {
        SATISFIED,
        FALSIFIED,
        INVALID
    }

    Status status();

    Optional<Throwable> throwable();
}

vlsi commented 4 years ago

A guidance might want to check if a different sequence leads to a change in generated values at all.

Generated values do not always have equals/hashCode, so there's no much guidance can do to compare the values.

JQF itself does not use observeGeneratedArgs method, so I think we don't need it in the first versions.

vlsi commented 4 years ago

https://github.com/jlink/jqwik/issues/84#issuecomment-582278329 looks good, except I would skip observeGeneratedSample.

Iterator<Integer> nextTry();

It might be slightly better to have Stream<Integer> (or even java.util.stream.IntStream) there as Stream#close() would enable implementations to close the relevant resources if any.

I wonder if it makes sense to unify the result type of nextTry with String PropertyCheckResult#randomSeed(). For instance, add a common interface for seed and series of ints. It is probably not that important for guidance, however, the reported seed for guided execution should include the list of ints somehow, and the user should be able to re-execute it later.

I'm not sure, however, void handleResult(TryResult result); might belong to another interface.

In other words, something behind the lines of: A) RandomSource == nextTry + hasNextTry. It produces enough information for jqwik to instantiate a Random (or similar). It looks like interface RandomSource extends Stream<IntStream>. B) AfterTry listener with handleResult. It might be useful even without guided generation. C) GuidedGenerator == RandomSource + AfterTry combined.

Then, if the user wants to reproduce the failure (or hard-code a single test case), they might use a non-guided RandomSource which takes the list of ids from the failure case.

WDYT?

https://github.com/jlink/jqwik/blob/4eca4b4292cd97a4149cf87ce361a59e45319722/engine/src/main/java/net/jqwik/engine/properties/PropertyCheckResult.java#L32

jlink commented 4 years ago

It might be slightly better to have Stream<Integer> (or even java.util.stream.IntStream) there as Stream#close() would enable implementations to close the relevant resources if any.

As far as I know Stream in Java is always lazy so it's not possible (or at least not straightforward) to consume one integer after the other until no longer needed. Closing behaviour - when necessary - can be done when handleResult is being called.

I'm not sure, however, void handleResult(TryResult result); might belong to another interface.

Maybe, maybe not. Too much speculation for the time being. I'd rather start with a proof of concept and then learn from that.

Then, if the user wants to reproduce the failure (or hard-code a single test case), they might use a non-guided RandomSource which takes the list of ids from the failure case.

There's already a different way to reproduce a failing sample so that's covered for the time being.

jlink commented 4 years ago

So here's the current state. I removed the observe... method and renamed TryResult to TryExecutionResult because it fits better to what I'm currently doing in the code base.

interface GenerationGuidance {

    /**
     * Returns a reference to an iterator that will deliver
     * integer values to feed the pseudo-random number generator for the next try.
     *
     * @throws IllegalStateException if there is no next try available
     */
    Iterator<Integer> nextTry();

    /**
     * Decide if another sample can be tried.
     *
     * Method could potentially block to wait for guiding algorithm to finish.
     *
     * If it returns false generation will be finished.
     */
    boolean hasNextTry();

    /**
     * Handles the result of a property try.
     */
    void handleResult(TryExecutionResult result);

}

interface TryExecutionResult {
    enum Status {
        SATISFIED,
        FALSIFIED,
        INVALID
    }

    Status status();

    Optional<Throwable> throwable();
}

jlink commented 4 years ago

I'm wondering if InputStream isn't as bad as I initially thought since it is Closeable. Or maybe introduce a dedicated byte or integer based stream:

interface ByteStream implements Closeable {
   byte next();
   boolean hasNext();
}

jqwik-team / jqwik

Support coverage-guided generation #84

Relevant information

Suggested Solution