elm-community / elm-test

moved to elm-explorations/test
https://github.com/elm-explorations/test
BSD 3-Clause "New" or "Revised" License
340 stars 35 forks source link

Should Fuzz.list make fewer elements? #168

Open rtfeldman opened 7 years ago

rtfeldman commented 7 years ago

Currently Fuzz.list can generate lists with hundreds of elements. Tests which use Fuzz.list passing a fuzzer which also uses Fuzz.list (e.g. because it's a fuzzed record that has a list inside it) can take a lot longer to run, sometimes so long they appear not to terminate. Multiplication! 😅

Is generating long lists the right default?

I can't think of any situations where fuzzing long lists found a bug that fuzzing short lists (0-3 elements) wouldn't have found as well. (The same is also true for huge numbers and huge strings, but those are cheaper to generate.)

If Fuzz.list generated lists between lengths 0 and 3, I think it would have caught as many bugs for me as the status quo list, except it would have taken less time to execute. I'm wondering if that's the right default, and we should have a Fuzz.longList alternative if you know having long lists matters.

mgold commented 7 years ago

I think it should generate 10-20 elements some of the time, but hundreds seems excessive.

rtfeldman commented 7 years ago

The top end matters a lot when you get multiplication involved.

For example, let's say you have a Fuzzer FeedEntry and FeedEntry has a List Comment inside it which you're also fuzzing. Now if you do Fuzzer (List FeedEntry) you're getting feedEntryListLength * averageCommentListLength comments.

In that world, the difference between Fuzz.list generating 0-3 versus 10-20 is the difference between 0-9 comments and 0-400 comments.

Put another way, increasing the upper limit has a magnified effect with compound fuzzers. How sure are we that 10-20 is more beneficial as a default than 0-3?

rtfeldman commented 7 years ago

The counterpoint to this is that maybe you have a list that typically will have a dozen elements in there, and now your only options are (1) only test the 0-3 case, which is atypical, (2) use Fuzz.longList which is absurdly longer than necessary, (3) roll a custom list.

The trouble I'm having is - even if it's typical for the list to have a dozen elements, what are the bugs you find by fuzzing all 12 elements versus just 3? I'm struggling to think of a non-contrived example of where this would find a bug.

jwoudenberg commented 7 years ago

Probably complicated and possibly a bad idea, but what if we give pass along a credit while generating? Say a generator gets to spend a hundred points and each call of a primitive fuzzer (implemented using custom) costs one point. Generating a list now costs points equal to the cost of generating each item in the list. And so simply said we can generate either long lists or deep structures but not both.

One problem with this is that the amount of credits must be dynamic: A more complex structure will require a larger credit or there might not be enough to generate the structure in the first place.

gampleman commented 7 years ago

Pagination only displays if you have 10 comments or more?

Along the notion of credit is a notion of a size parameter that generators take which controls the relative size of their output. Some quickcheck style libraries' generators scale this parameter up as they progress through the test, as to find the simple failures quickly on inputs that are small and cheap to generate and progress to larger and more complex output as the test progresses.

Typically then recursive generators (i.e. generators that call other generators, like a list generator for example) would decrement the size parameter in some way when passing it on to it's child calls. This helps solve the multiplication problem, however it does makes generating small lists with large sublists very unlikely.

zkessin commented 7 years ago

John Hughes and I had a talk about this (In erlang) at a conference a while back and the long list does have a value. Take the case of a bug that only happens if an item happens to be in a list twice. Having a longer list greatly increases the chances that this will happen.

To be fair some of these are contrived. But if you start doing things like fuzzing sequences of events then having a very long list will make it much more likely to turn up a strange bug

zkessin commented 7 years ago

maybe have a version of the list fuzzer that lets you give a length or range. Fuzz.list 10 100 integer

mgold commented 7 years ago

I don't see much value in a minimum length. Fuzz.list {maxLength = 20} integer? The downside of course being that you have to pick a maximum length.

jwoudenberg commented 7 years ago

I just realised another benefit of the credit idea talking with Brian: It would make it very easy to write fuzzers for recursive data structures. Currently you have to actively worry about them going on forever, but with a fuzzer credit we might be able to prevent that from happening in the library.

mgold commented 7 years ago

Interesting. It sounds like credits will be easier to think about when the dust settles from removing opt-in rosetrees (not committing to either of those, just saying it's a possibility). So maybe we should table this for the moment.

zkessin commented 7 years ago

In erlang there is an explicit size parameter used in generators, and it starts small then ramps up, but you can also use it for recursive generators so that you don't get infinite data structures.