quickCheckWithResult with low maxSuccess fails unexpectedly

robx commented 2 years ago

Contrary to expectation, lowering the maxSuccess parameter may cause tests to fail:

> let p k k' = k /= k' ==> (k :: Int) /= k'
> quickCheckWithResult stdArgs p
+++ OK, passed 100 tests; 11 discarded.
> quickCheckWithResult stdArgs{maxSuccess=2} p
+++ OK, passed 2 tests; 11 discarded.
> quickCheckWithResult stdArgs{maxSuccess=1} p
*** Gave up! Passed only 0 tests; 10 discarded tests.

The explanation seems to be a combination of two things:

The discard ratio (maxDiscardRatio) defaults to 10, which means only 10 discarded results are allowed in total for the case maxSuccess=1 (maxSuccess=2 gets 20 discards upfront).
computeSize doesn't ramp up the size quickly enough: it starts out [0,0,0,0,0,0,0,0,0,1,1,1,1,1,1,1,1,1,1,2,2,2... regardless of the maxSuccess parameter.

Those computed sizes in particular seem a bit weird to me, but I can't say I have any kind of understanding of most of what's going on here. It does seem a bit at odds with the comment here https://github.com/nick8325/quickcheck/blob/7ff70fe9da61c95180ec90bd1ff074597efce37d/src/Test/QuickCheck/Test.hs#L228, which would make me expect [0,1,2,....

Rewbert commented 1 year ago

If your test is successful, the size grows as described in the comment you mentioned above, but if your test is discarded, the size is not increased immediately. QC will reuse the same size a number of times (a magic number of 10) before it increases the size. You can see this in the computeSize function: mod n (maxSize a) + div d 10, where n is the number of successful tests and d is the number of recently discarded tests.
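
To make the formula concrete, here is a minimal sketch of it as a standalone function. It is a simplified model of the expression quoted above, not QuickCheck's actual computeSize; the name modelSize is made up for illustration, and maxSize is hard-coded to 100 on the assumption that the stdArgs default applies.

```haskell
-- Simplified model of the quoted expression: mod n (maxSize a) + div d 10.
-- modelSize is a hypothetical name, not part of QuickCheck.
-- n = number of successful tests, d = number of recently discarded tests.
modelSize :: Int -> Int -> Int
modelSize n d = n `mod` maxSizeModel + d `div` 10
  where
    maxSizeModel = 100 -- assumption: the stdArgs default for maxSize

main :: IO ()
main = do
  -- Every test passes: the size grows by one per success.
  print [modelSize n 0 | n <- [0 .. 5]]   -- [0,1,2,3,4,5]
  -- Every test is discarded: the size only grows once per 10 discards.
  print [modelSize 0 d | d <- [0 .. 21]]  -- ten 0s, ten 1s, then 2,2
```

In this model a discard only moves the size after ten of them have accumulated, which is why a run of discards at the very start stays at size 0 for so long.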

With maxSuccess=1 QC currently has no chance of generating any size other than 0: the run gives up after 10 discards, which is before div d 10 in computeSize can reach 1.
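
A rough way to check this is to model a run in which every test case is discarded and list the sizes that get tried before giving up. The sketch below is a toy model, not QuickCheck code. It assumes the run gives up after maxDiscardRatio * maxSuccess discards with the default ratio of 10, and that with zero successes the size is just d `div` 10. The name sizesBeforeGivingUp is hypothetical.

```haskell
-- Toy model: sizes tried when every generated test case is discarded.
-- Assumptions (not taken from QuickCheck's source):
--   * the run gives up after ratio * maxSucc discards, with ratio = 10
--   * with no successes yet, the size is d `div` 10
sizesBeforeGivingUp :: Int -> [Int]
sizesBeforeGivingUp maxSucc = [d `div` 10 | d <- [0 .. budget - 1]]
  where
    ratio  = 10              -- assumed default discard ratio
    budget = ratio * maxSucc -- total discards allowed before giving up

main :: IO ()
main = do
  -- maxSuccess=1: ten attempts, all at size 0.
  print (sizesBeforeGivingUp 1)            -- [0,0,0,0,0,0,0,0,0,0]
  -- maxSuccess=2: the larger budget lets the size reach 1.
  print (maximum (sizesBeforeGivingUp 2))  -- 1
```

Under these assumptions, maxSuccess=1 never leaves size 0. If size 0 only ever generates 0 for Int (which appears to be the case), then k and k' are always equal and every test case is discarded, matching the "Gave up!" result in the original report.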

MaximilianAlgehed commented 5 months ago

I was looking at this and it gets even worse... It fails with quickCheckWith stdArgs{maxSuccess=1} but succeeds with quickCheck . withMaxSuccess 1.

nick8325 / quickcheck

quickCheckWithResult with low maxSuccess fails unexpectedly #338