sageserpent-open / americium

Generation of test case data for Scala and Java, in the spirit of QuickCheck. When your test fails, it gives you a minimised failing test case and a way of reproducing the failure immediately.
MIT License

Enhance the streaming factory master method to allow limited ranges of driver values. #28

Closed sageserpent-open closed 2 years ago

sageserpent-open commented 2 years ago

Currently the Trials.stream method in both its Scala and Java incarnations employs a factory function that transforms a long value to a case. The values of the long parameter can range right over the allowed value set for longs, namely from Long.MIN_VALUE up to and including Long.MAX_VALUE.

What we want here is to specify tighter ranges, so that we can cover cases that range over much smaller finite sets than long or double cases.

As an example, writing a factory for byte cases currently requires right-shifting a long, with sign extension, down to a byte. This works, but it makes shrinkage very clumsy, not least because shrinkage is applied globally: it would not take much shrinkage to force the byte values to zero. So if we built a complex trials by combining a byte streaming trials with other streaming trials, it would be hard to shrink the contributions from anything other than the byte trials.
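To make the problem concrete, here is a minimal Java sketch of such a hand-rolled byte factory. The class `ByteFromLong` and the halving step `shrinkOnce` are illustrative assumptions standing in for global shrinkage; they are not Americium's API or its actual shrinkage logic.

```java
// Hypothetical illustration; none of these names come from Americium.
final class ByteFromLong {
    // Keep only the top eight bits of the long driver value.
    // The arithmetic shift (>>) preserves the sign.
    static byte toByte(long driver) {
        return (byte) (driver >> 56);
    }

    // Assumed stand-in for one step of global shrinkage:
    // halve the driver value towards zero.
    static long shrinkOnce(long driver) {
        return driver / 2;
    }

    public static void main(String[] args) {
        long driver = Long.MAX_VALUE;
        for (int step = 0; step <= 8; step++) {
            System.out.println("step " + step + ": byte = " + toByte(driver));
            driver = shrinkOnce(driver);
        }
    }
}
```

Starting from Long.MAX_VALUE, seven halvings are enough to pin the byte at zero, while the long driver value itself still has 56 significant bits left to shrink.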

Tighter ranges also make the logic in a factory function easier to understand, as we expect to get a more natural mapping from the domain long values to the image cases.

We will keep the default definitions of Trials.stream and add overloads / optional parameters that provide a lower-bound, upper-bound and a value denoting the 'minimal case' that does not have to be either zero or even the midpoint of the range (but obviously must be within the range).
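One possible shape for such a bounded factory is sketched below in Java. The interface `BoundedFactory` and all of its method names are hypothetical, chosen only to illustrate the three extra pieces of information (lower bound, upper bound, minimal input); the signatures that end up in the library may well differ.

```java
import java.util.function.LongFunction;

// Hypothetical shape; these names are illustrative, not Americium's API.
interface BoundedFactory<Case> {
    long lowerBound();   // smallest permitted input, inclusive

    long upperBound();   // largest permitted input, inclusive

    long minimalInput(); // the input that yields the 'minimal case'

    Case apply(long input);

    static <T> BoundedFactory<T> of(long lowerBound,
                                    long upperBound,
                                    long minimalInput,
                                    LongFunction<T> mapping) {
        if (lowerBound > upperBound) {
            throw new IllegalArgumentException("Empty input range.");
        }
        // The minimal input need not be zero or the midpoint,
        // but it must lie within the range.
        if (minimalInput < lowerBound || upperBound < minimalInput) {
            throw new IllegalArgumentException("Minimal input is out of range.");
        }
        return new BoundedFactory<T>() {
            @Override public long lowerBound() { return lowerBound; }
            @Override public long upperBound() { return upperBound; }
            @Override public long minimalInput() { return minimalInput; }
            @Override public T apply(long input) {
                if (input < lowerBound || upperBound < input) {
                    throw new IllegalArgumentException("Input is out of range.");
                }
                return mapping.apply(input);
            }
        };
    }
}

final class BoundedFactoryExample {
    // Bytes map directly from inputs in [-128, 127]; zero is the minimal case.
    static final BoundedFactory<Byte> BYTES =
            BoundedFactory.<Byte>of(Byte.MIN_VALUE, Byte.MAX_VALUE, 0L,
                                    input -> (byte) input);

    // Percentages in [0, 100] where the minimal case sits at the upper bound.
    static final BoundedFactory<Integer> PERCENTAGES =
            BoundedFactory.<Integer>of(0L, 100L, 100L, input -> (int) input);

    public static void main(String[] args) {
        System.out.println(BYTES.apply(BYTES.minimalInput()));
        System.out.println(PERCENTAGES.apply(PERCENTAGES.minimalInput()));
    }
}
```

With the range stated up front, the byte factory becomes a plain cast with no bit-shifting, and the percentage factory shows a minimal case that is neither zero nor the midpoint.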

sageserpent-open commented 2 years ago

This opens up the possibility of shrinking over an indexed set of choices: by looking up the choice using a long value constrained to the range [0, number of choices - 1], with zero denoting the minimal case, we get shrinkage of finite choices for free.
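The lookup described above could be sketched in Java roughly as follows. The class `IndexedChoices` and its method names are hypothetical and exist only for illustration; the sketch assumes that shrinkage drives the input towards `minimalInput()`, so the earliest choice in the list acts as the minimal case.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration; not part of Americium's API.
final class IndexedChoices<Case> {
    private final List<Case> choices;

    IndexedChoices(List<Case> choices) {
        if (choices.isEmpty()) {
            throw new IllegalArgumentException("At least one choice is required.");
        }
        // Defensive copy so later changes to the caller's list have no effect.
        this.choices = new ArrayList<>(choices);
    }

    long lowerBound() { return 0L; }

    // One less than the number of choices, inclusive.
    long upperBound() { return choices.size() - 1L; }

    // Zero denotes the minimal case: the first choice in the list.
    long minimalInput() { return 0L; }

    Case apply(long input) {
        // Throws if the input does not fit an int or is not a valid index.
        return choices.get(Math.toIntExact(input));
    }
}
```

Ordering the choices from simplest to most complex would then make shrinkage prefer the simpler ones, without any choice-specific shrinkage logic.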

sageserpent-open commented 2 years ago

Maybe shrinkage will have to be generalised so that multiple underlying streaming trials can be shrunk at independent rates; unless the existing global shrinkage can be made to respect the individual ranges by scaling the shrinkage factor, we will end up prematurely shrinking a trials over a small range when shrinking a complex trials with many contributors.
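One way to read "scaling the shrinkage factor" is to treat the global shrinkage as a single proportion that each contributor applies relative to its own minimal input, rather than as an absolute limit on magnitude. The Java sketch below illustrates that reading only; the class `ScaledShrinkage`, the `scale` parameter and the use of linear interpolation are all assumptions, not a description of how Americium shrinks.

```java
// Hypothetical illustration of range-relative shrinkage; not Americium's algorithm.
final class ScaledShrinkage {
    // scale = 1.0 leaves the input untouched, scale = 0.0 yields the minimal input.
    // Assumes (input - minimalInput) neither overflows a long nor exceeds the
    // precision of a double; a fuller version would need wider arithmetic.
    static long shrink(long input, long minimalInput, double scale) {
        if (scale < 0.0 || 1.0 < scale) {
            throw new IllegalArgumentException("Scale must lie in [0, 1].");
        }
        return minimalInput + Math.round((input - minimalInput) * scale);
    }

    public static void main(String[] args) {
        double scale = 0.5;
        // A contributor over a small range and one over a large range
        // both move halfway towards their own minimal inputs.
        System.out.println(shrink(100L, 0L, scale));
        System.out.println(shrink(1_000_000_000_000L, 0L, scale));
    }
}
```

Under this scheme a byte-sized contributor and a long-sized contributor lose the same fraction of their distance from the minimal input at each step, so neither collapses to its minimal case well before the other.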

This was the primary driver for this story in the first place, so we need to be careful here.

sageserpent-open commented 2 years ago

Not sure if this should go in here, but this work is an obvious springboard for implementing distributions in the factory function. Whether this should be left for users to hand-roll in their own factory functions, or should somehow be packaged up in the API, is open to debate.

I'm inclined to leave it as an option for power users to write their own streaming factories that implement a custom distribution.

sageserpent-open commented 2 years ago

A quick win would be to overload the existing canned factory methods to accept a parameter that specifies the range of the cases yielded by the trials instance in terms of the cases - this is much more convenient. Hedgehog has a Range abstraction for this purpose that is a worthy source of inspiration.

sageserpent-open commented 2 years ago

The streaming factory method in TrialsApi has been cut over to use a CaseFactory - that allows a restricted domain of input values to the factory to be specified, along with the notion of a maximally shrunk input that generates the minimal case. As shrinkage takes place, the inputs tend towards this maximally shrunk input.

The old behaviour is still catered for with TrialsApi.streamLegacy - this takes a function that ranges over the entire domain of Long values and adopts the convention that zero denotes the minimal case.

It is possible to use degenerate factories where the maximally shrunk input lies at the lower or upper bound of the input domain; domains with just a single input value are also supported.

sageserpent-open commented 2 years ago

Published in release 0.1.25, Git commit 5ea1b3088adaaa0270a944ee1694950975b2b911.