sageserpent-open / americium

Generation of test case data for Scala and Java, in the spirit of QuickCheck. When your test fails, it gives you a minimised failing test case and a way of reproducing the failure immediately.
MIT License
15 stars 1 forks source link

Wrinkles in shrinkage of double values from simplest overload of `TrialsApi.doubles()`. #55

Closed sageserpent-open closed 1 year ago

sageserpent-open commented 1 year ago

Running this code in JShell:

import com.sageserpent.americium.java.TrialsScaffolding;

import static com.sageserpent.americium.java.Trials.api;
import static org.hamcrest.CoreMatchers.is;
import static org.hamcrest.MatcherAssert.assertThat;

try {
    api()
            .doubles()
            .withLimit(15)
            .supplyTo(input -> {
                try {
                    final double root = Math.sqrt(input);
                    assertThat(Double.isNaN(root), is(false));
                } catch (Throwable throwable) {
                    System.out.println(input);
                    throw throwable;
                }
            });
} catch (TrialsScaffolding.TrialException exception) {
    System.out.println(exception);
}

Yields a puzzling and slightly unsatisfactory shrinkage:

-6.22072333935824E113
-1.2492963789941435E38
-2.1313301431623778E35
-1.181626588304059E8
-5.741566311021205E-6
-1.7497869133732925E-6
-0.11276248501591546
-0.8728556367425091
-4.3952499026678193E-4
-0.8697991774298572
-0.42150067789853984
-0.1957862148473749
-0.3999181556115391
-0.41296552373673545
-0.8796029723097363
-0.32859780009970496
-0.34029833816375676
-0.677573698876847
-0.32535057820804725
-0.5490392851742205
-0.6509368524083718
-0.16381790235004345
-0.7482232581153037
-0.929138062074717
-0.34903715862394735
-0.1676747404359713
-0.7687988850965054
-0.36920904220149864
-0.6543721802136668
-0.5922205655116461
-0.08635120700615839
-0.9207081859483317
-0.4638362390624543
-0.3239997781039079
-0.9602995561899036
-0.7089307608212011
-0.6857292159305641
-0.7265782954397746
-0.7314156962376819
-0.7307886238322471
-0.7306094602878371
-0.7308781907032909
-0.730967787376657
Trial exception with underlying cause:
java.lang.AssertionError: 
Expected: is <false>
     but: was <true>
Case:
-0.730967787376657

Note that there are lots of values much closer to zero, including -1.7497869133732925E-6 which would be much better than the final shrunk value of -0.730967787376657, but these fell out of the running.

It's not a disaster, as the final shrunk value compares very favourably with the initial failing case of -6.22072333935824E113, but it's not great. If one were to be working on a mathematical utility library, this wouldn't be acceptable at all.

sageserpent-open commented 1 year ago

Changing the code slightly to use an explicit range:

.doubles(-1e10, 1e10) instead of .doubles() yields much better shrinkage:

-9.105215576676373E9
-8.6881081547096E9
-2.1320432384270747E9
-1.0794530527479506E9
-3.933015970732623E8
-9.183900331240854E7
-4.5931761134104E7
-3.508276029329806E7
-5021004.9676793255
-1696895.33830504
-1004189.6789330263
-571909.6722478328
-404243.60335425945
-93369.2146621451
-83588.7870024568
-13330.019412140715
-8695.34222429553
-6065.403802764423
-1166.2699171308914
-1127.3063430503917
-303.38285352351534
-188.64786072462712
-72.06401439886594
-13.348024122637748
-7.339791923115929
-4.9339153455115605
-1.2664547004419962
-0.8866716952674047
-0.35113861037577854
-0.15060597083685345
-0.05695786073692255
-0.002145267470610168
-0.002023332673281586
-9.146492157413588E-4
-6.367410938790119E-5
-3.2855662635000726E-5
-1.6126423113549393E-5
-1.0490740220969741E-5
-6.201636426617085E-6
-5.449200118912145E-6
-1.8431436932253575E-8
-7.589415207398531E-9
-2.168404344971009E-9
-1.0842021724855044E-9
Trial exception with underlying cause:
java.lang.AssertionError: 
Expected: is <false>
     but: was <true>
Case:
-1.0842021724855044E-9
sageserpent-open commented 1 year ago

This is caused by the rather odd implementation for TrialsApiImplementation.doubles which builds a double test case out of a random value picked from [0, 1] that is scaled by the shrinkable input and then given a choice of sign.

The overload that takes explicit bounds uses a more conventional implementation - this would be obvious preference, but unfortunately it doesn't play well with bounds going from +/- Double.MAX_VALUE.

The best resolution would be to improve or replace the implementation with explicit bounds and then delegate from the simplest overload of TrialsApiImplementation.doubles to that one.

sageserpent-open commented 1 year ago

This may require a redesign of case factories, one possibility is to dispense with the notion of a long input domain with accompanying bounds and shrinkage target in the domain; instead everything would be expressed in terms of the actual test case type produced by the factory. This would mean that the interpreters for Generation would have to be cutover, and that somehow the shrinkage logic would either have to be generalised to work in terms of whatever numeric type the case factory uses, or would have to be moved into the case factory.

Having the shrinkage logic left where it is, but generalised would put a typeclass constraint on generation.Factory, and perhaps that is a good thing.

Putting the shrinkage logic in CaseFactory- presumably using a factored implementation to capture the existing code - would allow the typeclass constraint to be hidden, and would also permit other kinds of shrinkage mechanisms to be employed.

Need to do some spike work on this, just cutting over the existing case factory / shrinkage mechanism...

sageserpent-open commented 1 year ago

Got a working solution in commit SHA: deb41ad5e9b3070a69e386688f8237198609dbe1 .

Now using BigInt as the domain type for CaseFactory (which has been split off as a Scala API version, the original Java API version uses BigInteger, hence the need for the split to refer to each language's view of big integers).

Will do some benchmarking and review ...

sageserpent-open commented 1 year ago

Could be worth revisiting some of the dropped requirements in #52, now we have the ability to deal with very wide ranges for numeric test cases. As long as explicit bounds are provided, it should be easy to cover BigInt(eger) and BigDecimal...

sageserpent-open commented 1 year ago

Now seeing a satisfactory shrinkage of:

-1.6368383533632535E308
-1.561855238466275E308
-5.724203321190871E301
-4.328389007965613E295
-2.3553297620464112E289
-8.214037006639844E282
-6.135445097847607E276
-6.998906892186612E270
-1.4959965581367586E264
-2.7235432873548046E258
-1.9337226810726094E252
-2.8560687737264523E245
-2.5242527677406183E239
-4.09741208661987E233
-3.0264548008995325E227
-8.121162308584376E220
-8.57309011256221E214
-2.95734111685967E208
-4.508883161621542E202
-2.4318361538016546E196
-5.791893305345205E189
-5.642608694563882E183
-5.878363756854301E177
-1.6881043720027839E171
-2.436944593273706E165
-9.794859700653345E158
-9.096253515426113E152
-5.189584714439396E146
-1.4356038707458457E140
-1.1789757418712668E134
-1.183630910548621E128
-4.5375115978691015E121
-4.744543319815227E115
-2.8061678260420168E109
-1.7975489525188147E103
-1.0153030580569114E97
-5.711195313886208E89
-2.652522977499653E84
-6.280882465551535E77
-3.0428916958289433E71
-7.16942015591768E65
-5.828349926832568E59
-1.7374118801065623E53
-1.585871837303352E47
-2.939122213851453E40
-8.155392010528042E34
-1.8793748515705658E27
-2.4232777974076443E22
-1.6615756857031942E16
-1.53433511917E9
-7542.59
-7231.54
-1891.75
-437.63
-15.35
-0.72
-0.1
-0.08
-0.01
Trial exception with underlying cause:
java.lang.AssertionError: 
Expected: is <false>
     but: was <true>
Case:
-0.01

The new approach uses the same mechanism for all three overloads of TrialsApi.doubles, but tries to size the input domain to have at least the integer interval [Long.MaxValue, Long.MinValue] but will increase it if necessary ensure that at least a precision of 1e-2 is covered by each subdivision of the input domain. The target value is still represented accurately as per the implementation prior to this ticket.

This approach yields a decent enough shrinkage if we start with doubles from right over the entire range, but allows much finer precision if explicit bounds that are closer together are used; so the example given above using .doubles(-1e10, 1e10) still yields the same good shrinkage down to -1.0842021724855044E-9.

sageserpent-open commented 1 year ago

This went out in release 1.10.3, Git commit SHA: 977755a004c5f21a293482b5ec2eb961b0e26cde .