sageserpent-open / americium

Generation of test case data for Scala and Java, in the spirit of QuickCheck. When your test fails, it gives you a minimised failing test case and a way of reproducing the failure immediately.
MIT License

Improve shrinkage for the `lengthList` challenge. #43

Closed: sageserpent-open closed this 2 years ago

sageserpent-open commented 2 years ago

The repository https://github.com/jlink/shrinking-challenge collects code challenges for various property-based testing frameworks; the idea is to find the optimal shrinkage for each challenge.

Americium has a few submissions to that repository, and there is a fork where it takes on the lengthList challenge:

  1. Description of challenge: https://github.com/sageserpent-open/shrinking-challenge/blob/main/challenges/lengthlist.md .
  2. Test code: https://github.com/sageserpent-open/shrinking-challenge/blob/americium/pbt-libraries/americium/src/test/java/challenges/lengthlist/LengthListTest.java .
  3. Report: https://github.com/sageserpent-open/shrinking-challenge/blob/americium/pbt-libraries/americium/reports/lengthlist.md .

As the report in the third link shows, the shrinkage is suboptimal. The job of this issue is to improve shrinkage so that the obvious best solution, a list containing just the element 900, is found without cheating in the test code.
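For reference, the failing condition described in this issue can be sketched as a plain Java predicate. This is not the code in LengthListTest.java and it does not use Americium's API; the class and method names are hypothetical, and the threshold of 900 is taken from the text above.

```java
import java.util.Collections;
import java.util.List;

final class LengthListProperty {
    // Threshold taken from the issue text: an element >= 900 provokes the failure.
    static final int THRESHOLD = 900;

    // Returns true when the property holds, false when the test would fail.
    static boolean holds(List<Integer> list) {
        return list.isEmpty() || Collections.max(list) < THRESHOLD;
    }

    public static void main(String[] args) {
        // The minimal failing case that ideal shrinkage should arrive at.
        System.out.println("[900] holds: " + holds(List.of(900)));
        // One less than the threshold no longer provokes the failure.
        System.out.println("[899] holds: " + holds(List.of(899)));
        // A failing case that is not minimal: too long, and 950 is larger than needed.
        System.out.println("[0, 950, 3] holds: " + holds(List.of(0, 950, 3)));
    }
}
```

The point of the challenge is that any failing list should be reducible to `[900]`: one element, sitting exactly on the threshold.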

sageserpent-open commented 2 years ago

At first sight, this appears to be down to the interaction between shrinking the list size by value and the magnitude of the elements within the list, at least one of which has to be >= 900 to provoke the test failure. An obvious hypothesis: shrinkage is applied globally across all factories set up by the Trials DSL, so once a failing case has been detected with, say, a list size of 2 and an element of value 900, any further attempt to shrink by value that would have produced a one-element list also pushes the value of that element below 900. Those cases therefore fail to provoke a test failure:

[Screenshot 2022-05-12 at 10 19 58] [Screenshot 2022-05-12 at 10 24 27]
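The hypothesis can be illustrated with a toy model. This is only a sketch under stated assumptions, not Americium's actual shrinkage mechanism: shrinkage by value is modelled as a scale factor that pulls each raw draw towards its lower bound, and all names (`scaled`, `buildCase`, `fails`) are hypothetical. With a single scale shared by the length and the elements, reaching a one-element list also drags the element below 900; with separate scales, the list can shrink while the element stays put.

```java
import java.util.ArrayList;
import java.util.List;

final class CoupledShrinkageSketch {
    // Toy stand-in for shrinkage by value: pull a raw draw towards its lower bound.
    static int scaled(int raw, int lowerBound, double scale) {
        return lowerBound + (int) Math.round(scale * (raw - lowerBound));
    }

    // Builds a case from raw draws; the length is bounded below by 1, the elements by 0.
    static List<Integer> buildCase(int rawLength, int[] rawElements,
                                   double lengthScale, double elementScale) {
        int length = scaled(rawLength, 1, lengthScale);
        List<Integer> result = new ArrayList<>();
        for (int i = 0; i < length; i++) {
            result.add(scaled(rawElements[i], 0, elementScale));
        }
        return result;
    }

    // The test fails when any element is >= 900, as stated in the issue.
    static boolean fails(List<Integer> list) {
        return list.stream().anyMatch(element -> element >= 900);
    }

    public static void main(String[] args) {
        int rawLength = 2;
        int[] rawElements = {900, 900};

        // Coupled: one global scale shrinks the length and the elements together.
        List<Integer> coupled = buildCase(rawLength, rawElements, 0.4, 0.4);
        // Decoupled: the length shrinks while the elements keep their magnitude.
        List<Integer> decoupled = buildCase(rawLength, rawElements, 0.4, 1.0);

        System.out.println("coupled " + coupled + " fails: " + fails(coupled));
        System.out.println("decoupled " + decoupled + " fails: " + fails(decoupled));
    }
}
```

In this model the coupled candidate is a singleton list whose element is well under 900, so it passes the test and is rejected as a shrinkage step, which matches the behaviour described above.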

sageserpent-open commented 2 years ago

Added a test to TrialsSpec to reproduce this. Interestingly, this test also deals with a mirror-image problem in which the signs and the sense of the comparison are reversed. The mirror-image test does result in the correct shrinkage: a singleton list with an element of -900.

sageserpent-open commented 2 years ago

As of commit 6a574a47ea60fa41ca1e308c3e6372b7dba7bfa2, TrialsSpec now passes.

sageserpent-open commented 2 years ago

Prior to this ticket...

Commit 0e227b420f71130b72c7bb1ec4ce641303b679e8 - takes 2 minutes 46 seconds / 2 minutes 40 seconds on the development machine to run the "be shrunk to a simple case" test in TrialsSpec using Scala 2.13.8. The whole of TrialsSpec takes 3 minutes 17 seconds / 3 minutes 9 seconds.

Using this ticket's code...

Commit 6a574a47ea60fa41ca1e308c3e6372b7dba7bfa2 - takes 2 minutes 41 seconds / 2 minutes 40 seconds on the development machine to run the "be shrunk to a simple case" test in TrialsSpec using Scala 2.13.8. The whole of TrialsSpec takes 3 minutes 24 seconds / 3 minutes 23 seconds. That includes the additional test for this ticket, which takes 12 seconds / 12 seconds.

So performance is essentially unchanged.

sageserpent-open commented 2 years ago

This has gone out in version 1.4.3, Git SHA da8c524d31fe6d66 .

sageserpent-open commented 2 years ago

Evidence from the original lengthList challenge, running against a local 1.4.4-SNAPSHOT of americium:

[Screenshot 2022-06-12 at 15 37 18] [Screenshot 2022-06-12 at 15 40 14]

Now we have a list with a single element of 900.

sageserpent-open commented 2 years ago

Leaving this ticket open as Sonatype reports some potential vulnerabilities via the Guava and Jawn dependencies. These are benign in this context, but it would be good to update the dependencies to get a clean bill of health once more.

sageserpent-open commented 2 years ago

Released 1.4.4 with updated Guava and Jawn (via Circe) dependencies to try to placate Sonatype; there is still a vulnerability reported for Guava even at guava:31.1-jre. It's not even in a part of the library that is used by Americium, grrr.

sageserpent-open commented 2 years ago

Getting a clean bill of health is clearly going to require some effort (or patience). Closing this ticket in the meantime...