nick8325 / quickcheck

Automatic testing of Haskell programs.
Other
713 stars 119 forks source link

Need better containers instances #354

Closed treeowl closed 5 months ago

treeowl commented 1 year ago

There was a fairly prominent incident a few months ago where a package that uses containers dug into the internals ... and messed up. My conclusion is that we need QuickCheck to be able to generate multiple data structure "shapes" that represent the same abstract object. There are two basic options:

  1. Drag containers internals into QuickCheck. This is the easiest way to start, and in practice I suspect the maintenance burden won't be too bad, but it's not very clean.
  2. Have containers export functions QuickCheck can pass things to. This seems cleaner, but it requires more work up front and I'd want buy-in from QuickCheck before writing and releasing support in containers.

I'm leaning towards option 2. The idea would be for containers to export a modified and stripped down version of MonadGen (as in QuickCheck-GenT). QuickCheck would instantiate this class for Gen. The purpose of using a class would be to make things easier if containers discovers it needs more methods later.


The above mentioned incident involved Map and/or Set. I believe those are the structures the most people have dug under the hoods of, so they should be the priority. Seq is the other redundant representation in the package, so it would be good to do that too, but I don't know if anyone is doing anything weird and mission critical with it.

treeowl commented 1 year ago

There's now a blog post: https://www.314pool.com/post/cardano-post-mortem-1

phadej commented 1 year ago

I don't know what to think.

I hope that we agree that for external usage of containers API the current instances are sufficient. We shouldn't be overly paranoiac and should be able rely on containers generally work.

Folks tinkering with containers (or for that matter any opaque data type in other package) may need different generators, and that complexity of containers internals may stay in containers (satellite package). In other words, instead of containers exposing things to QuickCheck to use, containers (or some separate package, as the need is small - that one could depend on QuickCheck directly for convenience) can expose things for test-suites of tinkerers with internals to use directly.

treeowl commented 1 year ago

@phadej I agree the situation is a bit tricky. Your approach is certainly reasonable, and may be the best one. The main downside is that it relies on users who dig into internals to use special generators. That's slightly inconvenient, and I imagine it would be easy to forget to do, but it's probably not a terribly high hurdle. The other theoretical challenge is that the auxiliary package (containers-quickcheck-extras?) would need to be updated if containers ever changed the balance limits. I believe that's only happened once before (as part of Milan Straka's research), and I have no reason to suspect it'll happen again, so it's probably not a major concern. On the flip side, we have to consider the cost of having containers itself offer the necessary pieces to QuickCheck, which ideally would be built to support hedgehog and if possible smallcheck as well. My conjecture is that this cost is quite low, which makes the right choice somewhat non-obvious. However, there would also be a run-time cost to all users of the Arbitrary instance, which may be the nail in the coffin of that idea.


Tentative conclusion: let's go with @phadej's idea, but let's document the availability of the extras package(s) in the relevant containers internal modules. Should we also put that in the instance documentation for Arbitrary (Set a) and Arbitrary (Map k a)?

phadej commented 1 year ago

would need to be updated if

You could use it in containers-tests yourself, so there is an incentive to not forget...