Memoization - Githubissues

edsko commented 1 year ago

We have the withoutShrinking combinator that makes generators that we want to run only once against a SampleTree. Unfortunately, we currently have to re-run those generators every single time they run, which leads to pretty bad performance; this is most notable in tests that depend on fromShrinkTree; see the TestSuite.Sanity.Functions.StringToBool test one a particularly bad example.

It feels like it should be relatively simple to memoize these results. For example, we could imagine that shrinking could return new generators, such that if we have

data Gen a where
  ..
  Once :: (SampleTree -> (a, SampleTree)) -> Gen a

then

shrink (Once f) st = first Pure $ f st

replacing the generator by a constant one that just returns the value it produces. Unfortunately, this does not work: we cannot return new generators, because generators are monadic (how would we update a continuation a -> Gen b?).

The only state we have is in the SampleTree. Now, we could of course do something like this:

data Sample = .. | NotShrunk Word64 (Maybe Dynamic)

where we cache the constructed value right in the same tree. This too would make sense, but it's unclear how this works with context re-interpretation . For example, what if we have a generator such as

example :: Gen ..
example = do
    b <- bool
    if b then once ..
         else once ..

We now run two different generators against the same sample tree; surely we don't expect to get the result from one of those generators in the other one? That would be very confusing (consider the case where one generator is prim, and the other (* 2) <$> prim or something like that; it would be very strange if we got an odd number out of that generator).

edsko commented 1 year ago

Perhaps this is not sufficient, actually. When we are memoizing a shrink tree, we would still be retracing our steps through that shrink tree on each iteration; we would merely avoid building the shrink tree, which might not actually save us all that much time. Ideally we would memoize the most recently generated value, so that we only need to take a single step.

edsko commented 1 year ago

This is less of a priority now that we can generate functions in an idiomatic way, not relying on shrink trees (#29).

edsko commented 1 year ago

Perhaps https://hackage.haskell.org/package/chimera can help here.

well-typed / falsify

Memoization #25