spcfox commented 4 days ago

Steps to Reproduce

After pattern-matching on lazy values, and then using the binded name, they are computed again.

f : Lazy Bool -> Bool
f x@True  = x
f x@False = x

g : Lazy Bool -> Bool
g (Delay x@True)  = x
g (Delay x@False) = x

h : Lazy Bool -> Bool
h x@(Delay True)  = x
h x@(Delay False) = x

main : IO ()
main = do
  ignore $ pure $ f $ trace "f" True
  ignore $ pure $ g $ trace "g" True
  ignore $ pure $ h $ trace "h" True

{-
  f
  f
  g
  h
  h
-}

g computes the lazy value only once, and h computes the lazy value twice. But in practice we usually write as in f, which behaves like h, which leads to performance issue.

A similar problem occurs when using case:

f : Lazy Bool -> Bool
f x = case x of
  True  => x
  False => x

g : Lazy Bool -> Bool
g (Delay x) = case x of
  True  => x
  False => x

main : IO ()
main = do
  ignore $ pure $ f $ trace "f" True
  ignore $ pure $ g $ trace "g" True

{-
  f
  f
  g
-}

Problem with `filter`

This becomes a major problem in polymorphic functions when we have no way to explicitly force the evaluation.

An example of such a function is filter. It behaves similarly to the functions above: it evaluates a value, and in a result uses the name binded with the original argument. Therefore, it is inefficient to filter lazy values. Instead of filter (p . force) it is more efficient to do map delay . filter p . map force. Here are measurements demonstrating this problem:

iterate : Nat -> (a -> a) -> a -> a
iterate 0     f = id
iterate (S n) f = iterate n f . f

true : a -> Bool
true _ = True

runWithTimer : String -> IO a -> IO ()
runWithTimer title f = do
  start <- clockTime UTC
  ignore f
  end <- clockTime UTC
  putStrLn $ "\{title}: \{showTime 0 2 $ timeDifference end start}"

filterLazy : Functor f => ((a -> Bool) -> f a -> f a) -> (a -> Bool) -> f (Lazy a) -> f (Lazy a)
filterLazy filt p = map delay . filt p . map force

main : IO ()
main = do
  let iterations = 100
      complexity = 10000000
      value : Lazy _ = [0..complexity]
  runWithTimer "Fast Maybe"      $ pure $ iterate iterations (filterLazy filter true)   $ pure {f=Maybe}    value
  runWithTimer "Slow Maybe"      $ pure $ iterate iterations (filter $ true . force)    $ pure {f=Maybe}    value
  runWithTimer "Fast List"       $ pure $ iterate iterations (filterLazy filter true)   $ pure {f=List}     value
  runWithTimer "Slow List"       $ pure $ iterate iterations (filter $ true . force)    $ pure {f=List}     value
  runWithTimer "Fast SnocList"   $ pure $ iterate iterations (filterLazy filter true)   $ pure {f=SnocList} value
  runWithTimer "Slow SnocList"   $ pure $ iterate iterations (filter $ true . force)    $ pure {f=SnocList} value
  runWithTimer "Fast SnocListTR" $ pure $ iterate iterations (filterLazy filterTR true) $ pure {f=SnocList} value
  runWithTimer "Slow SnocListTR" $ pure $ iterate iterations (filterTR $ true . force)  $ pure {f=SnocList} value

Lines with filter compute the lazy value iterations times, while lines with filterLazy only 1 time:

Fast Maybe: 0.83s
Slow Maybe: 80.14s
Fast List: 0.94s
Slow List: 79.54s
Fast SnocList: 0.97s
Slow SnocList: 79.86s
Fast SnocListTR: 0.96s
Slow SnocListTR: 86.21s

I think it's currently impossible to write a polymorphic filter that handles lazy values efficiently.

Expected Behavior

Lazy values should not be evaluates again where the result is already known
f should be equivalent to g (maybe h too?).
filter (p . force) should behave like map delay . filter p . map force

Observed Behavior

Multiple evaluations of lazy values
g is more effective than f
map delay . filter p . map force is more effective than filter (p . force)

dunhamsteve commented 3 days ago

Have you tried with the --directive lazy=weakMemo compilation directive? (documented here) PR #2791 added weak memoization of lazy values, but it looks like it's guarded by that compilation directive because it caused performance regression in some cases.

spcfox commented 3 days ago

Yes, memoization is a good way to combat this problem (although given that memoization is weak, I'm not sure the problem wouldn't show up in more complex examples). But I think it's an undesirable behavior, and it would be nice to fix it. I think the filter example is pretty illustrative.

idris-lang / Idris2

Multiple evaluations of lazy values #3422

Steps to Reproduce

Problem with `filter`

Expected Behavior

Observed Behavior

idris-lang / Idris2

Multiple evaluations of lazy values #3422

Steps to Reproduce

Problem with filter

Expected Behavior

Observed Behavior

Problem with `filter`