Gabriella439 / foldl

Composable, streaming, and efficient left folds
BSD 3-Clause "New" or "Revised" License
158 stars 51 forks source link

On constant space #208

Open meooow25 opened 1 month ago

meooow25 commented 1 month ago

fold and foldM are defined as

https://github.com/Gabriella439/foldl/blob/473a5f002db29d78d3100930ce043158c5569a50/src/Control/Foldl.hs#L528-L544

But the description also claims that

This library provides strict left folds that stream in constant memory...

Seems to be a little misleading to me. Whether the fold is in constant memory depends greatly on whether GHC is able to optimize foldr for f into a constant-space fold, and it is not a guarantee arising out of this library. If GHC fails here, for whatever reason, it is likely to require linear memory.

For fold, I think this could be avoided by using foldl'. It is a safer bet that the implementor of Foldable f has written a constant-space foldl'. Is there a reason why foldr is used instead of foldl'?

For foldM, I don't know what can be done.


And to demonstrate the problem, we can benchmark on Data.Set. Tree-like structures often define foldr in a way that that GHC cannot optimize.

  sum 100000
    Control.Foldl.fold:    OK
      1.72 ms ±  54 μs, 6.9 MB allocated, 186 KB copied,  22 MB peak memory
    fold but using foldl': OK
      325  μs ±  13 μs,   0 B  allocated,   0 B  copied,  22 MB peak memory
Benchmark source ```hs -- Using GHC 9.6.3, -O import qualified Control.Foldl as Fold import Data.Foldable (foldl') import Data.Set (Set) import qualified Data.Set as Set import Test.Tasty.Bench main :: IO () main = defaultMain [ env (pure xs_) $ \xs -> bgroup "sum 100000" [ bench "Control.Foldl.fold" $ nf (Fold.fold Fold.sum) xs , bench "fold but using foldl'" $ nf (fold' Fold.sum) xs ] ] where xs_ = Set.fromList [1..100000] :: Set Int fold' :: Foldable f => Fold.Fold a b -> f a -> b fold' (Fold.Fold step start done) xs = done (foldl' step start xs) {-# INLINE fold' #-} ```
meooow25 commented 1 month ago

Thinking about it more, I realized that there is a more fundamental limitation than GHC optimations: the structure itself. A Set cannot be folded over without requiring space proportional to the depth of the tree, which is $O(\log n)$. The above benchmarks only show that foldl' does not allocate on the heap, but it still uses $O(\log n)$ stack space.

To reiterate, I want to draw attention to the facts that:

  1. The foldl library cannot make accurate claims about the space required to apply a Fold. It depends on the structure to be folded over, the implementation of foldr/foldl' for that structure, and the GHC optimizations at play. Even at their most efficient, the same fold will have different requirements: Fold.fold Fold.sum (xs :: [Int]) is $O(1)$ space and Fold.fold Fold.sum (xs :: Set Int) is $O(\log n)$ space.
  2. For some structures (like Set as demonstrated), the implementation of foldl' is practically more efficient than obtaining a left fold via foldr. At the same time, I cannot think of any situation where the opposite would be true. So I believe foldl' would be a better delegate for Fold.fold rather than foldr.
Gabriella439 commented 3 days ago

Yeah, it should probably clarify that it's only guaranteed to be constant space for lists.

Also, if I'm being really pedantic, it's only guaranteed to be constant space if all folds involved (including user-defined Folds) use an accumulator that is internally strict. This property holds for the existing Folds provided by the foldl package but might not necessarily hold for user-defined Folds.