Open effectfully opened 6 years ago
@effectfully This looks like and interesting idea. I would like to see benchmarks and profiling first just to be :100: sure that this implementation is faster or at least not slower (and doesn't contain space leaks). Type of concatMapM
always was confusing to me as well, it's just a result of generalization. But I'm okay with making things simpler. Personally I just want to be able to call multiple actions on some collection, where which action returns some monoid and monoid concatenation is performed automatically. Having examples like this to work is also a good feature:
concatMapM readFile files >>= putTextLn
I would like to see benchmarks and profiling first just to be 100 sure that this implementation is faster or at least not slower
That's a lot of things to do just to prove an obvious fact that fmap fold . traverse f
is way too inefficient when run in a strict Monoid
(compared to foldMapM
) and is too restrictive (compared to foldMapA
).
(and doesn't contain space leaks)
foldMapA
eats stack of course. I checked foldMapM
some time ago and it worked alright, but yes, a proper test is needed here. If I was to write such a test, I'd use Control.Monad.Writer.CPS
and weigh
, but right now I have many other things to do.
I had a look at foldMapM
to see how evaluation would look like.
If we expand the definition a bit, and add a type sig to step
for clarity:
foldMapM
:: forall c a m b. (Container c, Element c ~ a, Monad m, Monoid b)
=> (a -> m b) -> c -> m b
foldMapM f xs = foldr step return xs mempty
where
step :: a -> (b -> m b) -> (b -> m b)
step a b_mb = \b -> do
b2 <- f a
let !res = b <> b2
b_mb res
Let's specialize to lists, where foldr
is defined as:
foldr :: (a -> b -> b) -> b -> [a] -> b
foldr f acc [] = acc
foldr f acc (x:xs) = f x (foldr f acc xs)
And now try to evaluate this expression:
a :: IO (Sum Integer)
a = foldMapM (\x -> pure (Sum x)) [1, 2]
foldMapM (\x -> pure (Sum x)) [1, 2]
-- substitute `foldMapM`
foldr step return [1,2] mempty
where
step :: Int -> (Sum Int -> IO (Sum Int)) -> (Sum Int -> IO (Sum Int))
step a b_mb = \b -> do
b2 <- pure $ Sum a
let !res = b <> b2
b_mb res
-- substitute `foldr` with its 2nd equation
step 1 (foldr step return [2]) mempty
-- substitute `step`
do
b2 <- pure $ Sum 1
let !res = mempty <> b2
foldr step return [2] res
-- evaluate the first 2 statements
foldr step return [2] (Sum 1)
-- substitute `foldr` with its 2nd equation
step 2 (foldr step return []) (Sum 1)
-- substitute `step`
do
b2 <- pure $ Sum 2
let !res = (Sum 1) <> b2
foldr step return [] res
-- evaluate the first 2 statements
foldr step return [] (Sum 3)
-- substitute `foldr` with its 1st equation
return (Sum 3)
In conclusion: constant stack, doesn't create thunks, so no space leaks. It seems to have similar properties to foldl'
.
Let's see foldMapA behavior on strict and lasy monoids.
The first example will be:
foo :: (String, (Sum Int))
foo = foldMapA (\x -> pure (Sum x)) [1, 2, 3]
-- expand our definition, replacing foldMapA with it's body
foo = foldr (\a fm -> mappend <$> ((\x -> pure (Sum x)) a) <*> fm) (pure mempty) [1, 2, 3]
-- reduce (String,) applicative and Monoid functions
foo = foldr (\a fm -> (\(s1, x) (s2, y) -> (s1 <> s2, x + y)) ((\x -> ([], x)) a) fm) ([], 0) [1, 2, 3]
-- simplify lambdas
foo = foldr (\a (s2, y) -> let (s1, x) = ((\x -> ([], x)) a) in (s1 <> s2, x + y)) ([], 0) [1, 2, 3]
foo = foldr (\a (s2, y) -> ([] <> s2, a + y)) ([], 0) [1, 2, 3]
-- move lamba into `step` function
foo = foldr step ([], 0) [1, 2]
where
step a (s2, y) = ([] <> s2, a + y)
Now we can clearly see, that our step function strict on it's second argument and computation will be:
> step 1 (foldr step ([], 0) [2, 3])
> step 1 (step 2 (foldr step ([], 0) [3]))
> step 1 (step 2 (step 3 (foldr step ([], 0) [])))
> step 1 (step 2 (step 3 ([], 0)))
> step 1 (step 2 ([], 3))
> step 1 ([], 5)
> ([], 6)
What about lazy applicative, such as (-> e)?
foo :: Int -> (Sum Int)
foo = foldMapA (\x -> pure (Sum x)) [1, 2, 3]
-- definition from the first attempt
foo = foldr (\a fm -> mappend <$> ((\x -> pure (Sum x)) a) <*> fm) (pure mempty) [1, 2, 3]
-- inline Applicative definitions
foo = foldr (\a fm -> \e -> ((\x -> const x) a e) + fm e) (const 0) [1, 2, 3]
-- simplify
foo = foldr (\a fm -> \e -> ((const a) e) + fm e) (const 0) [1, 2, 3]
-- introduce `step`
foo = foldr step (const 0) [1, 2, 3]
where
step a fm = \e -> ((const a) e) + fm e
Now step
is'n strict on Applicative, but sum yet not tail recursive
step 1 (foldr step (const 0) [2, 3])
\e -> const 1 e + (foldr step (const 0) [2, 3]) e
-- now add argument
(\e -> const 1 e + (foldr step (const 0) [2, 3]) e) 4
const 1 4 + (foldr step (const 0) [2, 3]) 4
1 + (step 2 (foldr step (const 0) [3])) 4
1 + (\e -> const 2 e + (foldr step (const 0) [3]) e) 4
1 + (const 2 4 + (foldr step (const 0) [3]) 4)
1 + (2 + (foldr step (const 0) [3]) 4)
1 + (2 + (step 3 (foldr step (const 0) [])) 4)
1 + (2 + (\e -> const 3 e + (foldr step (const 0) [] e)) 4)
1 + (2 + (const 3 4 + (foldr step (const 0) [] 4)))
1 + (2 + (3 + (foldr step (const 0) [] 4)))
1 + (2 + (3 + (const 0 4)))
1 + (2 + (3 + 0))
1 + (2 + 3)
1 + 5
6
There is no reason why strict Applicative can start work with lazy monoid, so let's move to both lazy Applicative and Monoid
foo :: Int -> [Int]
foo = foldMapA (\x -> pure (Sum x)) [1, 2, 3]
foo = foldr (\a fm -> \e -> ((\x -> const [x]) a e) <> fm e) (const []) [1, 2, 3]
foo = foldr (\a fm -> \e -> ((const [a]) e) <> fm e) (const []) [1, 2, 3]
foo = foldr step (const []) [1, 2, 3]
where
step a fm = \e -> ((const [a]) e) <> fm e
Seems good to me:
step 1 (foldr step (const []) [2, 3])
\e -> const [1] e <> (foldr step (const []) [2, 3]) e
-- now add argument
(\e -> const [1] e <> (foldr step (const []) [2, 3]) e) 4
const [1] 4 <> (foldr step (const []) [2, 3]) 4
[1] <> (foldr step (const []) [2, 3]) 4
-- now guarded recursion works, next steps will not compute untill case matching/seq function
1 : (foldr step (const []) [2, 3]) 4
-- but let's immagine, that we `force` list
1 : step 2 (foldr step (const []) [3]) 4
1 : (\e -> const [2] e <> (foldr step (const []) [3]) e) 4
1 : const [2] 4 <> (foldr step (const []) [3]) 4
1 : [2] <> (foldr step (const []) [3]) 4
1 : 2 : (foldr step (const []) [3]) 4
1 : 2 : step 3 (foldr step (const []) []) 4
1 : 2 : (\e -> const [3] e <> (foldr step (const []) []) e) 4
1 : 2 : const [3] 4 <> (foldr step (const []) []) 4
1 : 2 : [3] <> (foldr step (const []) []) 4
1 : 2 : 3 : (foldr step (const []) []) 4
1 : 2 : 3 : const [] 4
1 : 2 : 3 : []
And some tests in GHCi:
-- ([Int], Sum Int)
ghci> foldr (\a fs -> mappend <$> (pure $ Sum a) <*> fs) ([], mempty) (replicate 100000000 1)
*** Exception: stack overflow
-- Int -> Sum Int
ghci> foldr (\a fs -> mappend <$> (const $ Sum a) <*> fs) (const mempty) (replicate 100000000 1) 4
Sum {getSum = *** Exception: stack overflow
-- ([Int], [Int])
ghci> foldr (\a fs -> mappend <$> (pure $ [a]) <*> fs) ([], mempty) (replicate 100000000 1)
*** Exception: stack overflow
-- Int -> [Int]
ghci> foldr (\a fs -> mappend <$> (const $ [a]) <*> fs) (const mempty) (replicate 100000000 1) 4
[1,1,1,1,1,1,1...
concatMapM
has multiple problems. For one, it has a rather confusing name for a function defined like this:There is nothing related to
M
here. Moreover, the user might thinkconcatMapM
can be used with a strict monoid likeSum
, but that would result in a space leak, because an entire container is traversed first bytraverse
and only then flattened byfold
. And the type signature is more restrictive than it should be: there is no need forTraversable
--Foldable
is enough as it's just applicativefoldMap
.So it should be defined something like this (modulo that
Container
thing):But there indeed exists a notion of monadic folding which is useful for folding into a strict monoid:
You can define all those
andM
,orM
,allM
,anyM
in terms offoldMapM
just likeand
,or
,all
,any
are defined in terms offoldMap
.