Gabriella439 / foldl

Composable, streaming, and efficient left folds
BSD 3-Clause "New" or "Revised" License
160 stars 51 forks source link

Readme: about the space leak #73

Open clojj opened 8 years ago

clojj commented 8 years ago

For Haskell beginners/intermediates it would be helpful, if the issue of space leaks would be explained a little bit more. What exactly is the 'leak' (usually in other Languages one simply thinks of a memory-leak that'll never get garbagecollected, but is this the situation here ?

That this library is more effiecient by folding in 'one go' is pretty clear, but the advantage concerning the memory-consumption should be made more understandable for beginners, I think

Gabriella439 commented 8 years ago

In this context "leak" means that it does not run in constant space. If you go over the list in one pass then you never materialize more than one element of the list at a time. If you go over the list in two passes then you have to materialize the entire list in memory.

Does that answer your question? I could include a heap profile for both solutions graphically demonstrating the difference in memory usage.

clojj commented 8 years ago

"However, the runtime cannot garbage collect the list because the list is still required for the call to length" But after the pair (sum xs, length xs) is evaluated, the list can in fact be collected, right ?

Sure, a heap profile for comparison would be helpful.

PierreR commented 8 years ago

If you go over the list in one pass then you never materialize more than one element of the list at a time. If you go over the list in two passes then you have to materialize the entire list in memory.

Well well this statement seems in contradiction with the README where we have a case of one pass with space leak ;-)

I could include a heap profile for both solutions graphically demonstrating the difference in memory usage.

Maybe a simple diagram with boxes would improve the explanation better than a heap profile ?

From the README I could imagine the following conversation:

However, this solution will leak space because it goes over the list in two passes.

OK. The fact that it goes over the list in two passes is obvious enough. That's even what would happen in other languages with the same kind of code.

If you demand the result of sum the Haskell runtime will materialize the entire list.

Sure

However, the runtime cannot garbage collect the list because the list is still required for the call to length.

Wait. How does the runtime knows there might be a second call to length ?

-- > Here a little diagram might be useful. Or a few words more ?

That now goes over the list in one pass,

Sure

but will still leak space because the tuple is not strict in both fields!

-- > You will understand this last sentence immediately only if you did grasp the meaning of the last paragraph (Wait ...)

That said, I do love the README. It is very well written and valuable.

Gabriella439 commented 8 years ago

@clojj Yes, after both sum and length are evaluated the list can be garbage collected. The issue is that there is no order that you can evaluate both of them that doesn't involve materializing the entire list, so it will never run in constant space