jberryman / unagi-chan

A haskell library implementing fast and scalable concurrent queues for x86, with a Chan-like API
BSD 3-Clause "New" or "Revised" License

Memory consumption of bounded Chan #26

Open qrilka opened 6 years ago

qrilka commented 6 years ago

As I understand it, unagi-chan creates segments whose size equals the queue size passed to the newChan function. That means that even with fast readers and an almost-empty queue we can easily have up to queue-size chan elements in memory that can't be garbage collected. Is there a good way to have a large queue size without wasting a lot of memory? Or maybe I'm reading the code incorrectly and there is some other explanation for why https://gist.github.com/qrilka/1f7cd7c2254b0e267de77d3c61542d43 consumes a significant amount of RAM.
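
For context, here is a minimal sketch of the scenario described above (not the gist itself; it only assumes the public Control.Concurrent.Chan.Unagi.Bounded API): a large bound, a reader that keeps up, and residency that can still approach the full bound because already-read elements stay reachable from their segment.

    import qualified Control.Concurrent.Chan.Unagi.Bounded as U
    import qualified Data.ByteString.Char8 as BS
    import Control.Concurrent (forkIO)
    import Control.Monad (forever)

    main :: IO ()
    main = do
      -- a large bound means a correspondingly large array segment
      (inChan, outChan) <- U.newChan 1000000
      -- fast reader: the queue itself stays almost empty
      _ <- forkIO $ forever $ do
        _ <- U.readChan outChan
        return ()
      -- writer: elements that were already read remain reachable from the
      -- current segment, so residency can approach bound * element size
      forever $ U.writeChan inChan (BS.replicate 1024 'x')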

jberryman commented 6 years ago

Hi qrilka, thanks for reporting.

Yes, I think your understanding is correct; a successful read operation does not write to the array segment to remove its element, so while the segment persists, those values will not be GC'd. See: https://github.com/jberryman/unagi-chan/blob/master/src/Control/Concurrent/Chan/Unagi/Bounded/Internal.hs#L259

This means that on every major GC you pay the cost of copying however many elements are in the segment at that time. If you expect high throughput then this cost may be less than the latency/memory pressure of doing a write for every read. Furthermore, you have the additional memory residency of the retained elements (how much of a problem this is depends on their size, of course).

The first issue might be solved using the new "compact regions" feature of GHC, so that elements are only traversed once... but then you always have to pay for exactly one "manual GC" on every write. It's difficult to design a benchmark that would let us make these sorts of decisions without realistic use-cases.
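
A rough sketch of that compact-regions idea, purely as an illustration (writeCompacted is a hypothetical wrapper, not part of unagi-chan; it uses the ghc-compact package and only works for values without functions or mutable objects):

    import qualified Control.Concurrent.Chan.Unagi.Bounded as U
    import qualified GHC.Compact as Compact  -- ghc-compact package

    -- Hypothetical wrapper: pay one copy (the "manual GC") per write so that
    -- later major GCs no longer traverse the element's heap structure.
    writeCompacted :: U.InChan a -> a -> IO ()
    writeCompacted inChan x = do
      c <- Compact.compact x                    -- copies x into a compact region
      U.writeChan inChan (Compact.getCompact c)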

One thing we can certainly do is document this better and add another read function that clears out its cell. When I have time I also want to try rewriting Bounded to use a proper ring buffer (#20), which would also resolve this.

jberryman commented 6 years ago

One thing we can certainly do is document this better and add another read function that clears out its cell.

Oh actually this makes dupChan impossible, so we'd need a new chan type with the new (or both new and old) read behaviors, and no dupChan.

Feel free to give it a go if you want, and I'll try to get it merged promptly; I probably won't be able to work on this myself in the near future.

qrilka commented 6 years ago

Actually we've followed the katip authors and switched to TBQueue from the stm package (a rough sketch of that setup is below). BTW, with this queue organization, are these words from the haddock of dupChan:

... may fall arbitrarily behind.

still correct?
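
For reference, the TBQueue-based setup mentioned above amounts to roughly this (a minimal sketch; the actual katip-style code will differ):

    import Control.Concurrent.STM (atomically)
    import Control.Concurrent.STM.TBQueue (newTBQueueIO, readTBQueue, writeTBQueue)

    main :: IO ()
    main = do
      -- bounded queue from stm: writers retry when the queue is full, and
      -- only the elements currently enqueued are retained
      q <- newTBQueueIO 4096
      atomically $ writeTBQueue q "hello"
      msg <- atomically $ readTBQueue q
      putStrLn msg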

jberryman commented 6 years ago

I might not understand what you're asking, but yes that comment is correct.

qrilka commented 6 years ago

The question is about the moment when a particular segment (for some duped chan) gets freed; I couldn't figure out the logic for that yet.

jberryman commented 6 years ago

Oh okay, so all of the variants so far use a linked list of array segments. These are freed by the garbage collector. An OutChan is a counter + a pointer to a particular segment, so the segments (and the "elements" they point to) that cannot be collected are all of the ones that are "held" by the slowest reader. Just as if you forked several threads that were folding over the same list.

A dupChan is a trivial operation: it just creates a copy of the read counter and returns it in an OutChan with the same pointer to the head segment.
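
To illustrate that description (a minimal sketch against the Unagi.Bounded API): each OutChan advances independently, and whatever segment the slowest reader still points to, along with the elements in it, stays alive.

    import qualified Control.Concurrent.Chan.Unagi.Bounded as U
    import Control.Monad (replicateM)

    main :: IO ()
    main = do
      (inChan, fastOut) <- U.newChan 1024
      slowOut <- U.dupChan inChan        -- its own counter, same segments
      mapM_ (U.writeChan inChan) [1 .. 10 :: Int]
      xs <- replicateM 10 (U.readChan fastOut)
      print xs                           -- [1..10]: this reader has caught up
      y <- U.readChan slowOut
      print y                            -- 1: this reader has fallen behind, so
                                         -- the segment holding 1..10 is still held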

domenkozar commented 9 months ago

I've also noticed that it allocates memory when the queue is created, even though it doesn't contain any elements at all.