eclipse-cyclonedds / cyclonedds

Eclipse Cyclone DDS project
https://projects.eclipse.org/projects/iot.cyclonedds

Question about garbage collector and memory management #1417

Closed JerrySciangula closed 1 year ago

JerrySciangula commented 1 year ago

Hi everybody,

I would like to get a better grasp of the need for and benefits of using a garbage collector.

Unfortunately, I could not find any online documentation explaining the rationale for this design decision. I also wonder whether there was some prior study of the pros/cons of this approach to memory management, especially with respect to the latency/overhead it introduces.

Could you please help by providing some clue about:

1) What are the main factors that inspired a design using a garbage collector?
2) What are the pros (and maybe the cons) compared with memory management based on other approaches (e.g. reference counting)?

Many thanks

eboasson commented 1 year ago

Hi @JerrySciangula,

First off, the GC in Cyclone is a bit of a special case: it is more like a garbage collector as you find them in real life, where people put the garbage out on the street and the city then comes round to collect it, than like the customary garbage collectors in computing. Blame my lack of imagination for reusing the term ...

In the original design, which was really for an API-less networking service rather than a full-fledged DDS implementation, the aim was to address two unrelated aspects: the cost of freeing complex data structures, and contention between the many threads that access the shared administration.

The first is pretty self-evident, and a classic way of dealing with it is to queue anything to be freed and let a background thread do the actual cleaning up of complex data structures. All it requires is knowing when to queue it.
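For illustration only, a minimal sketch of that deferred-free pattern in C, using pthreads and made-up names (this is not Cyclone's actual code):

```c
/* Minimal sketch of "queue it and let a background thread free it".
   All names are hypothetical; error handling is omitted. */
#include <pthread.h>
#include <stdbool.h>
#include <stdlib.h>

struct gcreq {
  struct gcreq *next;
  void *object;                /* complex structure to be torn down */
  void (*free_fn) (void *);    /* how to tear it down */
};

static struct gcreq *gc_head;
static pthread_mutex_t gc_lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t gc_cond = PTHREAD_COND_INITIALIZER;
static bool gc_stop;           /* set under gc_lock at shutdown, then signal gc_cond */

/* Called from the fast path: constant-time, no freeing happens here. */
void gc_enqueue (void *object, void (*free_fn) (void *))
{
  struct gcreq *req = malloc (sizeof (*req));
  req->object = object;
  req->free_fn = free_fn;
  pthread_mutex_lock (&gc_lock);
  req->next = gc_head;
  gc_head = req;
  pthread_cond_signal (&gc_cond);
  pthread_mutex_unlock (&gc_lock);
}

/* Background thread: does the expensive tear-down off the data path. */
void *gc_thread (void *arg)
{
  (void) arg;
  pthread_mutex_lock (&gc_lock);
  while (!gc_stop || gc_head != NULL)
  {
    while (gc_head == NULL && !gc_stop)
      pthread_cond_wait (&gc_cond, &gc_lock);
    struct gcreq *reqs = gc_head;
    gc_head = NULL;
    pthread_mutex_unlock (&gc_lock);
    while (reqs)
    {
      struct gcreq *next = reqs->next;
      reqs->free_fn (reqs->object);
      free (reqs);
      reqs = next;
    }
    pthread_mutex_lock (&gc_lock);
  }
  pthread_mutex_unlock (&gc_lock);
  return NULL;
}
```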

That's where the second part comes in. Here, you have data coming in from multiple sources and timers expiring. One way to deal with this is to have a global lock (the first prototype way back when actually had one lock and exactly one data structure: an AVL tree 😁), but that is bound to result in heavy contention under load (you're looking at millions of messages per second here).

A particularly common case is where you have some data to process and it contains a GUID. In that common case you look up the GUID in some table, discard the data if you can't find it, or process it if you can. A classic way of doing that is having a hash table with a mutex around it and reference counting on the objects in the hash table, to track when you can actually free an object once it has been removed from the table.
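To make the cost being discussed concrete, here is a rough sketch of that classic mutex-plus-refcount pattern, with hypothetical names (nothing here is Cyclone's API): every message pays for a lock and two reference count updates.

```c
/* Classic pattern: mutex-protected lookup table plus a reference count per
   entry.  Hypothetical names, shown only for contrast with Cyclone's scheme. */
#include <pthread.h>
#include <stdint.h>
#include <stdlib.h>

struct entity {
  uint32_t refc;   /* protected by tab_lock in this naive version */
  /* ... GUID, per-entity state, ... */
};

static pthread_mutex_t tab_lock = PTHREAD_MUTEX_INITIALIZER;

/* Assumed helpers: a lookup that requires tab_lock, and the actual free. */
extern struct entity *table_lookup_locked (const void *guid);
extern void entity_free (struct entity *e);

/* Fast path, executed once per incoming message. */
struct entity *entity_ref_lookup (const void *guid)
{
  pthread_mutex_lock (&tab_lock);
  struct entity *e = table_lookup_locked (guid);
  if (e)
    e->refc++;      /* pin it so it can't be freed while we use it */
  pthread_mutex_unlock (&tab_lock);
  return e;         /* NULL: unknown GUID, caller discards the data */
}

void entity_unref (struct entity *e)
{
  pthread_mutex_lock (&tab_lock);
  uint32_t refc = --e->refc;
  pthread_mutex_unlock (&tab_lock);
  if (refc == 0)    /* last reference dropped after removal from the table */
    entity_free (e);
}
```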

The problem here is that reference counting in C is something I intensely dislike, especially after having had to deal with a lot of it in the past. That's not to say I refuse to touch reference counting in general, but I do try to avoid it for tracking dynamic references (as opposed to counting references in data structures where there really is shared ownership: e.g., all readers and writers hold a reference to a topic, and that is tracked with a reference count on the topic). Besides, atomic reference counting has a fairly high overhead, or at least it did a decade-plus ago when this particular decision was made.

The other bit that bothers me about reference counting is that you always incur a bit of cost just to handle the exceptional case that you know will occur exactly once. I find it interesting how the actual functionality tends to be straightforward, how setting things up correctly tends to be hard, and how cleaning up correctly is by far the most difficult bit. Disciplined reference counting tends to simplify the cleaning up, but at a non-negligible run-time cost and with a lot of overhead when writing the code (in C, that is).

So in Cyclone I decided to try to eliminate as much of the cost in the common path as possible. In consequence, there's a hash table for the lookups and no reference count to touch. The price is that you need a means of finding out when no thread can still be holding a reference. To that end, every thread that touches this shared state is registered and carries a "virtual clock" that it advances whenever it switches between "asleep" (holding no pointers into these shared structures) and "awake" (possibly holding such pointers); a pointer obtained from the hash table may only be used for as long as the thread stays "awake".

The procedure for deleting an entity is therefore (in order): remove it from the hash table, gather the virtual clocks of all threads, and queue the pair (object, clock snapshot) for the garbage collector, which then postpones freeing it until all threads have made sufficient progress. In my experience, compared with the code that made me dislike having reference counting all over the place, these few simple rules result in vastly fewer bugs, and nearly all the cost is incurred in a background thread. (You do have to scan the thread table when deleting something, which is the run-time downside, but it is only incurred on deletion.)
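A rough sketch of that scheme, with made-up names and a fixed-size thread table for brevity (the actual implementation differs in the details): each registered thread bumps an atomic virtual clock when it switches between awake and asleep, deletion snapshots those clocks, and the garbage collector frees the object only once every thread has either been asleep or moved on.

```c
/* Sketch of deferred reclamation driven by per-thread virtual clocks.
   Hypothetical names; in this sketch an odd clock value means "awake". */
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>

#define MAX_THREADS 64

struct thread_state {
  atomic_uint_fast32_t vtime;   /* even: asleep, odd: awake */
};

static struct thread_state threads[MAX_THREADS];

/* A thread may only hold pointers into the shared tables while "awake". */
void thread_awake (struct thread_state *ts)  { atomic_fetch_add (&ts->vtime, 1); }
void thread_asleep (struct thread_state *ts) { atomic_fetch_add (&ts->vtime, 1); }

struct gc_req {
  uint_fast32_t vtimes[MAX_THREADS];  /* clock snapshot taken at deletion */
  void *object;
  void (*free_fn) (void *);
};

/* Deleting an entity: (1) remove it from the hash table (not shown), then
   (2) snapshot the clocks and hand the object to the GC thread. */
struct gc_req *gc_prepare (void *object, void (*free_fn) (void *))
{
  struct gc_req *req = malloc (sizeof (*req));
  req->object = object;
  req->free_fn = free_fn;
  for (int i = 0; i < MAX_THREADS; i++)
    req->vtimes[i] = atomic_load (&threads[i].vtime);
  return req;
}

/* Checked by the GC thread: true once no thread can still hold a pointer
   that it obtained before the object was removed from the table. */
bool gc_may_free (const struct gc_req *req)
{
  for (int i = 0; i < MAX_THREADS; i++)
  {
    uint_fast32_t now = atomic_load (&threads[i].vtime);
    bool was_awake = (req->vtimes[i] & 1) != 0;
    if (was_awake && now == req->vtimes[i])
      return false;   /* still in the same "awake" interval: wait longer */
  }
  return true;
}
```

Once gc_may_free returns true, the object can be torn down in the background, exactly as in the deferred-free sketch earlier.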

A nice thing is that this also allows using a lockless hash table, and once you can afford those, other interesting design options open up. For example, the C API guarantees that the "instance handles" are the same for all readers and writers of topics with the same key type (a useful property for computing natural joins): there's no way I would have dared to put such a hot (mostly reads, not so many writes) data structure, touched by all threads, behind a lock ...

And of course, with threads ticking through these states and it clearly never being a good idea to allow a thread to remain "awake" for long, you have all the machinery in place for liveliness monitoring and deadlock detection. That's disabled by default, but all it takes is a single option and it starts monitoring and dumping stack traces whenever something appears stuck. It has helped find a few bugs quickly.

If it had originally been designed to have an API, it is quite possible it wouldn't have gone down this path, because tracking application threads is such a pain ... but we wanted to reuse the (proven, by then) network stack. Today that puts us in a pretty decent spot where we have a lot of freedom and stuff generally just works.

Those are roughly the considerations ... If anything is unclear (or you just want to know more) feel free to ask.

JerrySciangula commented 1 year ago

Hi @eboasson,

thank you very much for your exhaustive reply! Now I can see why you decided to use such a mechanism. Thank you again!