cortoproject / corto

A hierarchical object store for connecting realtime machine data with web applications, historians & more
https://www.corto.io
MIT License

Add cycle detector to corto #665

Closed by SanderMertens 6 years ago

SanderMertens commented 6 years ago

Corto uses reference counting to clean up memory automatically. Naive reference counting cannot reclaim objects that reference each other in a cycle, however, because their reference counts never drop to zero. This can result in memory leaks.
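To make the failure mode concrete, here is a minimal sketch of a refcounted object in plain C. The `Obj` struct and the `obj_*` helpers are hypothetical and only illustrate the mechanism; they are not corto's actual object layout or API.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical refcounted object; NOT corto's real object header. */
typedef struct Obj {
    int refcount;
    struct Obj *ref;            /* one strong reference to another object */
} Obj;

static int live_objects = 0;    /* allocated but not yet freed */

static Obj *obj_new(void) {
    Obj *o = calloc(1, sizeof *o);
    assert(o != NULL);
    o->refcount = 1;            /* the creator holds the first reference */
    live_objects++;
    return o;
}

static void obj_release(Obj *o);

/* Point o at target, acquiring the new reference and releasing the old one. */
static void obj_set_ref(Obj *o, Obj *target) {
    Obj *old = o->ref;
    if (target) target->refcount++;
    o->ref = target;
    if (old) obj_release(old);
}

/* Drop one reference; free the object when the count reaches zero. */
static void obj_release(Obj *o) {
    if (--o->refcount == 0) {
        if (o->ref) obj_release(o->ref);
        free(o);
        live_objects--;
    }
}

/* Returns how many objects are still alive after the creator has
 * released both of its references. */
int leaked_after_release(int make_cycle) {
    live_objects = 0;
    Obj *a = obj_new();
    Obj *b = obj_new();
    obj_set_ref(a, b);                  /* a -> b */
    if (make_cycle) obj_set_ref(b, a);  /* b -> a closes the cycle */

    obj_release(a);
    obj_release(b);
    int leaked = live_objects;

    if (leaked) {
        /* Break the cycle by hand so the demo itself frees its memory. */
        Obj *held = b->ref;
        b->ref = NULL;
        obj_release(held);
    }
    return leaked;
}
```

In this sketch `leaked_after_release(0)` returns 0, while `leaked_after_release(1)` returns 2: each object in the cycle keeps the other's refcount at 1, so neither is ever freed without outside help.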

Because corto is meant to be used in realtime, low-footprint environments, running a garbage collector alongside corto applications is undesirable, as it would make performance less predictable. For that reason, applications should take care not to introduce cycles in data structures that are created and deleted in the main loop.

Inevitably, however, some objects will have cycles. Metadata in particular inherently has cycles, as types can have forward declarations. Parent-child and typeof relationships can also introduce cycles, for example when a parent keeps a list of (some of) its children: because each child increases the refcount of its parent, the list entry and the child's reference to its parent together form a cycle.

This rarely poses issues in the main loop. At shutdown, however, users may find that a large number of objects are reported as "leaking", which complicates memory leak analysis. To make this process easier, corto should ideally find and clean up cycles.

Because it is meant solely for analysis purposes, cycle detection only needs to run at shutdown, so runtime performance is not affected. Cycle detection should also be optional: detecting cycles in a large dataset may take considerable processing, which can significantly increase the duration of short-lived processes (like test cases). Typically, cycle detection should only be turned on when doing memory leak analysis.
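One possible shape for such an optional shutdown pass is sketched below. It is not a proposal for corto's actual implementation: the `Obj` record, `find_cycle_garbage` and the `enabled` flag are all hypothetical. The technique is the reference-subtraction approach also used by CPython's cycle collector. For every object still alive at shutdown, subtract the references that come from other live objects. Anything with a count left over is a genuine leak (a missing release), and everything reachable from it is kept alive by that leak. What remains is held only by cycles.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_REFS 4

/* Hypothetical object record; NOT corto's real object header. */
typedef struct Obj {
    int refcount;                /* total strong references to this object */
    struct Obj *refs[MAX_REFS];  /* outgoing strong references */
    int nrefs;
    int external;                /* scratch: refcount minus internal refs */
    bool reachable;              /* scratch: kept alive by a genuine leak */
} Obj;

/* Mark o and everything it references as kept alive by a genuine leak. */
static void mark(Obj *o) {
    if (o->reachable) return;
    o->reachable = true;
    for (int i = 0; i < o->nrefs; i++) mark(o->refs[i]);
}

/* Scan the objects still alive at shutdown. Writes the objects held only
 * by cycles to `out` and returns how many there are. Does nothing when
 * the (hypothetical) `enabled` option is off. */
size_t find_cycle_garbage(Obj **live, size_t n, bool enabled, Obj **out) {
    if (!enabled) return 0;

    for (size_t i = 0; i < n; i++) {
        live[i]->external = live[i]->refcount;
        live[i]->reachable = false;
    }
    /* Subtract every reference that originates inside the live set. */
    for (size_t i = 0; i < n; i++)
        for (int r = 0; r < live[i]->nrefs; r++)
            live[i]->refs[r]->external--;

    /* A leftover count means someone outside the set forgot a release. */
    for (size_t i = 0; i < n; i++)
        if (live[i]->external > 0) mark(live[i]);

    size_t count = 0;
    for (size_t i = 0; i < n; i++)
        if (!live[i]->reachable) out[count++] = live[i];
    return count;
}

/* Demo: a <-> b form a cycle; c was never released and references d. */
size_t demo_shutdown_scan(bool enabled) {
    Obj a = {0}, b = {0}, c = {0}, d = {0};
    a.refs[a.nrefs++] = &b; b.refcount++;   /* a -> b */
    b.refs[b.nrefs++] = &a; a.refcount++;   /* b -> a */
    c.refcount = 1;                         /* reference never released */
    c.refs[c.nrefs++] = &d; d.refcount++;   /* c -> d */

    Obj *live[] = { &a, &b, &c, &d };
    Obj *garbage[4];
    return find_cycle_garbage(live, 4, enabled, garbage);
}
```

With the option enabled, the demo reports `a` and `b` as cycle garbage and leaves `c` and `d` as the genuine leak to investigate. With it disabled, the scan is skipped entirely, so short-lived processes pay nothing. The sketch assumes every outgoing reference points at an object in the live set, which holds at shutdown because a referenced object cannot have been freed.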