DataDog / saluki

An experimental toolkit for building telemetry data planes in Rust.
Apache License 2.0

Support mutation of tags in a memory efficient way. #127

Open tobz opened 1 month ago

tobz commented 1 month ago

Context

Currently, we have support for "resolved" contexts -- a fixed name and set of tags that are cached and shareable -- as well as mutating the tags of those contexts. When tags are mutated, we create a copy of the context to allow the original to continue to exist and be shared, while the copy gets mutated.

In practice, there are two issues with the current approach for allowing mutation:

Essentially, we want to minimize both the number and the size of the allocations required to support mutating an existing context.

tobz commented 1 month ago

Tags and their memory layout/footprint

Overall, cloning a context requires two allocations: one for the ContextInner, which is wrapped in an Arc<T> and so must be cloned into its own backing allocation, and one for the TagSet, which is a newtype wrapper over Vec<Tag>.
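To make the cost concrete, here is a rough sketch of the copy-on-mutate pattern described above. The type shapes are simplified stand-ins rather than the real saluki definitions: the `name` field, the `Arc<str>`-backed `Tag`, the `add_tag`/`tag_count` helpers, and the use of `Arc::make_mut` are all assumptions made for illustration.

```rust
use std::sync::Arc;

// Hypothetical stand-in; the real Tag type is laid out differently.
#[derive(Clone, Debug, PartialEq)]
struct Tag(Arc<str>);

// Newtype wrapper over Vec<Tag>, as described above.
#[derive(Clone, Debug, Default)]
struct TagSet(Vec<Tag>);

#[allow(dead_code)]
#[derive(Clone, Debug)]
struct ContextInner {
    name: Arc<str>,
    tags: TagSet,
}

#[derive(Clone, Debug)]
struct Context {
    inner: Arc<ContextInner>,
}

impl Context {
    fn new(name: &str, tags: &[&str]) -> Self {
        let tags = TagSet(tags.iter().map(|t| Tag(Arc::from(*t))).collect());
        Context {
            inner: Arc::new(ContextInner { name: Arc::from(name), tags }),
        }
    }

    // Mutating a shared context forces a copy. When the Arc is shared,
    // `Arc::make_mut` clones ContextInner into a fresh Arc allocation, and
    // cloning the inner TagSet copies the Vec<Tag> backing buffer as well.
    fn add_tag(&mut self, tag: &str) {
        let inner = Arc::make_mut(&mut self.inner);
        inner.tags.0.push(Tag(Arc::from(tag)));
    }

    fn tag_count(&self) -> usize {
        self.inner.tags.0.len()
    }
}
```

Cloning a `Context` on its own is cheap here (a reference count bump); the copies only happen on the first mutation of a shared context, and every existing tag is copied even though only one tag was added.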

A Tag alone is 40 bytes, so even with just a few tags, the backing allocation for Vec<Tag> might jump up to a decently large size class (e.g. seven tags would be 280 bytes, so we might get a 512-byte chunk for our backing allocation).
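The arithmetic can be checked with a small sketch. The `Tag` below is just a 40-byte placeholder matching the size quoted above, and `pow2_size_class` assumes an allocator that rounds requests up to the next power of two; real allocators often have finer-grained size classes, so the actual chunk size may differ.

```rust
use std::mem::size_of;

// Placeholder with the 40-byte size quoted above; not the real Tag layout.
#[allow(dead_code)]
struct Tag([u8; 40]);

// Bytes needed for the Vec<Tag> backing buffer holding exactly `tag_count` tags.
fn backing_bytes(tag_count: usize) -> usize {
    tag_count * size_of::<Tag>()
}

// Assumption: the allocator rounds each request up to the next power of two.
fn pow2_size_class(bytes: usize) -> usize {
    bytes.next_power_of_two()
}
```

Under that assumption, `backing_bytes(7)` is 280 and `pow2_size_class(280)` is 512, so nearly half of the chunk would go unused.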

Adding tags vs removing tags

There are some potential optimizations we could do, but they depend specifically on how the tags are modified.

If we only ever needed to add tags, then we could consider an approach that looks more like an intrusive list, where we simply link one TagSet to another, so that we could have "fixed" tag sets (such as the initial one created when decoding the metric) and then another that holds any tags we add later. In this way, we would only pay for the tags we add after the fact, rather than also cloning the tags that we don't end up modifying.
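A minimal sketch of the add-only idea follows. It is a simplified two-link chain rather than a true intrusive list, and `ChainedTags`, its methods, the `tag` helper, and the `Arc<str>`-backed `Tag` are all hypothetical names invented for illustration.

```rust
use std::sync::Arc;

// Hypothetical stand-in; the real Tag type is laid out differently.
#[derive(Clone, Debug, PartialEq)]
struct Tag(Arc<str>);

// Convenience constructor for the example.
fn tag(s: &str) -> Tag {
    Tag(Arc::from(s))
}

// A "fixed" shared tag set linked to a private set of later additions.
struct ChainedTags {
    // Shared and never modified, e.g. the tags decoded with the metric.
    base: Arc<[Tag]>,
    // Only the tags added after the fact; an empty Vec does not allocate.
    added: Vec<Tag>,
}

impl ChainedTags {
    fn new(base: Arc<[Tag]>) -> Self {
        ChainedTags { base, added: Vec::new() }
    }

    // Adding a tag touches only the private list; the base is left alone.
    fn push(&mut self, t: Tag) {
        self.added.push(t);
    }

    // Walk the base tags first, then the additions.
    fn iter(&self) -> impl Iterator<Item = &Tag> + '_ {
        self.base.iter().chain(self.added.iter())
    }
}
```

Any number of `ChainedTags` values can point at the same base through `Arc::clone`, and each one pays only for its own additions. Nothing in this shape can remove a tag from the base without copying it first.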

Naturally, though, this only works if we only ever add tags. If we have to support removing tags, that could mean having to modify existing tag sets, and we're back to having to clone them entirely.