feldera / feldera

The Feldera Incremental Computation Engine
https://feldera.com

Track memory usage of batches and spine more efficiently #2462

Open ryzhyk opened 1 week ago

ryzhyk commented 1 week ago

Currently, we use the SizeOf trait to estimate the heap usage of batches and traces. The default implementation of this trait scans the entire batch and calls SizeOf methods recursively on each key and value (this sometimes gets optimized away when all keys/values have a size known at compile time, but in general we cannot count on that).
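A minimal sketch of the pattern, assuming a hand-rolled `HeapSize` trait standing in for the real `size-of` crate API, to show why the default measurement requires a full scan:

```rust
/// Hypothetical trait standing in for SizeOf; not the `size-of` crate's API.
trait HeapSize {
    /// Bytes owned on the heap, excluding the inline size of `self`.
    fn heap_size(&self) -> usize;
}

impl HeapSize for String {
    fn heap_size(&self) -> usize {
        // Allocator padding/metadata are unknown here, so even this is
        // already an approximation.
        self.capacity()
    }
}

impl HeapSize for u64 {
    fn heap_size(&self) -> usize {
        0 // size fully known at compile time; nothing on the heap
    }
}

/// Visiting every (key, value) pair makes the cost proportional to the
/// batch size, which is what makes frequent sampling expensive.
fn batch_heap_size<K: HeapSize, V: HeapSize>(pairs: &[(K, V)]) -> usize {
    pairs.iter().map(|(k, v)| k.heap_size() + v.heap_size()).sum()
}
```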

This means that we can't sample heap usage frequently, as it becomes very expensive.

Note that this is only a problem for in-memory batches. Tracking disk usage of persistent batches is presumably trivial, as these are just contiguous files.

Let's collect some ideas for fixing this under this issue.

  1. The first thing that comes to mind: instead of computing the heap size on demand, pre-compute it while constructing the batch and store it as metadata with the batch. The cost is still proportional to the size of the batch (since we must call SizeOf on each key and value while building it), but we pay it only once, during construction, and it avoids the need for an additional scan.

  2. A variation of the previous approach: compute the heap profile of a batch on demand, but cache it inside the batch, so that the next time it is requested we can simply return the previously computed value. This helps a lot with large batches that sit inside a trace for a long time.

  3. A sampling-based approach. We don't need the heap profile to be perfectly accurate; in fact, even the current implementation of SizeOf is an approximation, since we don't know exactly how much padding and metadata the memory allocator adds to each allocation. So instead of precisely measuring the size of each key and value in a batch, we can sample, e.g., 1% of values, compute the average, and multiply by the number of values in the batch. This can be combined with either 1 or 2 (see the sketch after this list).
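Here is a minimal sketch of how options 2 and 3 might combine, reusing the hypothetical `HeapSize` trait from the sketch above; `Batch`, `pairs`, and `heap_size_estimate` are illustrative names, not Feldera's actual types:

```rust
use std::sync::OnceLock;

/// Hypothetical batch wrapper; not Feldera's real batch type.
struct Batch<K, V> {
    pairs: Vec<(K, V)>,
    // Option 2: compute the estimate at most once and cache it.
    cached_heap_size: OnceLock<usize>,
}

impl<K: HeapSize, V: HeapSize> Batch<K, V> {
    /// Option 3: measure every `stride`-th element and extrapolate
    /// (a stride of 100 samples roughly 1% of the batch). The cached
    /// value makes repeated requests O(1); note that the stride of
    /// the first call is the one that sticks.
    fn heap_size_estimate(&self, stride: usize) -> usize {
        let stride = stride.max(1);
        *self.cached_heap_size.get_or_init(|| {
            let sampled: usize = self
                .pairs
                .iter()
                .step_by(stride)
                .map(|(k, v)| k.heap_size() + v.heap_size())
                .sum();
            // Rough extrapolation from the sample to the whole batch.
            sampled * stride
        })
    }
}
```

If batches are immutable once built, the cached estimate never goes stale, and the same field could just as well be filled eagerly by the builder (option 1).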

@blp, @gz, @abhizer

gz commented 1 week ago

I wouldn't expect it to add a lot of overhead if you measure during construction of the batch and then cache the result (you already have to memcpy the thing anyway)?

size-of for file-based batches should already look at the file size (I believe); otherwise it shouldn't be too hard to fix.
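For a persistent batch backed by a contiguous file, the size can come straight from filesystem metadata rather than a scan. A sketch using only the standard library; `persisted_batch_size` and `path` are illustrative names, not Feldera's API:

```rust
use std::{fs, io, path::Path};

/// Report the on-disk size of a file-backed batch by reading the file
/// length from its metadata, without touching the contents.
fn persisted_batch_size(path: &Path) -> io::Result<u64> {
    Ok(fs::metadata(path)?.len())
}
```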

ryzhyk commented 1 week ago

> I wouldn't expect it to add a lot of overhead if you measure during construction of the batch and then cache the result (you already have to memcpy the thing anyway)?

We can start by measuring this. Yes, we touch all the data anyway, and today we make a deep clone, so the additional cost may be low.