Closed allada closed 7 months ago
This one seems overly complex. The design burden and maintenance overhead of the memory store feels high. When users hit this bottleneck today, it seems it might be more efficient and safer to use a caching mechanism from a database.
The first proposed solution is only ~30 lines of code. It is extremely simple and nearly no down side.
SGTM.
Merged with #289
Here's some stats on real-world usage of Nativelink:
Since rust is not a garbage collected language it is not allowed to pack memory, so we need to deal with memory fragmentation ourselves. Normally programs don't need to think much about this but in this case we do.
Details on how memory works:
Linux allocates memory in pages, often 4k. The memory manager will pack the memory for you by slotting the returned points from malloc usually from a pool of already allocated pages. This allows code to request (lets say) 100 bytes of data and that 100 bytes may share a page with other pointers in memory.
The down side is that small allocations that live for a long time might end up keeping the entire page (4k) allocated by the kernel. It is rare for an allocator to recycle memory pages that were full at one point; this means if a page was ever full, it will release the page back to the kernel once no data is allocated in the page instead of keep trying to pack new allocations into it.
The problem
Looking at the graph above, you can see that roughly 50% of all of the CAS is ~128bytes or less. This means that the vast majority of items are super tiny and only use a little bit of memory. Since memory store frequently evicts items that have not been used recently from the store, it means that it's very common for an item to be put into it and shortly after to be evicted because the object is rarely used.
So, if a 4k page was allocated, then 32 x 128byte objects were loaded into it then an hour later 31 of the items were evicted, but 1 item is used very frequently, it means that we are now using 4k of memory for 128 bytes of data. Multiply this out and you can see how inefficient this can become.
Possible solutions
Some things to keep in mind
Lots of items are very small, if 50% are 128bytes and a sha256/blake3 is 40 byte, this means just copying the hash + a pointer is about +37% for the lower 50%.