Open jon-chuang opened 3 years ago
Hey @jon-chuang, thank you for opening the issue! Indeed, I'm currently working on pointer swizzling, which has caused a lot of architectural changes that I've needed to step back from a few times and model more carefully, but it's moving along, and I hope to spend a bit of effort stabilizing some of these changes through intensive testing before cutting a new release that includes them. Pointer swizzling opens up so many cool opportunities all over the place.
In general, caching is a major area I'd like to focus on over the next year. I'm curious to evaluate some trial versions of the LeanStore FIFO, and to see if there are aspects that combine nicely with techniques from W-TinyLFU or in some cases CLOCK (since generating metadata snapshots walks the page table anyway, we may benefit from piggy-backing some work onto this iteration).
io_uring is currently only enabled with a feature flag, but over time I think I'll get rid of the `rio` dependency and make sled's io_uring usage fully internal, so I can cut out a lot of the unnecessary code and tune things more for sled's specific use case. After that happens, it will also be used for scatter-gathering read fragments.
Hey @spacejam, thanks for the insightful replies. Some thoughts:
Here are some ambiguities I have about design:
So to clarify: The cost of an eviction to cold buffer (incurred for ~10% of hot pages, less if leveraging probabilistic LFU methods like TinyLFU. Let’s aim for… 1%?):
`None`, the thread/async task serializes the I/O operation into a dedup hashmap, which has an SSD I/O translation layer for `pageid->(fileid, offset)` and then the I/O queue. If `Some(Page)`, remove the page from FIFO and the cold buffer hashmap, and reswizzle the pointer.
@jon-chuang: before going too deep into implementing an eviction policy, I'd recommend writing an implementation using Caffeine's simulator. This way you can use existing traces or replay your own to compare hit rates across a variety of workloads. As the code is only the raw algorithm (e.g. no concurrency control logic), it will be faster to iterate on until you are satisfied with the tradeoffs. A thorough analysis should help show if your design ideas are robust against scans, pollution, etc. before @spacejam dives into the more complex implementation that handles all the other needs of this project.
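To make the "raw algorithm only" idea concrete, here is a minimal Rust sketch of a trace replayer that compares hit rates between two policies. It is not Caffeine's simulator (which is a Java tool), and the `Policy` trait, the `Fifo` and `Lru` structs, `hit_rate`, and the synthetic trace are all hypothetical names made up for illustration. There is deliberately no concurrency control, so a policy can be swapped or tweaked quickly.

```rust
use std::collections::{HashSet, VecDeque};

/// Hypothetical minimal policy interface: `access` returns true on a hit.
trait Policy {
    fn access(&mut self, page: u64) -> bool;
}

/// Plain FIFO: hits do not change the eviction order.
struct Fifo {
    cap: usize,
    queue: VecDeque<u64>,
    resident: HashSet<u64>,
}

impl Fifo {
    fn new(cap: usize) -> Self {
        assert!(cap > 0, "capacity must be positive");
        Fifo { cap, queue: VecDeque::new(), resident: HashSet::new() }
    }
}

impl Policy for Fifo {
    fn access(&mut self, page: u64) -> bool {
        if self.resident.contains(&page) {
            return true;
        }
        if self.queue.len() == self.cap {
            if let Some(victim) = self.queue.pop_front() {
                self.resident.remove(&victim);
            }
        }
        self.queue.push_back(page);
        self.resident.insert(page);
        false
    }
}

/// Naive LRU: front of `order` is least recently used.
/// Lookup is O(n), which is acceptable for a simulation sketch.
struct Lru {
    cap: usize,
    order: VecDeque<u64>,
}

impl Lru {
    fn new(cap: usize) -> Self {
        assert!(cap > 0, "capacity must be positive");
        Lru { cap, order: VecDeque::new() }
    }
}

impl Policy for Lru {
    fn access(&mut self, page: u64) -> bool {
        if let Some(pos) = self.order.iter().position(|&p| p == page) {
            self.order.remove(pos);
            self.order.push_back(page);
            return true;
        }
        if self.order.len() == self.cap {
            self.order.pop_front();
        }
        self.order.push_back(page);
        false
    }
}

/// Replay a trace of page ids and return the fraction of accesses that hit.
fn hit_rate(policy: &mut dyn Policy, trace: &[u64]) -> f64 {
    if trace.is_empty() {
        return 0.0;
    }
    let hits = trace.iter().filter(|&&p| policy.access(p)).count();
    hits as f64 / trace.len() as f64
}

fn main() {
    // Assumed synthetic workload: three hot pages interleaved with a one-off scan.
    let mut trace = Vec::new();
    for i in 0..100u64 {
        trace.push(i % 3);
        trace.push(1000 + i);
    }
    let fifo = hit_rate(&mut Fifo::new(4), &trace);
    let lru = hit_rate(&mut Lru::new(4), &trace);
    println!("FIFO hit rate: {fifo:.3}, LRU hit rate: {lru:.3}");
}
```

Replaying real traces instead of the synthetic one would only require loading page ids into the `trace` vector; the policies themselves stay unchanged.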
Proposed Change:
The major changes are as follows:
`Lru`.
Thus the major outstanding point (apart from optimistic CC) is probably lean eviction. It accounts for a 4-5x increase in multi-threaded throughput in the LeanStore paper.
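As a rough illustration of the cold-buffer lookup discussed earlier in the thread (a miss goes to the I/O path, a hit is removed from the FIFO and the hashmap and re-swizzled), here is a single-threaded Rust sketch. It is not sled's or LeanStore's actual code: `CoolingStage`, `Lookup`, `Page`, and the method names are hypothetical, and the dedup hashmap, the `pageid->(fileid, offset)` translation, and all concurrency control are left out.

```rust
use std::collections::{HashMap, VecDeque};

type PageId = u64;

/// Hypothetical stand-in for a page's in-memory contents.
#[derive(Debug, Clone, PartialEq)]
struct Page(Vec<u8>);

/// What a cold-buffer lookup tells the caller to do next.
#[derive(Debug, PartialEq)]
enum Lookup {
    /// The page was still cooling: the caller re-swizzles its pointer.
    Rescued(Page),
    /// The page is not in memory: the caller must schedule a read.
    NeedsIo(PageId),
}

/// FIFO of unswizzled ("cooling") pages plus a hashmap for lookup by id.
struct CoolingStage {
    cap: usize,
    fifo: VecDeque<PageId>,
    pages: HashMap<PageId, Page>,
}

impl CoolingStage {
    fn new(cap: usize) -> Self {
        assert!(cap > 0, "capacity must be positive");
        CoolingStage { cap, fifo: VecDeque::new(), pages: HashMap::new() }
    }

    /// Move an unswizzled hot page into the cooling stage. If the stage is
    /// full, the oldest cooling page is returned so the caller can write it
    /// back and free its frame.
    fn cool(&mut self, id: PageId, page: Page) -> Option<(PageId, Page)> {
        if self.pages.contains_key(&id) {
            // Already cooling: refresh the contents, keep its FIFO position.
            self.pages.insert(id, page);
            return None;
        }
        let evicted = if self.pages.len() >= self.cap {
            self.fifo
                .pop_front()
                .and_then(|old| self.pages.remove(&old).map(|p| (old, p)))
        } else {
            None
        };
        self.fifo.push_back(id);
        self.pages.insert(id, page);
        evicted
    }

    /// Look a page up on access through an unswizzled pointer.
    fn lookup(&mut self, id: PageId) -> Lookup {
        match self.pages.remove(&id) {
            Some(page) => {
                // Remove from the FIFO too (O(n) here, for simplicity).
                if let Some(pos) = self.fifo.iter().position(|&p| p == id) {
                    self.fifo.remove(pos);
                }
                Lookup::Rescued(page)
            }
            None => Lookup::NeedsIo(id),
        }
    }
}

fn main() {
    let mut stage = CoolingStage::new(2);
    stage.cool(1, Page(vec![1]));
    stage.cool(2, Page(vec![2]));
    println!("{:?}", stage.lookup(1)); // page 1 is rescued
    println!("{:?}", stage.lookup(9)); // page 9 needs I/O
}
```

The linear FIFO removal keeps the sketch short; a real implementation would more likely use an intrusive list or similar structure so that a rescue is O(1).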
Just think... Doesn't that mean possibly, 5B ops in under a minute?
Leanstore