I have completed the transition from synchronous to asynchronous code with Seastar (see #5). Opening this issue to explore how we can transition to a sharded architecture, whereby each core will be exclusively managing a key range.
The Seastar framework seamlessly supports running a sharded HTTP server via the http_server_control class, which creates a distributed http_server object underneath.
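For reference, a minimal sketch of starting such a server (route registration and error handling elided; the port number is an arbitrary choice for the sketch):

```cpp
#include <seastar/core/app-template.hh>
#include <seastar/http/httpd.hh>

int main(int argc, char** argv) {
    seastar::app_template app;
    return app.run(argc, argv, [] {
        // http_server_control::start() creates an http_server instance on
        // every shard; listen() then makes each instance accept connections.
        // The pointer is deliberately leaked, demo-style, to keep the sketch short.
        auto server = new seastar::httpd::http_server_control();
        return server->start().then([server] {
            return server->set_routes([] (seastar::httpd::routes& r) {
                // register HTTP handlers here
            });
        }).then([server] {
            return server->listen(seastar::socket_address(seastar::ipv4_addr{8080}));
        });
    });
}
```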
Making the KVStore service sharded presents two challenges:

1. The service's constructor and destructor do not run inside a seastar::thread, so future::get() cannot block there. This means that KVStore::{start, stop} can no longer be called from the ctor/dtor and must be called separately. Essentially, I have to revert https://github.com/ndragazis/tinykv/commit/8e4bb566a6ce7a5e646637186a059883e3844b61.
2. The seastar::sharded class that we use to make a service sharded automatically detects and calls the service's stop() function when we call seastar::sharded::stop(), if it exists. So, we either have to make KVStore::stop() idempotent (a no-op when called more than once), or not call it explicitly from the application code (see the sketch below).
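Here is a minimal sketch illustrating both points; the KVStore member functions shown are assumptions for illustration, not the actual API of this codebase:

```cpp
#include <seastar/core/future.hh>
#include <seastar/core/sharded.hh>

class KVStore {
    bool _stopped = false;
public:
    // Must be called explicitly after construction: the constructor does
    // not run inside a seastar::thread, so it cannot wait on futures.
    seastar::future<> start() {
        // replay the WAL, load SSTable metadata, etc.
        return seastar::make_ready_future<>();
    }

    // seastar::sharded<KVStore>::stop() detects and awaits this method,
    // so it is made idempotent in case application code also calls it.
    seastar::future<> stop() {
        if (_stopped) {
            return seastar::make_ready_future<>();
        }
        _stopped = true;
        // flush the memtable, sync and close the WAL, etc.
        return seastar::make_ready_future<>();
    }
};

seastar::sharded<KVStore> store;

seastar::future<> init() {
    // sharded::start() only constructs one KVStore per shard;
    // the service's own start() must then be invoked separately.
    return store.start().then([] {
        return store.invoke_on_all(&KVStore::start);
    });
}
```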
There are two common approaches to partitioning the data:

- Partition by key range: preferred when one wants to support range queries, but it is hard to partition the data uniformly.
- Partition by hash range: provides better data distribution.
I will go with hash range partitioning, mainly because it provides better data distribution; there is also an example with hash range partitioning in Seastar's codebase, in memcached.cc.
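A sketch of how requests would be routed to the owning shard, assuming string keys (the owner_shard helper and the KVStore::get signature are hypothetical, following the same idea as memcached.cc):

```cpp
#include <seastar/core/future.hh>
#include <seastar/core/sharded.hh>
#include <seastar/core/smp.hh>
#include <seastar/core/sstring.hh>
#include <functional>
#include <string>

// Hypothetical per-shard read API, for illustration only.
class KVStore {
public:
    seastar::future<seastar::sstring> get(const std::string& key) {
        return seastar::make_ready_future<seastar::sstring>("");  // stub
    }
};

// Map the key's hash onto the available shards.
unsigned owner_shard(const std::string& key) {
    return std::hash<std::string>()(key) % seastar::smp::count;
}

// The HTTP handler (running on whichever shard accepted the connection)
// forwards the request to the shard that owns the key.
seastar::future<seastar::sstring> get(seastar::sharded<KVStore>& store,
                                      std::string key) {
    auto shard = owner_shard(key);
    return store.invoke_on(shard, [key = std::move(key)] (KVStore& kv) {
        return kv.get(key);
    });
}
```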
In the current single-sharded design, the file structure is the following:
.tinykv/
├── sstable_1
├── sstable_2
├── wal
└── wal_1
To make the kv store multi-sharded, each shard should have its own WALs and SSTables. The new file structure will look like this:
.tinykv
├── shard_0
│ ├── sstable_1
│ ├── sstable_2
│ ├── sstable_3
│ └── wal
├── shard_1
│ └── wal
├── shard_2
│ ├── sstable_1
│ └── wal
├── shard_3
│ └── wal
├── shard_4
│ └── wal
├── shard_5
│ ├── sstable_1
│ └── wal
├── shard_6
│ └── wal
└── shard_7
└── wal
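Each shard can derive its directory from its own shard id, so no cross-shard coordination is needed for file access. A sketch (the root path default and helper name are assumptions):

```cpp
#include <seastar/core/smp.hh>
#include <string>

// Hypothetical helper: each shard reads and writes only under its own
// subdirectory, matching the layout above.
std::string shard_dir(const std::string& root = ".tinykv") {
    return root + "/shard_" + std::to_string(seastar::this_shard_id());
}
```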