tontinton / dbeel

A distributed thread-per-core document database
Apache License 2.0
497 stars 19 forks source link

Struggling to understand load balancing with thread-per-core shards #56

Open numinnex opened 5 months ago

numinnex commented 5 months ago

Hey,

Really neat project, reading through it felt like a breeze, it's some of the most beautiful Rust I've ever seen.

From my understanding of the thread per core architecture, one is supposed to partition load in a way, where requests can be routed to specific cores(shards) to perform the operation (for example by using the partition id as the identifier of shard that's supposed to serve the request) and only communicate via channels (no shared state). This guarantees some benefits such as lack of need for locks.

Reading through your code, I can't seem to find such router, instead there is a send_request_to_local_shards that on request received for example CreateCollection, sends message to all of the shards except the currently used one, which appears like it performs the same operation N times locally ? I wonder if is there something that I am missing, maybe some detail about glommio, I am using monoio for my project and currently looking around for some inspiration. As of right now my main source of knowledge is project Sphinx by Pekka Enberg.

tontinton commented 3 months ago

Oh I missed this issue, only now noticed it, thanks for the kind words :)

When using gossip to relay messages, I thought, why should I simply rely on gossip to eventually reach all shards when I can instead save on bandwidth and send only to one of the shards, and then relay locally using send_request_to_local_shards.

I basically did it as an optimization.

It's not perfect, what if that shard is unresponsive while others are? But for this project I've decided that's how I wanted to roll with it. Of course, this hypothesis needs to be tested in a highly scaled cluster running a bunch of load tests, which I haven't done.

Hope that answers your questions.

PS: anything Pekka Enberg makes is most likely a good source, so you're on the safe side following his methods.