tediou5 closed this issue 3 months ago
I agree we could potentially do that, but it makes things way more complex with little benefit in my opinion. Just don't run multiple controllers with the same cache group.
@nazar-pc The biggest benefit is that if a controller hits an exception, it won't affect the whole cluster. Configuration also gets a bit easier, since you can use the default group for all controllers without any extra configuration.
As for the added complexity, it really isn't much, because NATS KV storage provides TTL and atomic operations that we can use directly.
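For illustration, here is a minimal sketch of the primitives I mean, assuming the async-nats crate's JetStream KV API (the bucket name, key and controller id are made up for the example):

```rust
use std::time::Duration;

// Sketch of the NATS KV primitives: a bucket-level TTL plus atomic
// create/compare-and-set, which is all leader election needs.
async fn kv_primitives() -> Result<(), Box<dyn std::error::Error>> {
    let client = async_nats::connect("nats://127.0.0.1:4222").await?;
    let jetstream = async_nats::jetstream::new(client);

    // TTL: entries in this bucket expire automatically after `max_age`.
    let kv = jetstream
        .create_key_value(async_nats::jetstream::kv::Config {
            bucket: "controller-leader".to_string(), // hypothetical bucket name
            max_age: Duration::from_secs(10),
            ..Default::default()
        })
        .await?;

    // Atomic create: succeeds only if the key does not exist yet, so exactly
    // one controller can claim leadership for the cache group at a time.
    match kv.create("leader", "controller-1".into()).await {
        Ok(revision) => {
            // We won: keep the claim alive with a compare-and-set update,
            // which fails if anyone else has modified the key in the meantime.
            kv.update("leader", "controller-1".into(), revision).await?;
        }
        Err(_) => {
            // Someone else is the leader; retry after the TTL expires.
        }
    }

    Ok(())
}
```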
Of course, the biggest reason is that I already have an implementation, so I wanted to see whether it's needed here. If you don't think it's necessary, just close the issue.
The thing is that you're supposed to run multiple controllers for different cache groups if you want redundancy. The controller manages caches and responds to queries asking what is where. So it's not just that controllers would need to elect a leader; they would need to cooperate actively to ensure they have exactly the same view of the cluster, or else piece retrieval will be buggy.
NATS KV is part of JetStream, and we only use core NATS for now; we would also love to switch to a P2P solution eventually to avoid the extra bandwidth usage (libp2p is unfortunately too painful to use for that at the moment).
If you link a branch here I can take a look, but it does feel like we'll add a substantial amount of complexity to the implementation with it.
Oh, so that's what you're worried about. Indeed, it would be very complicated if we wanted to implement a fully distributed system. But what I've done so far is actually quite simple: controllers that are not elected never call maintain_caches, they just keep trying to get elected. So there are no data synchronization issues.
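Roughly, each controller runs a loop like the sketch below; try_become_leader, renew_leadership and maintain_caches are stand-ins for the real functions, not the actual code:

```rust
use std::time::Duration;

// Stand-ins for the real controller functions (hypothetical signatures).
async fn try_become_leader() -> bool { /* atomic KV `create`, as sketched above */ false }
async fn renew_leadership() -> bool { /* KV `update` with revision check */ false }
async fn maintain_caches() { /* only ever runs on the current leader */ }

async fn controller_loop() {
    loop {
        if try_become_leader().await {
            // Only the elected leader touches cache state, so there is nothing
            // to synchronize between controllers.
            loop {
                maintain_caches().await;
                if !renew_leadership().await {
                    // Lost leadership (key expired or was taken over):
                    // fall back into the election loop and stop touching caches.
                    break;
                }
            }
        }
        // Not elected: just wait a bit and try to get elected again.
        tokio::time::sleep(Duration::from_secs(5)).await;
    }
}
```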
Sure, I'll link a branch later.
@nazar-pc
The current implementation is a bit difficult to port to P2P: stored_offset is a bit too large, and synchronizing it across clients would be a bit crazy.
However, if you keep the NATS find_piece, you can actually switch to P2P for the subsequent transfers. With NATS KV, each cache broadcasts its socket address, and once the controller is elected leader and synchronization is complete, it writes the id and address of every cache in the cluster to KV. The client watches for metadata changes; to get a piece it first asks the controller via find_piece, then takes the connection corresponding to the returned cache_id and fetches the piece directly over P2P. I implemented this with plain TCP rather than libp2p.
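Here is a rough sketch of the client side of that flow; CacheEntry, the caches.* key layout and connect_to_cache are made up for illustration, and the cache_id is assumed to come from the controller's find_piece reply:

```rust
use std::net::SocketAddr;

// What the leader controller would write into KV for every cache in the
// cluster (illustrative type, not the real one).
#[derive(serde::Serialize, serde::Deserialize)]
struct CacheEntry {
    cache_id: String,
    addr: SocketAddr,
}

// Client side: `cache_id` comes back from the controller's find_piece reply
// over NATS; the piece itself is then fetched from the cache over plain TCP.
async fn connect_to_cache(
    kv: &async_nats::jetstream::kv::Store,
    cache_id: &str,
) -> Result<Option<tokio::net::TcpStream>, Box<dyn std::error::Error>> {
    // Look up the cache's socket address in the registry written by the leader.
    let Some(raw) = kv.get(format!("caches.{cache_id}")).await? else {
        return Ok(None);
    };
    let entry: CacheEntry = serde_json::from_slice(&raw)?;

    // Connect directly to the cache instead of streaming the piece through
    // the NATS server(s).
    Ok(Some(tokio::net::TcpStream::connect(entry.addr).await?))
}
```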
Alternatively, you could use some algorithm to determine where each piece lands (like consistent hashing), but I decided that this is not flexible enough and may not fit the current design.
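For completeness, the consistent-hashing alternative I mean is roughly the following generic sketch (not tied to the farmer code): each cache gets several virtual points on a ring, and a piece lands on the first cache clockwise from its hash.

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::BTreeMap;
use std::hash::{Hash, Hasher};

/// Minimal consistent-hash ring: each cache gets several virtual points on the
/// ring and a piece lands on the first point clockwise from its own hash.
struct HashRing {
    ring: BTreeMap<u64, String>,
}

fn hash_of(value: &impl Hash) -> u64 {
    let mut hasher = DefaultHasher::new();
    value.hash(&mut hasher);
    hasher.finish()
}

impl HashRing {
    fn new(cache_ids: &[String], virtual_nodes: u32) -> Self {
        let mut ring = BTreeMap::new();
        for cache_id in cache_ids {
            for replica in 0..virtual_nodes {
                ring.insert(hash_of(&(cache_id, replica)), cache_id.clone());
            }
        }
        Self { ring }
    }

    /// Which cache a piece index maps to.
    fn cache_for(&self, piece_index: u64) -> Option<&String> {
        let hash = hash_of(&piece_index);
        self.ring
            .range(hash..)
            .next()
            .or_else(|| self.ring.iter().next())
            .map(|(_, cache_id)| cache_id)
    }
}
```

Adding or removing a cache only remaps the pieces adjacent to its points, which is the appeal, but it pins placement to the hash instead of letting the controller decide, which is why I find it less flexible here.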
I'm not yet convinced we need leader election, so let's not do that for now, but I certainly appreciate your thinking in that direction; we may leverage it eventually.
As for P2P I was thinking about making everything P2P without NATS at all. All notifications and requests would be done directly instead of flowing through NATS server(s).
See https://github.com/subspace/subspace/pull/2659 for an initial attempt that was abandoned in favor of NATS.
Right now, if multiple controllers are configured with the same cache-group, bad things happen.
I actually have a simple leader-election implementation built on NATS KV (just run the server with -js to enable JetStream), which allows multiple controllers with the same cache-group: there is only one leader at a time, and if the current leader goes down, the other controllers hold a re-election.
If you think this is a good idea, I think I can put together some code and make a commit.
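To make the re-election part concrete, here is a hedged sketch of how takeover could work on top of the same KV bucket: a crashed leader simply stops renewing and its key expires via the bucket TTL, while a gracefully shutting down leader deletes the key so another controller wins the next create immediately (key names and the shutdown channel are made up for the example).

```rust
use std::time::Duration;

// Sketch: the leader renews well within the bucket's max_age and deletes the
// key on graceful shutdown; followers keep retrying `create` and one of them
// wins once the key is gone (either deleted or expired).
async fn run_as_leader(
    kv: &async_nats::jetstream::kv::Store,
    controller_id: &str,
    mut revision: u64,
    mut shutdown: tokio::sync::oneshot::Receiver<()>,
) -> Result<(), Box<dyn std::error::Error>> {
    loop {
        tokio::select! {
            _ = tokio::time::sleep(Duration::from_secs(3)) => {
                // Heartbeat: compare-and-set renewal; a failure means we lost
                // leadership (e.g. the key expired and someone else claimed it).
                revision = kv
                    .update("leader", controller_id.as_bytes().to_vec().into(), revision)
                    .await?;
            }
            _ = &mut shutdown => {
                // Graceful step-down: remove the key so another controller can
                // win the next `create` immediately instead of waiting for TTL.
                kv.delete("leader").await?;
                return Ok(());
            }
        }
    }
}
```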