Closed: 007hacky007 closed this 4 years ago
Thank you for the input!
I was actually thinking about multiple ways of implementing this and went with the most independent version, but I'll implement it as ClientInterface, no problem. As for switching clients without changing the code, I believe that's possible even with the current implementation, since it just calls the Client class, which is ClientInterface compatible. As for the usage in the ShardedClient... I believe you either want to use ShardedClient or FailoverClient; I can't think of why you would want to use a multi-client inside another multi-client. But you are probably right, implementing ClientInterface is probably the cleaner and more correct solution, so I'm going to change it.
As for the exceptions, that's also what I was thinking about; however, it's definitely not only UnavailableException. For a timeout you get DeadlineExceededException, and for a connection error (i.e. the host completely down, or maybe a network error) you get UnavailableException. So there are at least two of them you want to retry. But I guess we can start with these two and add more in the future if needed. I'll add a custom exception for that. Good point.
Again, thank you for your input regarding your etcd experience! I'll implement a per-node retry counter which will precede the hold-off time penalty.
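Roughly what I have in mind, as a sketch only and not the actual FailoverClient code; the exception namespaces, the threshold, and the hold-off handling are assumptions:

```php
<?php
// Sketch of a per-node retry counter that precedes the hold-off penalty.
use Aternos\Etcd\Exception\Status\UnavailableException;        // namespace assumed
use Aternos\Etcd\Exception\Status\DeadlineExceededException;   // namespace assumed

function callWithFailover(array $clients, callable $request, int $maxFailsPerNode = 3)
{
    $failCounters = array_fill(0, count($clients), 0);

    // keep cycling through the nodes until one of them answers
    while (true) {
        $triedAny = false;
        foreach ($clients as $index => $client) {
            if ($failCounters[$index] >= $maxFailsPerNode) {
                // retry budget exhausted: this is where the hold-off time penalty would start
                continue;
            }
            $triedAny = true;
            try {
                return $request($client);
            } catch (UnavailableException | DeadlineExceededException $e) {
                $failCounters[$index]++; // count the failure and move on to the next host
            }
        }
        if (!$triedAny) {
            // every node used up its retries; the custom exception mentioned above would go here
            throw new \RuntimeException('No available etcd host left');
        }
    }
}
```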
Damn, you are right about the balancing part.
You can replace a Client with a FailoverClient, but with typed arguments or return types that won't work without implementing ClientInterface.
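A minimal sketch of that point, assuming put() is part of the ClientInterface contract: code typed against the interface keeps working regardless of which concrete client it receives.

```php
<?php
use Aternos\Etcd\ClientInterface;

// Accepts a Client, a ShardedClient or a FailoverClient interchangeably,
// once FailoverClient implements ClientInterface as well.
function storeSetting(ClientInterface $etcd, string $key, string $value): void
{
    $etcd->put($key, $value);
}
```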
A FailoverClient inside a ShardedClient makes sense if you have multiple clusters (each with one FailoverClient) and distribute access to keys in those clusters via a ShardedClient.
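A hedged sketch of that multi-cluster layout; the constructor signatures (arrays of clients and hostnames) and the hostnames themselves are assumptions, not taken from the PR.

```php
<?php
use Aternos\Etcd\Client;
use Aternos\Etcd\ShardedClient;
use Aternos\Etcd\FailoverClient; // namespace assumed for the class added in this PR

// Cluster A: three nodes replicating the same keys, wrapped in one FailoverClient
$clusterA = new FailoverClient([
    new Client('etcd-a1:2379'),
    new Client('etcd-a2:2379'),
    new Client('etcd-a3:2379'),
]);

// Cluster B: an independent etcd cluster holding its own share of the keys
$clusterB = new FailoverClient([
    new Client('etcd-b1:2379'),
    new Client('etcd-b2:2379'),
    new Client('etcd-b3:2379'),
]);

// The ShardedClient hashes each key onto one cluster; the FailoverClient inside
// that cluster then handles individual node failures transparently.
$client = new ShardedClient([$clusterA, $clusterB]);
$client->put('foo', 'bar');
```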
I haven't seen a DeadlineExceededException yet and I'm not really sure when it occurs, but it looks like it makes sense to retry it.
DeadlineExceededException occurs with the newly implemented timeout, when it is reached. I tested it, and it occurs consistently on request timeout.
By the way, is there any particular reason for client hashing in the ShardedClient? I mean, what is the advantage of sending the same key to the same etcd server? I believe it has some reason, but I could not think of any. It should not make a difference which etcd node handles the request.
Etcd clustering has no sharding; it replicates all keys to all nodes. More nodes only bring more reliability, not more performance, which is why 3 or 5 nodes are recommended (according to https://kubernetes.io/docs/tasks/administer-cluster/configure-upgrade-etcd/#scaling-up-etcd-clusters). Scaling beyond that only works with independent etcd clusters and client-side sharding, which is what the ShardedClient does.
But only a single node is in write mode at a time. The others simply pass requests to the current master, so I still can't see any gain from hashing based on the key.
If the performance of a single cluster isn't enough, you have to create another independent cluster, and in order to always send requests for the same key to the same cluster, client-side sharding is required.
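Purely as an illustration of the idea (not the ShardedClient's actual hashing), client-side sharding only has to map a key to the same cluster deterministically:

```php
<?php
// Illustrative only: deterministic key-to-cluster mapping for client-side sharding.
// A real implementation would rather use consistent hashing so that adding a
// cluster remaps as few keys as possible.
function clusterForKey(string $key, array $clusters)
{
    $index = crc32($key) % count($clusters);
    return $clusters[$index];
}

$clusters = ['cluster-a', 'cluster-b'];    // placeholders for two independent etcd clusters
echo clusterForKey('user:42', $clusters);  // always prints the same cluster for this key
```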
Ah, it makes sense now! I did not think on the scale of multiple clusters; I was thinking of just multiple etcd nodes. With multiple etcd clusters it makes sense.
Thanks for explaining it!
Hopefully fixed. I've also added a protected method isLocalCall, so the fail counter is reset only when actually calling a remote etcd endpoint and not when creating objects locally (as in the case of the getCompare method).
Also included a little bit of hardening of the refreshLease method.
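A hypothetical fragment of how such a guard could look; apart from isLocalCall and getCompare, which are named above, every identifier here is a placeholder rather than the PR's actual code.

```php
<?php
// Placeholder class illustrating the isLocalCall() guard around the fail counter.
class FailoverClientSketch
{
    /** @var object current underlying client (placeholder type) */
    protected $client;

    /** @var int consecutive remote failures of the current client */
    protected $failCount = 0;

    // true for methods that only create local objects and never hit etcd (list is illustrative)
    protected function isLocalCall(string $method): bool
    {
        return in_array($method, ['getCompare', 'getHostname'], true);
    }

    public function __call(string $method, array $arguments)
    {
        if ($this->isLocalCall($method)) {
            // purely local, so the failover bookkeeping must not be touched
            return $this->client->$method(...$arguments);
        }

        $result = $this->client->$method(...$arguments); // real remote call (retry logic omitted)
        $this->failCount = 0; // reset only after a successful remote request
        return $result;
    }
}
```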
Thank you for your valuable input. Changed according to your comment.
Added client fail-over. A simple yet working solution to transparently handle a failed etcd node by retrying the request on another available etcd host.
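For context, basic usage might look roughly like this; the hostnames and the constructor signature are assumptions:

```php
<?php
use Aternos\Etcd\Client;
use Aternos\Etcd\FailoverClient; // namespace assumed

// If etcd-1 is down or times out, the request is transparently retried on etcd-2 / etcd-3.
$client = new FailoverClient([
    new Client('etcd-1:2379'),
    new Client('etcd-2:2379'),
    new Client('etcd-3:2379'),
]);
$client->put('foo', 'bar');
```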