Ring Client support - Githubissues

szuecs commented 4 months ago

I would like to have a ring client support similar to go-redis ring. It basically enables you to use shards to scale out redis.

There are a couple of feature that we would need in this ring client, that is likely good to know in advance:

1) In our case we use it for rate limits at the ingress layer, so we have a lot operations per second that will exceed a single redis process and require redis shards.

2) One key feature for us would be also to be able to update the shards like in https://github.com/redis/go-redis/blob/b64d9deef330a51cfbd3e0425b6c26b27c1ee370/ring.go#L538-L540 (https://github.com/redis/go-redis/commit/6f7f800107ba67310cd822d35b4558255f702ff1).

3) In some cases we need even to built a vnode layer on top to spread operations to more shards https://github.com/zalando/skipper/blob/master/net/redisclient.go#L218 , so it would be great if we could have an option to set also the ConsistentHash function to be used. In general rendezvous hash was the best option in my experience.

rueian commented 4 months ago

Hi @szuecs,

We currently don't support the ring sharding because we already support the standard Cluster. I actually don't understand why to use the ring sharding over the standard Cluster. But I will still be grateful to accept a PR of ring client implementation.

szuecs commented 4 months ago

The problem is that with cluster you create a safe setup but not a scalable one. We need a non safe scalable system. https://redis.uptrace.dev/guide/ring.html#keys-distribution A ring client does a rendezvous hash (much better than crc16 modulo N) of the key and every redis is just a single store and has no overhead to do master/slave like behavior. The ring has no problems like partitioning or complete downtime if half the nodes are down.

rueian commented 4 months ago

has no overhead to do master/slave like behavior.

The standard cluster doesn't need to have replicas either. You can have a cluster that consists of only primary nodes.

The ring has no problems like partitioning or complete downtime if half the nodes are down.

That is an infrastructure failure that users should take very seriously. Normally, no one would like to run applications on such an unstable infrastructure. I believe even the users of the ring don't want to lose half of their nodes. So treating the ability to survive in such a case as a selling point seems unwise to me.

But, since the ring is so simple, I am not against having a ring implementation in this library. Here are some tips if someone wants to implement it:

The ring implementation can be a slim layer on top of multiple valkey.Clients. This simplifies the implementation:

ring := valkey.NewRing(valkey.RingOption{
Shards: map[string]valkey.Client{
    "shard1": client1,
    "shard2": client2,
}
})

The implementation can use the cmd.Slot() as part of the rendezvous hash:

cmd := client.B().Get().Key(key).Build()
cmd.Slot()

szuecs commented 4 months ago

has no overhead to do master/slave like behavior.

The standard cluster doesn't need to have replicas either. You can have a cluster that consists of only primary nodes.

The ring has no problems like partitioning or complete downtime if half the nodes are down.

That is an infrastructure failure that users should take very seriously. Normally, no one would like to run applications on such an unstable infrastructure. I believe even the users of the ring don't want to lose half of their nodes. So treating the ability to survive in such a case as a selling point seems unwise to me.

Sure, I just read the spec for "cluster" and tried to think about all possible advantages to answer your question. I think the biggest advantage is simplicity and more stable key distribution. Also in case of auto scaling crc16 mod N will move around all keys and as far as I understand rendezvous is a generalization of consistent hash so it should also have more stable shard selection in case of scaling attempts.

But, since the ring is so simple, I am not against having a ring implementation in this library. Here are some tips if someone wants to implement it:
1. The ring implementation can be a slim layer on top of multiple `valkey.Client`s. This simplifies the implementation:
ring := valkey.NewRing(valkey.RingOption{
    Shards: map[string]valkey.Client{
        "shard1": client1,
        "shard2": client2,
    }
})

Thanks makes a lot of sense to me.

2. The implementation can use the `cmd.Slot()` as part of the rendezvous hash:

cmd := client.B().Get().Key(key).Build()
cmd.Slot()

I will check cmd.Slot(), thanks for the pointer! I think you mean all the Slot() in github.com/valkey-io/valkey-go/internal/cmds/cmds.go , right?

autoscaling feature

What do you think about func (*Ring) SetAddrs(addrs map[string]string) to implement the feature I asked to implement 2.? Do you see any blocker? Maybe this one should be just func (*Ring) SetAddrs(addrs []string)

vnode hashing feature

For 3. we would also need to expose an interface to override the hashing function similar to https://github.com/redis/go-redis/blob/b64d9deef330a51cfbd3e0425b6c26b27c1ee370/ring.go#L66

Maybe I should create separate issues for autoscaling feature and vnode hashing feature. What do you think about this?

rueian commented 4 months ago

I think you mean all the Slot() in github.com/valkey-io/valkey-go/internal/cmds/cmds.go , right?

Yes, the value returned from Slot() functions is the hash used by the standard cluster.

Maybe I should create separate issues for autoscaling feature and vnode hashing feature. What do you think about this?

No need. We can address this in this issue. It looks like we need an interface to pick a shard, for example:

type RingShardPicker func(cmd valkey.Completed, activeShards []string) string

What do you think about func (*Ring) SetAddrs(addrs map[string]string) to implement the feature I asked to implement 2.?

No blocker, but I would prefer this one:

func (*Ring) AddShard(shard string, client valkey.Client)
func (*Ring) DelShard(shard string)

The reason is that I want to keep the implementation simple and avoid constructing valkey.Client in the implementation.

szuecs commented 4 months ago

I think you mean all the Slot() in github.com/valkey-io/valkey-go/internal/cmds/cmds.go , right?

Yes, the value returned from Slot() functions is the hash used by the standard cluster.

As far as I see Slot(), just returns the already calculated crc16 value by accessing the member. I would say we need to change SetSlot(key string), that calculates the hash or I miss understood your suggestion.

rueian commented 4 months ago

We don't need to use the SetSlot(key) because the crc16 value should already be calculated during the command construction:

cmd := client.B().Get().Key(key).Build() // the crc16 is calculated here.
cmd.Slot() // we just need to retrieve it by using Slot().

I think we could implement the rendezvous hash like this:

func RendezvousRingShardPicker(cmd valkey.Completed, activeShards []string) string {
    slot := cmd.Slot()
    sort.Slice(activeShards, func(i, j int) bool {
        return (slot ^ hash(activeShards[i])) < (slot ^ hash(activeShards[j]))
    })
    return activeShards[0]
}

rueian commented 4 months ago

I just reconsidered the RingShardPicker I proposed, and I found it left no space for performance optimization. The above RendezvousRingShardPicker will work but its performance is not acceptable.

To have spaces for performance optimization, we might still need an interface like this:

type RingShardPicker interface {
    AddShard(shard string)
    DelShard(shard string)
    Pick(cmd valkey.Completed) (shard string)
}

This interface allows implementations to have their optimized structure for faster lookup.

szuecs commented 4 months ago

We don't need to use the SetSlot(key) because the crc16 value should already be calculated during the command construction:
cmd := client.B().Get().Key(key).Build() // the crc16 is calculated here.
cmd.Slot() // we just need to retrieve it by using Slot().

My thinking was more to replace crc16 by rendezvous. As far as I know double hashing is not very good, because it won't have the properties of the one hash function that you want to use. I think it would be something like:

cmd := ringClient.B().Get().Key(key).Build() // rendezvous(key) is calculated here, maybe via Pick() from RingShardPicker
cmd.Slot() // does not change as before use the cached value

I just reconsidered the RingShardPicker I proposed, and I found it left no space for performance optimization. The above RendezvousRingShardPicker will work but its performance is not acceptable.

To have spaces for performance optimization, we might still need an interface like this:
type RingShardPicker interface {
    AddShard(shard string)
    DelShard(shard string)
    Pick(cmd valkey.Completed) (shard string)
}
This interface allows implementations to have their optimized structure for faster lookup.

The interface seems to make a lot of sense, thanks! I would think that a ringClient would implement similar to Client, but instead of crc16 it uses rendezvous and ringClient also implements RingShardPicker (at least for AddShard/DelShard).

rueian commented 4 months ago

Hi @szuecs,

cmd := ringClient.B().Get().Key(key).Build() // rendezvous(key) is calculated here, maybe via Pick() from RingShardPicker

We can't change the crc16 calculation. The underlying valkey.Client still needs it.

If you want to re-calculate a hash completely, you can calculate it from the cmd.Commands() which returns all command tokens. But it just seems unnecessary to me.

The rendezvous implementation used by go-redis also uses XOR to combine two hashes which, I believe, is similar to the RendezvousRingShardPicker I previously proposed.

szuecs commented 4 months ago

@rueian yes I see, thanks you are likely right. I will go with your proposed way.

valkey-io / valkey-go

Ring Client support #5

autoscaling feature

vnode hashing feature