lgothard closed this issue 1 year ago
@rueian Is enterprise Redis support on your roadmap? Love your client - currently using it for prototyping. However, we are not able to use this client in our test/prod environments.
Hi @lgothard, thank you for trying rueidis.
I have done some research on RedisLabs Enterprise, but unfortunately, the only way to make rueidis work with it is to add support for the old RESP2, unless newer RedisLabs Enterprise releases support RESP3.
Supporting the old RESP2 in rueidis is indeed on my roadmap, but it will take some months.
Appears RESP3/client-side caching setup with Redis Enterprise is in flight. Full support for RESP3 is currently slated for early next year.
Thank you for the update. I am really surprised that RESP3 has not been adopted even in Redis Enterprise, given that it was introduced in 2020. I will still do my best to support the old RESP2 in rueidis.
Hi @lgothard, v0.0.77 is the first version that can work with RESP2 and RedisLabs Enterprise Docker image. You can connect to a RedisLabs Enterprise Docker instance with:
c, err := rueidis.NewClient(rueidis.ClientOption{
InitAddress: []string{"127.0.0.1:12000"},
DisableCache: true,
})
if err != nil {
panic(err)
}
defer c.Close()
Please note that client-side caching and pubsub are not yet supported in RESP2 mode.
@rueian hello, do you think the rest of RESP2 (Pub/Sub and Sentinel) may be implemented in rueidis at some point? I spent some time trying to figure out whether I could contribute this myself – but it seems to require much more time than I have to find a way to properly extend rueidis :(
Just giving some context on what I am doing. As you remember, I was migrating to rueidis – and everything seems to work fine now. But the OSS project I am working on (Centrifuge) is used by people with very different Redis setups – including RESP2-based ones. So I cannot simply switch from redigo to rueidis in a backwards-compatible way.
While I was experimenting with rueidis, Centrifuge also got a PR to switch from redigo to go-redis. In the end I'd like to have only one implementation – so I want to compare go-redis and rueidis and choose between them. Most probably this is also a good topic for a blog post, as we will have implementations in three different Redis libraries.
Also, one thing I am concerned about when choosing the implementation (besides the number of dependencies and performance) is the possibility to seamlessly run Redis code with KeyDB, DragonflyDB, and other Redis-compatible stores. Redigo works fine with those, but I still have to check go-redis and rueidis (I think the main issues may arise during the initial handshake – when the client sends HELLO, chooses the protocol, or relies on info['version']).
Hi @FZambia,
I am sure the rest of the RESP2 functions will be implemented at some point – maybe this month or the next. I just need some time to figure out how to do it.
Thank you for mentioning other Redis-compatible stores. I tested with KeyDB a few months ago and it worked fine at that time. DragonflyDB looks very promising but I haven't tried it. Anyway, I think I can add some integration tests with them.
Hi @FZambia,
As you already know, rueidis v0.0.80 was released with RESP2 PubSub and Sentinel support. And starting with the latest v0.0.81, KeyDB, DragonflyDB, and Kvrocks are added to the integration tests in single-node mode. So they should work with rueidis at least in single-node mode. I may also try to test their cluster modes in the near future.
In terms of compatibility, they all support PubSub. KeyDB supports RESP3 and Client Side Caching, while DragonflyDB v0.9.1 and Kvrocks v2.0.6 only support RESP2.
Yep, many many thanks! This is awesome, and we already have a PR – rueidis shows great results on Centrifuge benchmarks, especially in allocation efficiency.
@rueian btw, I have a small concern, a bit unrelated to this topic, but I was thinking about it during the migration to rueidis.
Currently rueidis has several ways to configure various timeouts. But I think the only working way that currently exists in rueidis to detect a connection that is stuck forever is the background PING goroutine. With the background PING the client can ensure that the connection is still alive, and if the timeout (ConnWriteTimeout) passes, the connection is closed and the current outgoing requests can be released.
But regarding TCP keepalives – I am still not sure those are enough in cases where some proxy between the app and Redis is used: keepalives only check that the connection between the app and the proxy is alive, and then, depending on the proxy, I suppose it's not 100% guaranteed that keepalives from the proxy to Redis are sent. From this perspective, having a global ReadTimeout per connection made a lot of sense to me when using Redigo.
To be honest, I am not sure whether this can be improved or whether the background PING goroutine is enough. Also, just to mention – I am trying to avoid cancelling requests to Redis via context, as it adds notable overhead when a context has to be created for each request, so I prefer having a single globally defined timeout for all requests from the application to Redis.
Hi @FZambia,
But regarding TCP keepalives – I am still not sure those are enough in cases where some proxy between the app and Redis is used, keepalives only check that connection between the app and the proxy is alive, and then depending on proxy I suppose it's not 100% guaranteed that keepalives from proxy to Redis are sent.
I am now sure that TCP keepalive is not enough to detect broken links even without proxies in between, because the kernel starts keepalive probes only when the TCP write buffer is empty. If there is outgoing data not being acked by the receiver, keepalive will never kick in.
In this perspective having global ReadTimeout per connection had a lot of sense to me when using Redigo.
I haven't checked how Redigo implements ReadTimeout, but this is quite surprising to me. I actually thought there were many cases where a ReadTimeout couldn't be applied – for example, blocking commands like BZPOPMAX, and especially pubsub commands like SUBSCRIBE. In these cases, the client just doesn't know when a response will arrive and should wait or watch indefinitely. If a ReadTimeout were applied, it might just produce false results and unnecessary reconnections.
In other words, a ReadTimeout makes sense to me only when I expect there will be a response before it fires. That is why rueidis uses a background goroutine to send PING, instead of setting a ReadTimeout on the connection directly, to keep the above commands functioning and still be able to detect whether the connection is stuck.
the pubsub command like SUBSCRIBE. In these cases, the client just didn't know when will be a response and should wait or watch indefinitely. If ReadTimeout was applied, it might just produce false results and unnecessary reconnections.
In Redigo's case it is possible to use ReceiveWithTimeout to provide a custom timeout for PUB/SUB connections, and the recommendation is to periodically ping the PUB/SUB connection.
As I said, I am not sure something should be changed in rueidis; probably the current approach with automatic background PING is even superior – it's just something I was thinking about.
Hey @FZambia,
In Redigo's case it is possible to use ReceiveWithTimeout to provide a custom timeout for PUB/SUB connections, and the recommendation is to periodically ping the PUB/SUB connection.
Thank you for letting me know how Redigo uses ReadTimeout. I haven't tried the example, but one small guess is that using ReceiveWithTimeout while forgetting to ping the connection periodically will result in a false timeout if there is really no message for the subscription, right?
As I said, I am not sure something should be changed in rueidis; probably the current approach with automatic background PING is even superior – it's just something I was thinking about.
Though I believe the current background PING mechanism is enough, I also believe there is room for improvement. For instance, using the built-in Conn.SetReadDeadline to minimize the detection overhead, just like Redigo does in the above example. Actually, rueidis does use Conn.SetDeadline before it starts pipelining. However, I haven't found the right places to set read and write deadlines once it starts pipelining and serving concurrent requests.
Another direction for improvement I am thinking of is to reduce the PINGs. Given that the purpose of the PINGs is to detect broken connections, I believe there is no need to send PINGs on a connection that keeps fulfilling other normal requests.
forgetting to ping the connection periodically will result in a false timeout if there is really no message for the subscription, right?
Yep (and that was something I struggled with when I first integrated redigo).
In general, with all those timeouts, my main goal is to be able to set some global value that won't be exceeded by any operation if something goes wrong. It seems that with the current background PING this can be achieved – so seems good.
As more general feedback, I really enjoyed building commands with rueidis – it drastically reduces the chance of making a stupid mistake while constructing a command. Instead of always opening the Redis docs to check a command's syntax, it's now possible to just start typing – and quickly arrive at the result.
Hi @FZambia,
Just letting you know that v0.0.82 contains a fix for a case where the background PING could get stuck and not close the connection as expected. I found this problem while investigating issue #108. It happened when the underlying ring was full. v0.0.82 (#114) fixes this by putting p.Do(ctx, cmds.PingCmd) into another goroutine and using a separate timer to wait for the PING result.
Any updates on this, peeps? I would love to use the enterprise solution as a service from Azure.
Hi @nicolastakashi,
I don't have an Azure account, but I have tried the Redis Enterprise docker image.
The following setup should work with it:
c, err := rueidis.NewClient(rueidis.ClientOption{
InitAddress: []string{"127.0.0.1:12000"},
DisableCache: true,
})
if err != nil {
panic(err)
}
defer c.Close()
@rueian hello! I just finished a draft of a blog post about our migration from Redigo to rueidis in Centrifuge/Centrifugo – maybe you would be interested to take a look and leave some comments? The rueidis migration is still not merged, but I hope it will be soon. Here is the blog post draft I am talking about: https://github.com/centrifugal/centrifugal.dev/pull/18
Hi @FZambia,
I have read the great post – thank you for sharing the benchmark results of rueidis. It is my honor that my work can be featured on the Centrifugo blog. I am also glad to know that rueidis really helps others push their applications and Redis to the limit.
I just learned from your previous blog post about scaling WebSocket that you had already done smart pipelining three years ago. I wish I had read that post earlier – rueidis might have been born earlier too. The pipelining technique is basically the same as what rueidis uses.
I also love the video you made to demonstrate the command builder. May I also put the video in the rueidis readme once you publish the post?
One thing that caught my eye is the benchmark of RedisSubscribe getting worse after switching to rueidis. Maybe I can do some research on it.
I also love the video you made to demonstrate the command builder. May I also put the video in the rueidis readme once you publish the post?
Of course, and no need to wait for post publication.
One thing that caught my eye is the benchmark of RedisSubscribe getting worse after switching to rueidis. Maybe I could also do some research on it.
Yep, this is a good question – why is it slower, given that I mostly do the same things as in the Redigo implementation? I spent some time trying to find an obvious reason. The only thing I found is that in my case having:
p.nsubs.Confirm(values[1].string)
in pipe.go adds some overhead. Since I am using PUB/SUB hooks and managing subscriptions at the application level, having a subscription registry inside rueidis does not seem really required. I tried to disable it with a new option inside the rueidis.PubSubHooks struct – and it actually started to perform a bit better, something like a 6% latency reduction.
With registry disabled I got bench results like this (old is with registry, new is with disabled registry):
benchstat before.txt after.txt
name old time/op new time/op delta
RedisSubscribe/non_cluster-8 1.64µs ±10% 1.53µs ±12% -6.39% (p=0.023 n=10+10)
name old alloc/op new alloc/op delta
RedisSubscribe/non_cluster-8 846B ± 5% 699B ± 0% -17.33% (p=0.000 n=10+9)
name old allocs/op new allocs/op delta
RedisSubscribe/non_cluster-8 10.0 ± 0% 10.0 ± 0% ~ (all equal)
I thought about whether it's possible to always disable the registry when someone uses PubSubHooks, to avoid adding a new option – but I'm not sure whether there are use cases that may break after doing that.
Re-running with the changes above and with 30 repetitions gives the following comparison between the Redigo case and the rueidis case:
❯ benchstat redigo_latency.txt rueidis_latency.txt
name old time/op new time/op delta
RedisSubscribe/non_cluster-8 1.46µs ±30% 1.59µs ±13% +9.31% (p=0.022 n=30+30)
name old alloc/op new alloc/op delta
RedisSubscribe/non_cluster-8 895B ± 2% 699B ± 0% -21.95% (p=0.000 n=30+30)
name old allocs/op new allocs/op delta
RedisSubscribe/non_cluster-8 22.0 ± 0% 10.0 ± 0% -54.55% (p=0.000 n=27+30)
With registry disabled I got bench results like this (old is with registry, new is with disabled registry):
benchstat before.txt after.txt
name old time/op new time/op delta
RedisSubscribe/non_cluster-8 1.64µs ±10% 1.53µs ±12% -6.39% (p=0.023 n=10+10)
name old alloc/op new alloc/op delta
RedisSubscribe/non_cluster-8 846B ± 5% 699B ± 0% -17.33% (p=0.000 n=10+9)
name old allocs/op new allocs/op delta
RedisSubscribe/non_cluster-8 10.0 ± 0% 10.0 ± 0% ~ (all equal)
Wow, this result is amazing. How does it even reduce the alloc/op?
But, unfortunately, p.nsubs.Confirm() can't be skipped because it actually handles the issue https://github.com/rueian/rueidis/issues/55.
However, I think it is still possible to make p.nsubs.Confirm() much cheaper for that issue. Maybe it is even possible to remove the mu.Lock() when doing p.nsubs.Confirm().
Hi @FZambia,
I just removed the need for p.nsubs.Confirm(values[1].string) in https://github.com/rueian/rueidis/pull/147.
Here is the benchmark code and the result on my MacBook:
func BenchmarkSubscribe(b *testing.B) {
c, _ := NewClient(ClientOption{InitAddress: []string{"127.0.0.1:6379"}})
defer c.Close()
b.SetParallelism(128)
b.RunParallel(func(pb *testing.PB) {
for pb.Next() {
if err := c.Do(context.Background(), c.B().Subscribe().Channel(strconv.Itoa(rand.Int())).Build()).Error(); err != nil {
panic(err)
}
}
})
}
▶ benchstat old.txt new.txt
name old time/op new time/op delta
Subscribe-10 1.40µs ±10% 1.25µs ± 4% -10.73% (p=0.000 n=30+30)
name old alloc/op new alloc/op delta
Subscribe-10 499B ± 0% 337B ± 0% -32.42% (p=0.000 n=25+27)
name old allocs/op new allocs/op delta
Subscribe-10 5.00 ± 0% 5.00 ± 0% ~ (all equal)
▶ cat old.txt
goos: darwin
goarch: arm64
pkg: github.com/rueian/rueidis
BenchmarkSubscribe-10 883592 1460 ns/op 520 B/op 5 allocs/op
BenchmarkSubscribe-10 1000000 1434 ns/op 499 B/op 5 allocs/op
BenchmarkSubscribe-10 1000000 1394 ns/op 499 B/op 5 allocs/op
BenchmarkSubscribe-10 1000000 1538 ns/op 499 B/op 5 allocs/op
BenchmarkSubscribe-10 971233 1458 ns/op 503 B/op 5 allocs/op
BenchmarkSubscribe-10 1000000 1356 ns/op 498 B/op 5 allocs/op
BenchmarkSubscribe-10 1000000 1457 ns/op 499 B/op 5 allocs/op
BenchmarkSubscribe-10 1000000 1434 ns/op 498 B/op 5 allocs/op
BenchmarkSubscribe-10 1000000 1545 ns/op 499 B/op 5 allocs/op
BenchmarkSubscribe-10 951518 1387 ns/op 507 B/op 5 allocs/op
BenchmarkSubscribe-10 1000000 1344 ns/op 498 B/op 5 allocs/op
BenchmarkSubscribe-10 1000000 1452 ns/op 499 B/op 5 allocs/op
BenchmarkSubscribe-10 1000000 1368 ns/op 499 B/op 5 allocs/op
BenchmarkSubscribe-10 1000000 1435 ns/op 499 B/op 5 allocs/op
BenchmarkSubscribe-10 1000000 1342 ns/op 499 B/op 5 allocs/op
BenchmarkSubscribe-10 982371 1369 ns/op 501 B/op 5 allocs/op
BenchmarkSubscribe-10 1000000 1339 ns/op 498 B/op 5 allocs/op
BenchmarkSubscribe-10 1000000 1384 ns/op 499 B/op 5 allocs/op
BenchmarkSubscribe-10 1000000 1342 ns/op 499 B/op 5 allocs/op
BenchmarkSubscribe-10 1000000 1374 ns/op 499 B/op 5 allocs/op
BenchmarkSubscribe-10 1000000 1379 ns/op 498 B/op 5 allocs/op
BenchmarkSubscribe-10 1000000 1355 ns/op 499 B/op 5 allocs/op
BenchmarkSubscribe-10 975186 1449 ns/op 503 B/op 5 allocs/op
BenchmarkSubscribe-10 1000000 1359 ns/op 499 B/op 5 allocs/op
BenchmarkSubscribe-10 1000000 1398 ns/op 499 B/op 5 allocs/op
BenchmarkSubscribe-10 1000000 1331 ns/op 498 B/op 5 allocs/op
BenchmarkSubscribe-10 1000000 1416 ns/op 498 B/op 5 allocs/op
BenchmarkSubscribe-10 1000000 1414 ns/op 499 B/op 5 allocs/op
BenchmarkSubscribe-10 1000000 1376 ns/op 498 B/op 5 allocs/op
BenchmarkSubscribe-10 1000000 1313 ns/op 499 B/op 5 allocs/op
PASS
ok github.com/rueian/rueidis 55.429s
▶ cat new.txt
goos: darwin
goarch: arm64
pkg: github.com/rueian/rueidis
BenchmarkSubscribe-10 1007948 1233 ns/op 337 B/op 5 allocs/op
BenchmarkSubscribe-10 1000000 1235 ns/op 337 B/op 5 allocs/op
BenchmarkSubscribe-10 1000000 1231 ns/op 337 B/op 5 allocs/op
BenchmarkSubscribe-10 1000000 1270 ns/op 337 B/op 5 allocs/op
BenchmarkSubscribe-10 1000000 1251 ns/op 337 B/op 5 allocs/op
BenchmarkSubscribe-10 1000000 1226 ns/op 337 B/op 5 allocs/op
BenchmarkSubscribe-10 1000000 1271 ns/op 337 B/op 5 allocs/op
BenchmarkSubscribe-10 1000000 1298 ns/op 337 B/op 5 allocs/op
BenchmarkSubscribe-10 1000000 1264 ns/op 336 B/op 5 allocs/op
BenchmarkSubscribe-10 1000000 1220 ns/op 337 B/op 5 allocs/op
BenchmarkSubscribe-10 1000000 1277 ns/op 336 B/op 5 allocs/op
BenchmarkSubscribe-10 1000000 1239 ns/op 337 B/op 5 allocs/op
BenchmarkSubscribe-10 1000000 1228 ns/op 337 B/op 5 allocs/op
BenchmarkSubscribe-10 1000000 1259 ns/op 337 B/op 5 allocs/op
BenchmarkSubscribe-10 1000000 1234 ns/op 337 B/op 5 allocs/op
BenchmarkSubscribe-10 1000000 1230 ns/op 337 B/op 5 allocs/op
BenchmarkSubscribe-10 1000000 1244 ns/op 337 B/op 5 allocs/op
BenchmarkSubscribe-10 1000000 1249 ns/op 337 B/op 5 allocs/op
BenchmarkSubscribe-10 1000000 1227 ns/op 337 B/op 5 allocs/op
BenchmarkSubscribe-10 1000000 1290 ns/op 337 B/op 5 allocs/op
BenchmarkSubscribe-10 1000000 1280 ns/op 337 B/op 5 allocs/op
BenchmarkSubscribe-10 1000000 1302 ns/op 337 B/op 5 allocs/op
BenchmarkSubscribe-10 1000000 1248 ns/op 337 B/op 5 allocs/op
BenchmarkSubscribe-10 1000000 1218 ns/op 337 B/op 5 allocs/op
BenchmarkSubscribe-10 1000000 1232 ns/op 337 B/op 5 allocs/op
BenchmarkSubscribe-10 1000000 1228 ns/op 337 B/op 5 allocs/op
BenchmarkSubscribe-10 1000000 1244 ns/op 337 B/op 5 allocs/op
BenchmarkSubscribe-10 1000000 1261 ns/op 337 B/op 5 allocs/op
BenchmarkSubscribe-10 1000000 1275 ns/op 336 B/op 5 allocs/op
BenchmarkSubscribe-10 1000000 1230 ns/op 337 B/op 5 allocs/op
PASS
ok github.com/rueian/rueidis 50.790s
It indeed allocates less now because there is no need to track subscribed channels anymore.
Thanks! For my bench I also got a speedup, comparable to the case where I just blindly removed the p.nsubs.Confirm(values[1].string) calls.
benchstat subscribe1.txt subscribe2.txt
name old time/op new time/op delta
RedisSubscribe/non_cluster-8 1.56µs ± 4% 1.46µs ± 9% -6.39% (p=0.001 n=10+10)
name old alloc/op new alloc/op delta
RedisSubscribe/non_cluster-8 835B ± 6% 700B ± 0% -16.19% (p=0.000 n=10+10)
name old allocs/op new allocs/op delta
RedisSubscribe/non_cluster-8 10.0 ± 0% 10.0 ± 0% ~ (all equal)
Of course, as you remember, my bench involves some code outside rueidis, so results are slightly different compared to yours.
Hi all, I am going to close this issue since rueidis has supported RESP2 since v0.0.80 and should work with Redis Enterprise. Please feel free to open new issues if you want to.
When using the RedisLabs Enterprise Docker image, I'm not able to make a connection because the HELLO command is not supported in the enterprise edition ...
$ /opt/redislabs/bin/redis-cli -p 12000
127.0.0.1:12000> HELLO
(error) ERR unknown command 'HELLO'
I don't see this called out on the compatibility list, but I confirmed it with RedisLabs.