valkey-io / valkey-glide

An open-source Valkey client library that supports Valkey and Redis OSS 6.2, 7.0, and 7.2. Valkey GLIDE is designed for reliability, optimized performance, and high availability for Valkey and Redis OSS based applications. GLIDE is a multi-language client library, written in Rust with programming-language bindings such as Java and Python.
Apache License 2.0

Rust client #828

Open waynerobinson opened 9 months ago

waynerobinson commented 9 months ago

Describe the feature

Given that GLIDE's core is Rust and it adds some useful capabilities on top of redis-rs in terms of connectivity and resilience for MemoryDB/ElastiCache services, it would be valuable to be able to use it from Rust directly.

Use Case

Redis client for Rust.

Proposed Solution

No response

Other Information

No response

Acknowledgements

Client version used

n/a

Environment details (OS name and version, etc.)

n/a

shachlanAmazon commented 9 months ago

In the short term we plan to focus our efforts on higher-level languages, so this probably won't happen in the next few months. In the meantime we make every effort to upstream our changes to redis-rs, so if you use it, you will eventually receive most, if not all, of GLIDE for Redis's capabilities. And if there's any specific feature you're missing or a bug that's bothering you, feel free to open an issue on redis-rs, and I'll try to fix it there ASAP.

nakedible-p commented 1 month ago

FWIW, we just picked Fred as our Redis client for Rust: https://github.com/aembke/fred.rs

The reasons being (based on a brief evaluation): 1) natively async, so we don't need to check this for every setting; 2) native pool support; 3) better connection-management/recovery support; 4) client tracking (caching) support (we haven't taken this into use yet, but will).

This is not to rain on anyone's parade; I'm just hoping it's recognized that the current Redis client has downsides, and that people might be hoping for a better Redis client to come out of this repo.

avifenesh commented 1 month ago

@nakedible-p Hi Nuutti, I assume you evaluated it against redis-rs.

We do plan to expose a Rust client at some point, but when we do, it will be after we have diverged significantly from upstream redis-rs.

redis-rs is robust and has wide support for all kinds of Rust users and methodologies, which has pros and cons. I encourage you to open an issue there and give them your feedback. I know @nihohit, one of the main maintainers, and I know he would like to hear what you have to say.

For example, they do support client-side caching, and their connection management is very stable. The design of using a multiplexer instead of a connection pool came after deep evaluation: a connection pool is error-prone and easily exposes you to connection storms, which can block your app completely and are very hard to recover from.

Many of the decisions we took and upstreamed, and the decisions @nihohit made in redis-rs, are based on data gathered from hundreds of thousands of ElastiCache users while dealing with their pain in other clients. For example, we made a strong effort on better topology-change detection, since it comes up repeatedly as a big problem in most clients. In the last two weeks I have spent a long time helping customers of the two main Java clients and the main Node.js client deal with topology-change issues, while with GLIDE it just worked smoothly. All of them will move to GLIDE. So, as I said, a chat with @nihohit can give you some insight into what we have seen as important over the years; maybe there is something you missed.
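The multiplexer-versus-pool point above can be illustrated with a toy sketch (plain std, no Redis involved; all names here are made up for illustration): many callers share one worker that owns the single "connection" and pipelines their requests, instead of each caller opening or grabbing a socket of its own.

```rust
use std::sync::mpsc;
use std::thread;

// A request carries its command plus a private channel for the reply,
// so many callers can share one queue and still get their own answers.
struct Request {
    cmd: String,
    reply: mpsc::Sender<String>,
}

// One worker owns the single "connection" and serves a shared queue --
// the essence of a multiplexed client. With a pool, each caller would
// instead open or borrow its own socket, which is where connection
// storms come from under load.
fn run_multiplexed(callers: usize) -> Vec<String> {
    let (tx, rx) = mpsc::channel::<Request>();

    let worker = thread::spawn(move || {
        for req in rx {
            // Pretend to execute the command on the one connection.
            let _ = req.reply.send(format!("OK {}", req.cmd));
        }
    });

    let handles: Vec<_> = (0..callers)
        .map(|i| {
            let tx = tx.clone(); // cloning a sender, not opening a socket
            thread::spawn(move || {
                let (reply_tx, reply_rx) = mpsc::channel();
                tx.send(Request {
                    cmd: format!("GET key{i}"),
                    reply: reply_tx,
                })
                .unwrap();
                reply_rx.recv().unwrap()
            })
        })
        .collect();

    let results = handles.into_iter().map(|h| h.join().unwrap()).collect();
    drop(tx); // close the queue so the worker loop ends
    worker.join().unwrap();
    results
}

fn main() {
    for line in run_multiplexed(4) {
        println!("{line}");
    }
}
```

However many caller threads there are, only one "connection" ever exists, so a burst of traffic queues up rather than triggering a storm of reconnects.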

This is not something we have formulated and not a team decision, just my assumption: when we expose a Rust client, it will be much thinner. We support async only in all our clients and there are no plans to change that in the near future, so we will probably end up stripping the fork of whatever is not relevant to our client.

We will focus on one methodology; on stability and reliability; on the features we know users value most, and on feature requests; on performance; and on external tooling integration and cloud-vendor support.

There's also the fact that the Rust logic serves all the other clients we offer. Making it very verbose and robust has a cost for all the other clients, and costs development time, which affects all the clients together. So my assumption is that when we expose the Rust client, it will serve users who need a client similar to what we need (the list I mentioned) but will support fewer methodologies. For users who need the rest, there's redis-rs, which is a very good client, and I don't see a reason to try to replace it.

But as I said before, my statement does not represent a team decision or something we have discussed seriously; it's just an idea that has popped up a couple of times, and we will definitely still need to discuss it and make a decision.

Please share how you evaluated the clients and what was important to you while doing so.

nakedible-p commented 1 month ago

Thank you for answering in such detail! A lot of this is a bit off-topic for this issue, but I'll just put this here so others can see it as well if it ends up being relevant.

First off, on Node.js: we are currently struggling with topology-change detection in node-redis (against MemoryDB). We are evaluating a switch to ioredis in the meantime to see whether it has a better implementation out of the box. But do you think we could already use GLIDE's Node.js support in production? We are no strangers to being early adopters and using something experimental, but we obviously don't want to use something that's inherently not meant for production yet.

On the connection pool – it's been hard to find proper guidance on what is the best practice with a highly available solution. Our rationale for picking a small pool (4) for connections:

However, we really just want a setup that is as simple as possible, but recovers from any kind of issue as fast as possible, making no compromises there. If, through experience, that is shown to be a single connection that is just re-established as fast as possible in case the server fails to communicate for 4 seconds (for example), then this is exactly the kind of feedback we'd love to have.

As for redis-rs:

But, these are superficial findings – I did not try to make a working client with redis-rs and test all the different failure scenarios. Maybe at some point I will.

And again, to be clear, I'm not complaining! I'm just trying to provide honest feedback. If @nihohit doesn't pop by to say "noted", I will probably post these as an issue in the redis-rs repo as well, just to have them recorded.

Finally, I must applaud your goals. Right now it seems hard to find a battle-proven redis client that would be simple to set up and use. Figuring all this out is just time away from the real work of using redis as a solution. So this is exactly what redis/valkey/elasticache/memorydb need to be a more out-of-the-box solution.

nihohit commented 1 month ago

Hi, a redis-rs maintainer here. NOTED! :)

Before I start, I want to say that fred looks like an excellent client, and you should probably be fine using it. I checked out several things in its source code, and what I would call the "strategic" difference in its design from redis-rs is that fred uses very high-level abstractions, such as multiple multicast channels, to provide its logic*. This allows the library to iterate fast and provide a lot of features, but it means that, performance-wise, when I benchmarked the two clients, redis-rs was measurably faster. That was a while ago; I assume the numbers have changed for both clients. If perf is important, measure for your specific scenario. IMO that means that if fred provides the operations-per-second you need, you can probably be happy with your choice, but if it doesn't, redis-rs still might. The other consideration is that redis-rs is more popular, so you can assume bugs will be caught and fixed faster. BTW, @avifenesh, you guys are missing out on a couple of bug fixes & features by not pulling from us :)

*take this with a grain of salt - I never went very in-depth into fred's code.
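As a footnote to "measure for your specific scenario": a minimal harness like the one below (plain std; the workload shown is a stand-in you would replace with a real client call) is often enough to compare two clients' throughput under your own access pattern.

```rust
use std::hint::black_box;
use std::time::Instant;

// Time `iters` invocations of `op` and return throughput in ops/sec.
fn ops_per_sec<F: FnMut()>(iters: u32, mut op: F) -> f64 {
    let start = Instant::now();
    for _ in 0..iters {
        op();
    }
    iters as f64 / start.elapsed().as_secs_f64()
}

fn main() {
    // Stand-in workload; swap in e.g. a GET against a local server,
    // once per candidate client, under identical conditions.
    let rate = ops_per_sec(1_000_000, || {
        black_box(2u64.wrapping_mul(21));
    });
    println!("{rate:.0} ops/sec");
}
```

Run both clients against the same local server with the same command mix; absolute numbers matter less than the ratio between the two under your workload.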

> I didn't see "tracking" or "caching" mentioned in any of the features, and I saw a bunch of open pull requests, so I assumed it's not done. Glad to hear it's there.

It's not there, it's in an open PR :)

> I want absolutely nothing to block (can I somehow disable blocking commands altogether?). I'm scared that even with all the aio features, some cluster data refresh is going to block. Clients that are not built async from the ground up often may think it's okay to block sometimes.

I'm sorry, but I really don't understand this logic. If you want to make sure there's nothing blocking, read the code (or run tokio with the relevant checks). The chance that a person introduces blocking calls by mistake is not determined by whether a library is async or not, but by the experience and care that the maintainers & community put in. redis-rs has twice the downloads of fred, but sadly we don't have numbers on who uses which features. I believe more than half of our users use the async API, so I believe there are more people who would report such a mistake on redis-rs than on fred.

> I want to use a cluster connection with tokio async utilizing a connection manager

Why? The cluster connection already reconnects automatically; adding a connection manager on top of it is unnecessary.

I don't want to go into all your other points - some I agree with, others less so. You're more than welcome to open an issue per request on redis-rs, and we'll try to attend to them. But, again - fred looks like a good client. Since you're already using it, I hope that it serves you well :)

avifenesh commented 1 month ago

@nakedible-p For the redis-rs comments, I am happy to see that @nihohit is here; he is really the guy you should talk to, on redis-rs specifically but also on working with Valkey/Redis from Rust in general. As he said, the decision is often between simplicity and abstraction versus performance. If you want to keep the conversation here so others can see it, do so; just mention it in an issue on redis-rs so it will be available for everyone there as well.

And for GLIDE Node.js: we are going for a first stable version very soon, and if you are doing a POC with another client anyway, especially if the cause is topology-change detection issues, I encourage you to take GLIDE for a ride and a POC. I'd love to take part in it and accompany you in the process.

I did this with our Node beta testers, two major companies, and every feature request was deployed within two weeks of the request (supporting musl, for example, or ESM plus CommonJS).

As for topology-change detection: I spent the whole day debugging the codebase of the library mentioned above for a very large user having topology-change detection issues. The issue blocked their ability to serve requests for 7 (!) minutes. It's a huge gaming company, so you can imagine how problematic 7-minute blocks are. Also against MemoryDB.

I haven't run it against GLIDE yet, but I already know this can't happen with GLIDE, and the second I figure out how to solve it for them (they might need to clone the repo and patch), I'm going to run the same test with GLIDE and would love to share the results.

Let me know if you are going with us.

You can approach me personally in our discord - https://discord.gg/Wt7n7NzfcZ

avifenesh commented 1 month ago

@nihohit Right after the release we will scan the whole redis-rs repo and merge!

nakedible-p commented 1 month ago

Thank you for the swift reply! And yeah, possibly not all the points are "fair", as these are just impressions, so take them as you will.

> I want to use a cluster connection with tokio async utilizing a connection manager

> Why? The cluster connection already reconnects automatically; adding a connection manager on top of it is unnecessary.

See, that's the part I missed – I did eventually find it in the documentation, though.

All in all, I'm still a bit unsure about using the redis crate to get that kind of reliability with all the reconnections.

If I'm not mistaken, after all this reading, at process start I would do:

let connection = ClusterClient::builder(vec!["redis://myhost/"])
    .retries(4)
    .max_retry_wait(1000)
    .min_retry_wait(1000)
    .connection_timeout(Duration::from_secs(10))
    .response_timeout(Duration::from_secs(4))
    .build()?
    .get_async_connection().await?;

And then stash that in a static global. (Timeout values etc. just in the ballpark, not well thought out.)
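Stashing the handle in a static can look like the sketch below (plain std, with a stand-in `Conn` type; redis-rs's async connections implement `Clone`, so a single global handle can serve every task). Note that since `get_async_connection` is async, a `std::sync::OnceLock` only works if you initialize it up front at startup; for lazy async initialization, something like `tokio::sync::OnceCell` is the usual tool.

```rust
use std::sync::OnceLock;

// Stand-in for a cloneable async connection handle; in redis-rs the
// cluster connection is Clone, so one global can serve all tasks.
#[derive(Clone, Debug)]
struct Conn(String);

static CONNECTION: OnceLock<Conn> = OnceLock::new();

// Call once at startup, after the (async) connect has succeeded.
// First writer wins; later calls are no-ops.
fn install(conn: Conn) {
    let _ = CONNECTION.set(conn);
}

// Every usage site clones the cheap handle instead of reconnecting.
fn connection() -> Conn {
    CONNECTION.get().expect("connection not initialized").clone()
}

fn main() {
    install(Conn("cluster-handle".into()));
    println!("{:?}", connection());
}
```

With this shape, the `CONNECTION.clone().set("foo", "bar")` usage below maps onto `connection().set(...)`, and the mutable-receiver requirement is satisfied by the clone each call site makes.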

And then on every usage site I would need to have AsyncCommands trait imported and I'd do:

let _: () = CONNECTION.clone().set("foo", "bar").await?;

And that would give me a connection that:

If that's true, then ultimately there isn't much difference between the approaches. It was just hard for me to figure all this out, and the mutable connection object kind of threw me off, since I need a single shared connection/client that's valid for all threads and lives for the lifetime of the application.

nihohit commented 1 month ago

> and the mutable connection object kind of threw me off

*sigh* You're not the first person to point that out, and I don't like that design choice either. It was made before I joined the team 🤷. The only thing I can point out is that if we ever break our API, that's definitely on the list :) https://github.com/redis-rs/redis-rs/issues/1043

nihohit commented 1 month ago

@avifenesh

> You can approach me personally in our discord - https://www.linkedin.com/in/ACoAACz2yQ4BF4xYt14QGsx4IF9kSzT10S5gXX0

That's a link to your linkedin profile, not to a discord server