Open kemingy opened 1 year ago
Hey Keming, I'm interested in taking a look at this issue. I briefly looked into some Rust crates for this feature and found this crate. It seems to support a Redis cache, a sized cache, and a timed cache (although I don't believe it offers a combined timed + sized cache). My first thought would be to add an axum middleware to handle the caching logic. What are your thoughts on this?
I think this PR should come with a benchmark. I don't know if this lib fits our requirements.
I don't know how it handles the cache key, since the key/value could be a huge image (like 3 x 1000 x 1000 f32). The benchmark should include different key/value types like a simple string, an image, an embedding, etc.
Good point. Do you think the cache should be aware of the exact content type?
No, because we don't really parse the HTTP request body on the Rust side. I listed different types of data only because their sizes are different.
For the benchmark, you can check https://github.com/tensorchord/inference-benchmark/tree/main/benchmark
Describe the feature
refer to:
Some ML models might benefit from the cache.
As for the storage part, I think ideally we should support both local and remote cache.
Why do you need this feature?
No response
Additional context
No response