jaemk / cached

Rust cache structures and easy function memoization
MIT License
1.57k stars 95 forks source link

[discussion] Support Async stores #98

Closed omid closed 2 years ago

omid commented 2 years ago

Currently, all stores we have are sync, since they all are in-memory. To support async-able stores, like filesystem, Redis and other remote stores.

Now let's discuss what should we do for that?

omid commented 2 years ago

Let me start.

Some points to keep in mind:

  1. asyncable is a property of a store
  2. a store can be async and sync at the same time (like filesystem)
  3. for some stores, it doesn't make sense to have the async (like in-memory stores, like hashmap)

My suggestion is to:

  1. remove the current async feature flag
  2. maybe remove the get_or_set_with and try_get_or_set_with methods in both AsyncCached and Cached (I think they are too much to be in a trait and over-using the API)
  3. have two set of exactly same methods, in two different traits, for —the new— AsyncCached and Cached
  4. add a new property to cached macro, like async = true (and default is false)

We should be able to know if we need to use the sync or async impls in compile-time, because of that, we have the point 4 in my suggestion.

I would like to make it smarter, like if the function is async, then use the async version of store, otherwise use the sync. But first I don't know how to do it and second, I'm not sure the outcome will be good enough!

jaemk commented 2 years ago

Hi @omid ! Thanks for starting the discussion

On your points:

  1. remove the current async feature flag

I don't think we should since not everyone needs async support and disabling it cuts down compile time since there's a couple dependencies that get removed. On that topic though, I think some of the optional dependencies could be cleaned up here https://github.com/jaemk/cached/blob/cd10eac5c754ce955148e2f79b557b9dadfa04f1/Cargo.toml?_pjax=%23js-repo-pjax-container%2C%20div%5Bitemtype%3D%22http%3A%2F%2Fschema.org%2FSoftwareSourceCode%22%5D%20main%2C%20%5Bdata-pjax-container%5D#L19 where it looks like the two async-mutex async-rwlock deps should be moved to be only included when the async flag is enabled

  1. maybe remove the get_or_set_with and try_get_or_set_with

I don't remember exactly, but I believe I did it with only these two methods because of the annoyances of dealing with async blocks and incorporating them with the macros, and this simplified the problem in some way (can't remember exactly - maybe it was just to make macro integration easier in general). Either way - I think it's a convenient API worth keeping since the cache type has the opportunity to handle the get/set more efficiently than a user calling get and set separately. This also has to do with the next point because all current caches implement Cached, but only optionally implement AsyncCached when async feature is enabled.

  1. have two set of exactly same methods, in two different traits, for —the new— AsyncCached and Cached

I think I see where you're coming from here based on your initial 3 points, but I think you may be a little off here. Lets go back to your initial points quick:

asyncable is a property of a store, a store can be async and sync at the same time (like filesystem), for some stores, it doesn't make sense to have the async (like in-memory stores, like hashmap)

Sort of... in our current state, all the cache stores are really not async since there's no IO or anything about them that needs to be "async". What makes them usable interchangeably between sync and async is only the synchronization wrapper used, either std::sync locks/mutex or async_std's async-mutex/async-lock. So "asyncable is a property of a store" is a correct statement, but one that isn't currently what the separation of the Cached and CachedAsync traits "means". Note that all current caches implement Cached, but only optionally implement AsyncCached when async feature is enabled. The only thing "async" about them being that they await the block of user code passed to them.

It might make sense to introduce two new traits that cover this, where the actual implementation differs between sync and async. Maybe something along the lines of IOCached and AsyncIOCached. In practice, this would mean a cache type (like file-system or redis) would have at least two implementations (one sync and one async) and possibly two separate concrete types (depending on whether the internal fields are different, e.g. sync vs async redis connections).

  1. add a new property to cached macro, like async = true (and default is false)

We already handle automatically generating sync vs async code based on the "asyncness" of the function that we're annotating. So it wouldn't hurt to add this as an optional flag for you to do things like force an async functions to access its cache synchronously, but I don't think it's something people would want to use.


To summarize my thoughts on this:

omid commented 2 years ago

Thanks.

where it looks like the two async-mutex async-rwlock deps should be moved to be only included when the async flag is enabled

That's one of the reasons that I'm saying we need to remove async. In Rust, AFAIK, you cannot say, load async-mutex and async-rwlock only if both async and proc-macro are enabled.

Maybe something along the lines of IOCached and AsyncIOCached

Yes, something similar is exactly what I also think of. I had some other naming in my mind, like:

  1. Cached for basic/minimum sync methods
  2. AsyncCached for basic/minimum async methods (can be hidden behind async flag!)
  3. CachedHelper, for synced get_or_set_with and try_get_or_set_with, to accept sync fn
  4. AsyncCachedHelper, for asynced get_or_set_with and try_get_or_set_with to accept async fn (can be hidden behind async flag!)
  5. JustAsyncCachedHelper, for asynced get_or_set_with and try_get_or_set_with to accept synced fn and async_get_or_set_with and try_async_get_or_set_with to accept async fn (can be hidden behind async flag!)

(We can merge number 3, 4 and 5, in different circumstances. But in that case, it may be better if we remove the async feature.)

Some stores may need to impl #1, #3 and #4. (just synec ones, like the existing impls) Some others need to impl #2, #5. (just asynec ones, which cannot be sync) And some others may need to impl #1, #2, #4. (for both sync and async)

(And in case we remove the helpers, it will be clearer and will simplify things a lot :D)

We already handle automatically generating sync vs async code based on the "asyncness" of the function that we're annotating.

I know about asyncness, but they are not the same. For example, when I have a Sized cache, which is sync. I don't want to call the async version of it when calling it in an async function of mine! Actually, there shouldn't be any async version for Sized cache.

What I'm saying is that the following code shouldn't call the async version of the cache.

#[cached]
async fn cached_sleep_secs(secs: u64) {
    sleep(Duration::from_secs(secs)).await;
}

ps: Your last question is kinda out of scope of this topic, async stores. Let's talk about that somewhere else :D

omid commented 2 years ago

Thanks @jaemk for version 0.31 ❤️
So I think because of the two new traits, and the new io_cached macro, we can close this.