MichaCo / CacheManager

CacheManager is an open source caching abstraction layer for .NET written in C#. It supports various cache providers and implements many advanced features.
http://cachemanager.michaco.net
Apache License 2.0
2.33k stars, 456 forks

Async support suggestion #184

Open srivathsah opened 6 years ago

srivathsah commented 6 years ago

This is with regard to https://github.com/MichaCo/CacheManager/issues/89 (Async support). I suggest using ValueTask in place of Task to support async/await, so that all synchronous code paths are covered. And since ValueTask is effectively a superset of Task, asynchronous code is covered as well, even though only a few libraries support it so far.

In summary: methods that are usually synchronous but can sometimes be asynchronous could return ValueTask instead of Task.

MichaCo commented 6 years ago

Hi, the primary reason I haven't added async yet is that only very few use cases would benefit from it, and only the Redis client really supports async. Everything else would run synchronously anyway. And, most importantly, it would double the APIs on ICacheManager, which causes a lot of work, like writing all the unit tests etc... I'll add it to the backlog for now.

petriashev commented 6 years ago

You could introduce a new IAsyncCache abstraction with the new methods and implement providers that support async.

AccessViolator commented 6 years ago

I'm actually VERY surprised to see that there's no async support at all in the library. While I agree that async makes little sense for in-proc caches, it may be useful for remote ones. But that's still in the "nice to have" category. What I believe is an absolute must-have is support for async value factories. Without them I can't use the factory versions of methods like GetOrAdd() if I want to retrieve a missing value from a database asynchronously. I have to first check whether the value is in the cache; then, if it's not, do an async db call to get the value and add it to the cache, and by that time another thread may have already added the value.

On this note, a question (I didn't want to dig through the sources): I suppose that while a long-running value-factory lambda is in progress, all other threads trying to retrieve that value from the cache block waiting for it. Is that right? Or is there nothing preventing multiple threads from running factories to retrieve the same value, with each of them then updating the value in the cache? If threads do block, it could make a noticeable difference in thread-pool usage on a busy website with many short-lived cache values, compared to threads being able to asynchronously wait for the first thread to retrieve the missing value.

MichaCo commented 6 years ago

@AccessViolator I totally agree regarding the async factories. I had that planned already, and I know it is painful to use right now with sync-over-async etc... Anyway, it will get enhanced soon™ (when I find some time to implement it).

Regarding locking for value factories: no, there is nothing preventing multiple threads from running the "expensive" factory. It would be your job to implement mechanisms to prevent thundering-herd problems and the like. That's way out of scope for what CacheManager is supposed to do. In distributed systems, this can be very complicated to achieve.
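For the single-process case, such a guard can be layered on top of CacheManager with a per-key SemaphoreSlim. A minimal sketch, assuming CacheManager.Core, reference-type values, and a made-up helper name; it does not coordinate across processes or machines:

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading;
using System.Threading.Tasks;
using CacheManager.Core;

public static class CacheLockExtensions
{
    // One gate per cache key; in-process only.
    private static readonly ConcurrentDictionary<string, SemaphoreSlim> Gates =
        new ConcurrentDictionary<string, SemaphoreSlim>();

    public static async Task<T> GetOrAddLockedAsync<T>(
        this ICacheManager<T> cache, string key, Func<Task<T>> factory)
        where T : class
    {
        var value = cache.Get(key);
        if (value != null) return value;

        var gate = Gates.GetOrAdd(key, _ => new SemaphoreSlim(1, 1));
        await gate.WaitAsync();
        try
        {
            // Re-check: another caller may have filled the key while we waited.
            value = cache.Get(key);
            if (value == null)
            {
                value = await factory();
                cache.Add(key, value);
            }
            return value;
        }
        finally
        {
            gate.Release();
        }
    }
}
```

With this shape, only one caller per process runs the factory; the others asynchronously await the semaphore instead of blocking thread-pool threads.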

That being said, you can implement a simple distributed lock with CacheManager by using the Add method and e.g. Redis. Add only adds a key if it doesn't exist, so if multiple threads or processes try to do the same thing, only one will succeed... You can "abuse" that to implement a distributed lock. Then it is up to you to handle timeouts and errors in the process which got the lock, so that you never lock something forever... That's where it gets more complicated and specific to your use cases, and that's why I'm not going to implement it out of the box in CacheManager.
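The Add-based lock described above could be sketched roughly like this. The key naming, timeout, and helper are illustrative, not CacheManager API, and the release check is deliberately best-effort, which is part of the complexity mentioned:

```csharp
using System;
using System.Threading.Tasks;
using CacheManager.Core;

public static class DistributedLockSketch
{
    public static async Task<bool> TryRunLockedAsync(
        ICacheManager<string> cache, string resource, Func<Task> work)
    {
        var lockKey = "lock:" + resource;
        var token = Guid.NewGuid().ToString();

        // Add succeeds only if the key does not exist yet; the absolute
        // expiration keeps a crashed holder from locking the resource forever.
        var acquired = cache.Add(new CacheItem<string>(
            lockKey, token, ExpirationMode.Absolute, TimeSpan.FromSeconds(30)));
        if (!acquired)
            return false; // another thread/process holds the lock

        try
        {
            await work();
        }
        finally
        {
            // Release only if we still own the lock (it may have expired and
            // been re-acquired). Get-then-Remove is not atomic, so this is
            // best-effort only.
            if (cache.Get(lockKey) == token)
                cache.Remove(lockKey);
        }
        return true;
    }
}
```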

The second part of the question was: can multiple threads update the same key at the same time? Yes and no. The Update method makes sure that you always work with the most recent value. If there was a conflict, the factory gets invoked again with the new value. So yes, conflicts will be handled. If you use GetOrAdd, only one thread will ever add the key. Every other thread might still run the factory, but then fails to add the key and on the next retry will Get the key from the cache...
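As a small illustration of that conflict handling (the handle choice and values are arbitrary):

```csharp
using CacheManager.Core;

var cache = CacheFactory.Build<int>(s => s.WithDictionaryHandle());
cache.Add("counter", 0);

// Update re-reads the current value and re-invokes the lambda if another
// writer changed the key in between, so concurrent increments are not lost.
cache.Update("counter", v => v + 1);
```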

All of that has nothing to do with async, though.

Hope that all makes sense ;)

AccessViolator commented 6 years ago

Thanks for the clarification.

For now I think I'll use extension methods similar to those suggested in another thread, e.g.

        public static async Task<T> GetOrAddSlidingAsync<T>(
            this ICacheManager<T> self,
            string key,
            short minutes,
            Func<string, Task<T>> factory)
        {
            T value;
            if ((value = self.Get<T>(key)) == null)
            {
                value = await factory(key);
                self.Add(new CacheItem<T>(key, value, ExpirationMode.Sliding, TimeSpan.FromMinutes(minutes)));
            }
            return value;
        }

MichaCo commented 6 years ago

Yup, that looks fine. Just a small thing: you don't have to use Get<T> if self is already an ICacheManager<T>; just use Get, which is faster because it doesn't try to cast. Also, self.Add can, in theory, return false, which would mean that between your .Get(..) == null check and the Add, another process or thread already added the same key. This can totally happen, especially if the factory takes a while to run... If Add returns false, you'd have to retry.
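A retry-aware variant of the extension method above, incorporating both suggestions (same assumptions as the original snippet, plus a reference-type constraint so the null check is valid):

```csharp
using System;
using System.Threading.Tasks;
using CacheManager.Core;

public static class CacheManagerExtensions
{
    public static async Task<T> GetOrAddSlidingAsync<T>(
        this ICacheManager<T> self,
        string key,
        short minutes,
        Func<string, Task<T>> factory)
        where T : class
    {
        while (true)
        {
            // Get (not Get<T>) avoids the cast.
            var value = self.Get(key);
            if (value != null) return value;

            value = await factory(key);
            if (self.Add(new CacheItem<T>(
                key, value, ExpirationMode.Sliding, TimeSpan.FromMinutes(minutes))))
                return value;

            // Add returned false: another thread/process won the race.
            // Loop and pick up the winner's value via Get.
        }
    }
}
```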

AccessViolator commented 6 years ago

Thanks.

Re retrying: I think in most cases, for factories like db retrieval, it's pretty safe to assume that if another thread added the key between the null check and Add using the same factory, the value would be the same anyway :)

silashansen commented 5 years ago

Any progress on the async valueFactory support? 😄

andreymir commented 5 years ago

@MichaCo we need async methods to use the cache in a .NET Core application. We use the Redis cache handler, and under a load of 20 rps the application hangs on LoadScripts(). We switched back to .NET 4.6.2 as a temporary fix.

MichaCo commented 5 years ago

@andreymir Do you have a way to reproduce this?

I don't think that has anything to do with .NET Core. The problem might be that loading the scripts blocks or tries to load multiple times. So that's probably another issue we could improve, basically by using a distributed lock to determine whether someone is already loading the scripts. But I'm not sure. It could be other stuff, too.

andreymir commented 5 years ago

@MichaCo, I created a sample: https://github.com/andreymir/load-demo

To run the application as net462, execute dotnet run -c Release -f net462. To run it as netcoreapp, execute dotnet run -c Release -f netcoreapp2.1. The application starts on http://localhost:5000.

To create the load you need nodejs and npm. Run npm install and then npm run load. It will generate a load of 3 rps on http://localhost:5000 for 5 minutes (see load.yml).

To reproduce the issue, try to restart the application during the load. Under the net462 configuration the application restarts fine. Under netcoreapp2.1 the application hangs and does not respond.

You also need redis running on localhost:6379 for the cache.

andreymir commented 5 years ago

@MichaCo, were you able to reproduce the issue with the demo I created?

I want to try adding async methods to the ICache interface. Do you have any suggestions on how to approach this? Would it be better to create a new interface like IAsyncCache?

MichaCo commented 5 years ago

@andreymir Hey, thanks for your effort to create a sample repository. I haven't had time to test it yet; too many other things to do.

Regarding the async support: yes, this would add new methods to ICache and all base implementations. It will be a pretty big change across all packages, especially in Redis, and it would involve a lot of testing (it would also most likely double the number of unit tests). I've created a feature branch for async support. If you really want to start working on this, please send PRs for that branch.

gegana commented 5 years ago

I tested this on a production service with heavy load (2 million requests within a 30-minute timeframe). The threads do lock, and eventually the service just puked >.< I didn't take a deep look into which operations block the threads, but I really like the idea, and it definitely needs some optimization.

HudsonAkridge commented 5 years ago

Any updates on this, by chance? We've got a pretty large need for true async support, if possible. It would help satisfy a core requirement of our consumers.

We only use Async capable cache handlers.

mjebrahimi commented 4 years ago

I made an unsuccessful attempt at async support for CacheManager. It is difficult, time-consuming, and tedious work.

@MichaCo I'll leave it here for you or anyone else who has the opportunity and the patience to complete it.

I have encountered insurmountable challenges like:

  • Nested async lock: I used SemaphoreSlim for AsyncLock, but it does not support nested locking, unlike the regular lock statement
  • Out parameters: async methods do not support ref/out parameters, which are used extensively in the Try... methods

https://github.com/mjebrahimi/CacheManager/tree/mjebrahimi/asyncsupprot

towanjust commented 3 years ago

Hi, are there any updates on async support?

How soon will the implementation be available? I'm especially interested in this on .NET Core.

rahulbpatel commented 2 years ago

For now I think I'll use extension methods similar to those suggested in another thread, e.g.

        public static async Task<T> GetOrAddSlidingAsync<T>(
            this ICacheManager<T> self,
            string key,
            short minutes,
            Func<string, Task<T>> factory)
        {
            T value;
            if ((value = self.Get<T>(key)) == null)
            {
                value = await factory(key);
                self.Add(new CacheItem<T>(key, value, ExpirationMode.Sliding, TimeSpan.FromMinutes(minutes)));
            }
            return value;
        }

Yup, that looks fine. Just a small thing: you don't have to use Get<T> if self is already an ICacheManager<T>; just use Get, which is faster because it doesn't try to cast. Also, self.Add can, in theory, return false, which would mean that between your .Get(..) == null check and the Add, another process or thread already added the same key. This can totally happen, especially if the factory takes a while to run... If Add returns false, you'd have to retry.

This is an old comment, but would it work to just use AddOrUpdate instead?

            T value;
            if ((value = _cacheManager.Get<T>(key)) == null)
            {
                value = await valueFactory(key);
                _cacheManager.AddOrUpdate(key, value, (x) => value);
            }
            return value;

DmitriyBobrovskiy commented 2 years ago

I made an unsuccessful attempt at async support for CacheManager. It is difficult, time-consuming, and tedious work.

@MichaCo I'll leave it here for you or anyone else who has the opportunity and the patience to complete it.

I have encountered insurmountable challenges like:

  • Nested async lock: I used SemaphoreSlim for AsyncLock, but it does not support nested locking, unlike the regular lock statement
  • Out parameters: async methods do not support ref/out parameters, which are used extensively in the Try... methods

https://github.com/mjebrahimi/CacheManager/tree/mjebrahimi/asyncsupprot

A pretty long time has passed already, but maybe you remember your decisions back then. I didn't dive too deep, but I just opened BaseCacheManager.GetOrAdd and saw that you hit a deadlock there. My question is: why didn't you avoid the local function and the out parameters? If your solution were shaped like that, you wouldn't face the deadlock issue. [screenshot of a suggested implementation] Or am I missing something?
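The shape being suggested, returning the value directly instead of using out parameters or a local function, could look roughly like this. This is a hypothetical sketch, not code from the linked branch:

```csharp
using System;
using System.Threading.Tasks;
using CacheManager.Core;

public static class AsyncGetOrAddSketch
{
    public static async Task<T> GetOrAddAsync<T>(
        this ICacheManager<T> cache, string key, Func<string, Task<T>> factory)
        where T : class
    {
        var existing = cache.Get(key);
        if (existing != null) return existing;

        var created = await factory(key);
        // Add returns false if another writer raced us; fall back to Get so
        // every caller observes the value that actually won.
        return cache.Add(key, created) ? created : cache.Get(key);
    }
}
```

Because the value flows through the Task<T> result, no ref/out parameters are needed, which sidesteps both limitations mentioned in the quoted comment.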

DmitriyBobrovskiy commented 2 years ago

And about the lock: why do you need a nested lock? I see you took the implementation of AsyncLock from EF Core, but maybe it would be better to take the implementation from the AsyncEx library (or even add that NuGet package as a dependency): AsyncLock. I'm not very good at this, but the construction:

        using (await _mutex.LockAsync())
        {
            await Task.Delay(TimeSpan.FromSeconds(1));
        }

looks promising as something we could build on or inherit from. What do you think about that?
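For reference, the nesting problem with a SemaphoreSlim-based AsyncLock can be shown in a few lines: a second wait on the same instance never acquires. A zero timeout is used here so the demo doesn't hang:

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

class SemaphoreReentrancyDemo
{
    static async Task Main()
    {
        var sem = new SemaphoreSlim(1, 1);
        await sem.WaitAsync();

        // Unlike the lock statement, SemaphoreSlim has no thread affinity and
        // no reentrancy: with an infinite timeout this second wait would
        // block forever, which is exactly the nested-lock deadlock.
        bool reentered = await sem.WaitAsync(TimeSpan.Zero);
        Console.WriteLine(reentered); // False

        sem.Release();
    }
}
```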