ZiggyCreatures / FusionCache

FusionCache is an easy to use, fast and robust hybrid cache with advanced resiliency features.
MIT License
1.92k stars 97 forks source link

Feature events #18

Closed JoeShook closed 3 years ago

JoeShook commented 3 years ago

@jodydonetti what do you think of this code to hook the MemoryCache's various eviction reasons.

jodydonetti commented 3 years ago

Hi Joe, I'll take a look tonight and let you know!

jodydonetti commented 3 years ago

Hi, sorry for being late, I now started looking at this pr.

@jodydonetti I get different results between subscribing to cache.Events.Hit and cache.Events.Memory.Hit. I would expect the results to be the same. My experiment is with only a MemoryCache and not Distributed cache and FailSafe enabled. The more I experiment with the Unittest the more unsure I am of what the right answer is. I think this just need a bit more work or I am missing something.

Ok I'm looking into it, will let you know asap.

In general I see you've tried to add support for internal evictions, too (like Capacity, etc): I was unsure to add those, even though I understand they can be interesting metrics to know, mostly because they usually are not what people may expect. An example is that the actual expiration - which I see you've added as well - is in reality not executed until later on, and not when the entry expires (this is because of how the standard MemoryCache internally works). Another example is that if you put something into the cache with a 5 min logical duration, but with fail-safe enabled and a FailSafeMaxDuration of 2 hrs, the cache entry will actually expire after 2 hrs, not 5 min. What would you expect to happen here?

I think in the end my question for you is: how important are these specific additional metrics (Expired/Capacity/Replaced/Eviction) to you, knowing that they may not be what they seem to be? Let me know what you think.

jodydonetti commented 3 years ago

Meanwhile, I'm making a couple of small fixes anyway which I'll push in a moment.

JoeShook commented 3 years ago

So if a consumer does not enabled Fail-Safe then the evictions are as informative as they are if they used MemoryCache directly.

I think the eviction hooks would be most useful for out of the ordinary behavior when troubleshooting strange cache behavior. I think most would always use the FusionCache fail-safe feature. But maybe sometimes they would not. Another reason I added it was I did not want to explain to someone why I didn't add it. :)

I really don't know what to expect from these eviction notifications. Mostly I have seen them in other code so I assumed someone would care to see them.

I worked a little harder at removing much of the Eviction code and moving it to the "plugin". I committed those changes. Also I have a working playground repo to experiment with the metrics plugin ideas (not official plugins yet). I still need to create a sample app to show how I am actually using it. The ETW style metrics can be seen [here] (https://github.com/JoeShook/FusionCacheMetricsPlayground/blob/main/src/JoeShook.FusionCache.EventCounters.Plugin/FusionCacheEventSource.cs). It is my go to metrics plugin at the moment. The other one if the solution is not functional. But again I need to get a sample app in place to show how I am using it.

So now the only thing left in Fusion cache for wiring the PostEvictionDelegate are the following:

FusionCacheEnteryOptions.cs


if (PostEvictionCallbacks.Any())
{
     res.RegisterPostEvictionCallback(PostEvictionCallbacks.First()?.EvictionCallback);         
}

The usage so far looks like this:

            services.AddSingleton<IMemoryCache>(addressMemoryCache);

            services.AddSingleton<IFusionCache>(serviceProvider =>
            {
                var logger = serviceProvider.GetService<ILogger<FusionCache>>();

                var fusionCacheOptions = new FusionCacheOptions
                {
                    DefaultEntryOptions = new FusionCacheEntryOptions
                    {
                        Duration = TimeSpan.FromSeconds(queryApiSettings.CacheInSeconds),
                        Priority = CacheItemPriority.High
                    }
                    .SetFailSafe(true, TimeSpan.FromSeconds(10))
                    .SetFactoryTimeouts(TimeSpan.FromMilliseconds(500), TimeSpan.FromSeconds(10))
                };

                // Future Plugin for hooking metrics ???
                var metrics = new FusionCacheEventSource("address", addressMemoryCache);
                var fusionCache = new FusionCache(fusionCacheOptions, addressMemoryCache, logger);
                // metrics.Wireup(fusionCache);  // glue to wire events
                metrics.Wireup(fusionCache, fusionCacheOptions); // glue to wire events and PostEvictionDelegate

                return fusionCache;
            });
jodydonetti commented 3 years ago

I think the eviction hooks would be most useful for out of the ordinary behavior when troubleshooting strange cache behavior. I think most would always use the FusionCache fail-safe feature. But maybe sometimes they would not.

Makes sense.

Another reason I added it was I did not want to explain to someone why I didn't add it. :)

Ahah I feel you 😄

I really don't know what to expect from these eviction notifications. Mostly I have seen them in other code so I assumed someone would care to see them.

That was my initial thought as well, so I stopped for a moment and wondered if that would really be the case.

On the other hand, even though these eviction events may be used for other things and maybe have specific expectations about the timing of their arrival, here we are talking about metrics and in this regard timing is not that important, but mostly the volume of events, like "how many evictions because of capacity there have been" etc, so it may make sense to have those. At most I may add a note in the docs pointing out the expectations about evictions timings.

I'll play with it a little more and push a commit soon, so you may tell me what you think about it, sounds right?

I worked a little harder at removing much of the Eviction code and moving it to the "plugin". I committed those changes. Also I have a working playground repo to experiment with the metrics plugin ideas (not official plugins yet). I still need to create a sample app to show how I am actually using it. The ETW style metrics can be seen [here] (https://github.com/JoeShook/FusionCacheMetricsPlayground/blob/main/src/JoeShook.FusionCache.EventCounters.Plugin/FusionCacheEventSource.cs).

Whoa, that's a hell of a lot of nice work there 👏, I can see that as full fledged plugin very soon!

So now the only thing left in Fusion cache for wiring the PostEvictionDelegate are the following:

FusionCacheEnteryOptions.cs


if (PostEvictionCallbacks.Any())
{
     res.RegisterPostEvictionCallback(PostEvictionCallbacks.First()?.EvictionCallback);           
}

The usage so far looks like this:

            services.AddSingleton<IMemoryCache>(addressMemoryCache);

            services.AddSingleton<IFusionCache>(serviceProvider =>
            {
                var logger = serviceProvider.GetService<ILogger<FusionCache>>();

                var fusionCacheOptions = new FusionCacheOptions
                {
                    DefaultEntryOptions = new FusionCacheEntryOptions
                    {
                        Duration = TimeSpan.FromSeconds(queryApiSettings.CacheInSeconds),
                        Priority = CacheItemPriority.High
                    }
                    .SetFailSafe(true, TimeSpan.FromSeconds(10))
                    .SetFactoryTimeouts(TimeSpan.FromMilliseconds(500), TimeSpan.FromSeconds(10))
                };

                // Future Plugin for hooking metrics ???
                var metrics = new FusionCacheEventSource("address", addressMemoryCache);
                var fusionCache = new FusionCache(fusionCacheOptions, addressMemoryCache, logger);
                // metrics.Wireup(fusionCache);  // glue to wire events
                metrics.Wireup(fusionCache, fusionCacheOptions); // glue to wire events and PostEvictionDelegate

                return fusionCache;
            });

The overall approach seems correct to me except for the part where you hook into the default entry options' post eviction callbacks: the reason I'm saying this is because the default entry options will be used only when calling the cache methods' overloads with the lambda, like options => options.SetDuration(...) but it someone would call it directly with a FusionCacheEntryOptions object the callbacks would not be in place.

Let me give it a try, I think I know how to align everything with the overall shape of FusionCache, considering all these edge cases and whatnot.

I'll get back to you with a commit later today, so you can tell me what you think.

jodydonetti commented 3 years ago

Looking at the EvictionReason enum here I see these possible values:

Since I would like to support the metrics scenario stated above, but don't want to be too much tied to the internals of the (current) MemoryCache impl, I'm thinking about this:

So my current idea is to expose some of the new events you created, but instead of doing that on the FusionCacheBaseEventsHub (which is inherited by memory, distributed, etc) I'll do that on the FusionCacheMemoryEventsHub which is specific for the memory layer.

Then the hook to the PostEvictionCallbacks would be just internal (no need to expose that directly to the outside world, we'll use the events for that) and internally dispatch some of those to the high-level events: Capacity would flow to the related event, Expired would do the same, Removed would be ignored (because of the already existing, more specific event) and the others would probably be ignored since they are too specific to the internal impl.

I'll push a commit in the afternoon as a first draft.

jodydonetti commented 3 years ago

While trying to implement it, I'm thinking about it a little more deeply: maybe the best (and simplest) way to approach this would be to just have 1 additional event in the memory-specific events class (FusionCacheMemoryEventsHub), like Eviction. That would be that way to make those events flow from inside the memory layer up to the outside world.

I'm playing with it right now, will let you know how it plays out.

jodydonetti commented 3 years ago

Ok I've done a first commit with a possible impl.

I tried to reduce changes' impact and the exposure of memory specific stuff at the bare minimum, while hopefully providing the desired result. Based on your own previous work with the additional events, I've simply reduced them to only one (Eviction) that maps directly with the MemoryCache's PostEvictionCallback dance, including passing a reason param of exactly the same type to ease usage for people already knowing that approach.

A couple more points:

Finally, the way it can be used would be something like this:


cache.Events.Memory.Eviction += (sender, e) =>
{
    // MAYBE NEEDED, MAYBE NOT
    var cache = (IFusionCache)sender;

    // DO STUFF BASED ON WHAT YOU NEED + THE REASON
    switch (e.Reason)
    {
        case EvictionReason.Expired:
            // DO YOUR THING HERE...
            break;
        case EvictionReason.Capacity:
            // DO YOUR THING HERE...
            break;
        // case etc...
    }
};

And of course this would be done, very soon 😎, in a plugin's Start/Init method or whatever will be called.

What do you think?

jodydonetti commented 3 years ago

I also found why the tests were not passing. The reason it's really highly technical, and that is that I'm super dumb 🤣 :

image

I'm passing the bool "is valid" as the value for the param "is stale", therefore inverting the meaning of it.

Fixing right now, will push a commit in a moment.

jodydonetti commented 3 years ago

Here we go 🎉

image

I've also added a couple of additional delay/actions in the tests, to include expiration of throttled entries (so, like, double expiration) and it seems to work.

JoeShook commented 3 years ago

Wow I have a bunch to look at today. I read through everything above and I agree with it all. I kept trying to craft an email explaining the "is valid" vs "is stale" semantics miss-match then I would run a test and things would pass where I didn't expect to pass. And it was late.

On Thu, May 27, 2021 at 7:20 AM Jody Donetti @.***> wrote:

Here we go 🎉

[image: image] https://user-images.githubusercontent.com/1010086/119842802-49470700-bf07-11eb-82fe-7d9e2f9fc6ea.png

I've also added a couple of additional delay/actions in the tests, to include expiration of throttled entries (so, like, double expiration) and it seems to work.

— You are receiving this because you authored the thread. Reply to this email directly, view it on GitHub https://github.com/jodydonetti/ZiggyCreatures.FusionCache/pull/18#issuecomment-849675927, or unsubscribe https://github.com/notifications/unsubscribe-auth/AALR4SL52UB7AXWM2HTTAJLTPZIMVANCNFSM45LEKJXA .

JoeShook commented 3 years ago

I see where the confusion is: GetOrSet: your factory in this case returns null, which is a perfectly valid value for the type string just like 0 is for the type int (also they are the default values for their respective types).

Ignoring the null test, I also have await cache.GetOrSetAsync<string>("foo", async _ => "Stuff");

And it also has two MISS's. As a consumer this would be my normal way to call fusion cache. I would expect a single MISS for that call but it issues two.

jodydonetti commented 3 years ago

As a consumer this would be my normal way to call fusion cache. I would expect a single MISS for that call but it issues two.

You are right, this makes total sense: maybe you subscribed to the memory layer events instead of the main one? I mean cache.Events.Memory.Miss instead of cache.Events.Miss ?

To get high level events (let's say the "logical" events) you should subscribe to cache.Events, because if you hook into memory or distributed events you are hooking into lower level parts of FusionCache, so you'll see the inner workings, like the double check around the locking part for example, which may cause 2 misses, or 1 miss and 1 hit in case somebody set a value before the lock has been acquired.

This should also be clearly specified in the docs I'll write, so thanks for pointing this out.

Let me know if this solves your issue.

jodydonetti commented 3 years ago

... or maybe not 🤔 ?

I've just looked better at your test code and you seem to be hooking, in fact, at the main events.

I'll have a deeper look at it asap.

JoeShook commented 3 years ago

... or maybe not 🤔 ?

I've just looked better at your test code and you seem to be hooking, in fact, at the main events.

I'll have a deeper look at it asap.

I have two tests in there and each one I hook to a different hub.

jodydonetti commented 3 years ago

Hi, I've been able to spot where the issue was: in case a GetOrSet did not found the entry and called the factory, I was not calling the corresponding MISS event (in case of no entry) or the HIT STALE (in the other case).

Also, at the end of that part of the logic, when internally saving the entry back to the cache, I was doing a HIT event instead of a SET event.

Now those cases should respect the expectations.

I've also restored and expanded the original "combo" test, and I've added a couple of specific tests for the GetOrSet method, for both this cases:

In the next days I will add more tests to cover other common use cases.

What do you think?

JoeShook commented 3 years ago

This branch is looking good. My personal experiments are going very well. I think this feature is ready to make it's way into the main branch.