ben-manes / caffeine

A high performance caching library for Java
Apache License 2.0
15.71k stars 1.59k forks source link

AsyncCache uses the configured ticker instead of the system ticker for stats when unbounded #1678

Closed chaoren closed 4 months ago

chaoren commented 4 months ago

Here's a failing test case:

  @Test
  public void ticker() {
    FakeTicker ticker = new FakeTicker();
    AsyncCache<String, String> cache =
        Caffeine.newBuilder().weakKeys().recordStats().ticker(ticker::read).buildAsync();
    var unused =
        cache.get(
            "key",
            (key, executor) -> {
              ticker.advance(5);
              return completedFuture("value");
            });
    assertThat(cache.synchronous().stats().totalLoadTime()).isEqualTo(5);
  }

cache.synchronous().stats().totalLoadTime() should be 5 exactly, but it seems to be using the system ticker instead of the one provided. It works correctly if you remove weakKeys(). Any other way of bounding the cache (e.g., maximumSize) will also introduce this problem.

ben-manes commented 4 months ago

I think this was to mirror Guava which uses Stopwatch for all stats with its default ticker. That is a bit surprising since CacheBuilder.ticker(t) says it is the "time source for this cache", whereas I tried to better qualify that as the "time source for use in determining when entries should be expired or refreshed". I didn't have a strong opinion so went with being closer to drop-in compatible, but forgot with UnboundedLocalCache by using the expiration's ticker instead of ignoring the builder option. wdyt?

chaoren commented 4 months ago

Wait, you're saying that the bounded cache behavior is the expected one? How is one supposed to test against cache stats then? I don't see a separate configuration option for stats ticker.

chaoren commented 4 months ago

Well either way, I think the behaviors between bounded and unbounded caches should be consistent at least. If we're not using the same ticker for expiration and stats, then there should be separate configuration options for both. I don't really see a reason to separate them though.

ben-manes commented 4 months ago

I suppose the argument is why would you need to tests for stats since it does not impact the business logic behavior? It would be kind of like mocking a HashMap. I don't think another configuration option is needed and you can pass a fake CacheStats upstream or invoke your custom StatsCounter directly.

I agree, the being inconsistent is a bug.

ben-manes commented 4 months ago

You might be able to find a discussion on the google internal java group. I obviously don't have access to that anymore, but I fondly recall reading through its wealth of historical content. I'd bet it came up there once or twice and someone on the Guava team responded.

By the way, thanks for testing these corner cases! I spent a huge amount of effort on testing and there's no way to stop those little bugs from creeping in.

chaoren commented 4 months ago

I suppose the argument is why would you need to tests for stats since it does not impact the business logic behavior?

I'm working on a wrapper around AsyncCache that works with Kotlin coroutines and handles cancellations differently. We want to make sure the stats are still accurate through the wrapper.

It would be kind of like mocking a HashMap.

I don't think so. It'd be like wanting to test something that uses a HashMap or a custom Map that wraps around HashMap.

Oh, #240 seems to be relevant here.

You might be able to find a discussion on the google internal java group.

I couldn't find a discussion like that after a cursory search. I'll just ask my team instead.

cpovirk commented 4 months ago

(We ended up digging up an old thread about this. Basically, we didn't see a need for users to be able to hook in for testing at this particular location. We could have let them do it anyway, but then we'd want to document and test it, and that felt like the wrong direction for something that we probably didn't really want to support.)

ben-manes commented 4 months ago

Thanks for the update, makes sense!

Caffeine decorates the loading function to make it record stats when enabled, where the statsTicker is only used for the loading time on a success or failure. If there is significant skew due to the semantics of your wrapper then you could take responsibility for stats by augmenting / filtering the CacheStats / StatsCounter with your overrides / corrections.

In your Kotlin wrapper, you might want to review coalescing-bulkloader-reactor which could easily be ported to use Flows. It's a very powerful optimization that appears hard at first, but is actually quite trivial to implement and leverage.

A reason for configuring the statsTicker could be if someday I am able to explore latency-aware caching. This is where the miss penalty differs across entries, e.g. analytics dashboard often have fast and slow loading charts. I had a neat idea a few years ago but couldn't wrangle enough trace data to avoid overfitting (@cpovirk I don't recall if we discussed this). I wanted to maintain an estimated normal distribution of the load penalty and the standard deviation from the mean. When an entry was loaded its score would be calculated as its deviation from the current average, e.g. a latency factor of 0.0 to 2.0. The TinyLFU decision would be modified to (frequency x latencyFactor) so it could decide between a (popular x fast) entry and a (unpopular x slow) entry, keeping the one with the highest value. The frequency multiplier would allow a slow unpopular entry to age to zero. There are many variations to try like maintaining the distribution in a sketch data structure, computing the running average miss penalty using estimated weighted moving average or exponential smoothing, using sampling periods (e.g. for the min/max so that outliers age out). This way one could optimize for the user perceived responsiveness (system percentiles) directly instead of indirectly through the hit rate. Unfortunately the lack a variety of trace data meant that I never got to explore the algorithmic choices, but its a fun thought experiment that I think could be useful.

cpovirk commented 4 months ago

Thanks for all the info! The latency-aware caching does sound fun, though I have no insights as usual :)

chaoren commented 4 months ago

Could we update the documentation on Caffeine#ticker to explicitly state that the supplied ticker is not used for stats?

ben-manes commented 4 months ago

It does say time source for use in determining when entries should be expired or refreshed which implies it is not used for stats. I'm fine being more explicit as a "Note that...", though you may want to do that in Guava then too as its less clear.