quarkusio / quarkus

Quarkus: Supersonic Subatomic Java.
https://quarkus.io
Apache License 2.0
13.43k stars 2.58k forks source link

Allow access to #asMap method from Caffeine cache #38540

Open ebullient opened 5 months ago

ebullient commented 5 months ago

Description

I have a case where I would like to be able to "computeIfPresent" (a condition not possible with current Quarkus Cache annotations.

We can do cache.as(CaffeineCache.class) ... all I would need is for this: cache.as(CaffeineCache.class).asMap() to exist

Implementation ideas

Caffeine already has the asMap() method, just need to add/expose it via CaffeineCache

cc: @mkouba

quarkus-bot[bot] commented 5 months ago

/cc @gwenneg (cache)

mkouba commented 5 months ago

Maybe, it would make more sense to add "computeIfPresent" and "compute" methods to the io.quarkus.cache.Cache interface? :thinking:

@gwenneg WDYT?

ben-manes commented 5 months ago

Since quarkus uses an AsyncCache, the mapping is to a CompletableFuture. Is that what you want to compute on, as it could be in-flight?

The synchronous view’s asMap might be what is wanted, as it uses a retry loop and waits outside of the computation.

The adapter uses wrappers for conversions, e.g. nulls can be cached by Quarkus. What does that mean for a computation and would you need to provide a map view that hides the conversations?

Is any of that the api you’d want in your shared Cache interface? It’s actually a tricky api design choice.

mkouba commented 5 months ago

Since quarkus uses an AsyncCache, the mapping is to a CompletableFuture. Is that what you want to compute on, as it could be in-flight?

Quarkus Cache interface does not use CompletableFuture but Uni. There are two method variants for the "computing" function. get() is using a sync computing function and getAsync() is using an async variant based on Uni. Note that the names are not ideal because in both cases an async type is returned. So the "async" in the name means async computation. Ideally, we would have two get() methods, one accepting something like SyncComputation and the second AsyncComputation, and not java.util.function.Function. But this ship has sailed long ago :shrug:

The adapter uses wrappers for conversions, e.g. nulls can be cached by Quarkus. What does that mean for a computation and would you need to provide a map view that hides the conversations?

AFAIK Quarkus does not store null but a special constant: io.quarkus.cache.runtime.NullValueConverter.NullValue.INSTANCE. And yes, we would need to hide this from the user.

Is any of that the api you’d want in your shared Cache interface? It’s actually a tricky api design choice.

In any case, I assume that we would need to expose two variants for compute() and computeIfPresent() too.

ben-manes commented 5 months ago

The point being you have an async value as the mapping. If you want an atomic computation, which is likely, that adds complexity to consider for what is the atomic scope and where blocking occurs.

The cached value might be a null sentinel or the CaffeineComputationThrowable error. The implication is that hiding it is more work than the request or guessed.

If promoted to the general interface then weaker atomicity might be required by another cache implementation.

All of this was to highlight the unexpected complexity of the ask so you can avoid unpleasant surprises if you embark on the feature.

mkouba commented 5 months ago

If promoted to the general interface then weaker atomicity might be required by another cache implementation.

Yes, we might end up adding these methods to the io.quarkus.cache.CaffeineCache interface only.

All of this was to highlight the unexpected complexity of the ask so you can avoid unpleasant surprises if you embark on the feature.

Yes, I'm pretty sure it would be complex but that should not prevent us from discussing the topic, right? ;-)

My point is - exposing AsyncCache.asMap() or even AsyncCache.synchronous().asMap() might be tricky as well because we hide the Caffeine API for some reason. Otherwise we would not need our abstraction and just make the underlying AsyncCache injectable. But take my notes with a grain of salt - I'm not an expert in this area.

ben-manes commented 5 months ago

Yep, nothing against discussing and just remarking on gotchas. For example that null mappings are confusing for a Map compute as an input or output, since null no longer just means absent or to remove.

I don’t know when it’s best to say the complexity exceeds the abstraction and to use the vendor’s api for advanced cases. This might be one of your cutoffs.

ebullient commented 5 months ago

In my head: this is an intentional use of the vendor API .as(CaffeineCache..).asMap .. not general case.

I confess to not knowing a lot about what I'm doing in general. If there is a better way to use the Cache synchronously and still have config and annotations, I'd be all for that (we have no docs for that case). I'm not using async APIs otherwise, so dealing with CompletableFutures and Unis (while doable) is also... annoying.

ben-manes commented 5 months ago

I think it was a very reasonable ask and expectation that it would be a simple change. I stepped in simply because I helped the team previously coerce Caffeine to get the behavior they wanted from the integration. They wanted async computations, lock timeouts, null values, and minimal blocking of the underlying thread (I think you use an event loop?). So the adapter is more advanced than you’d think and @gwenneg did a nice job juggling all of that.

Caching nulls is probably the hardest quirk so ConcurrentHashMap disallows it. Caffeine just punts that by suggesting a users incorporate a sentinel value (i.e. Optional wrapper), so it’s a proper value to inspect on for api callbacks. It makes negative caching explicit and as a less common usage pattern we get little pushback. But once abstractions pretend it’s a cached null value it makes the Map interface very difficult to expose.

I would guess that using Caffeine directly and skipping the integration might make the most sense. I also don’t know what I’m talking about by having little direct knowledge on your team’s best practices or if you can reuse any of the configuration, monitoring, etc that comes for free in your set up.

ebullient commented 2 months ago

I did bypass quarkus integration and just use Caffeine directly.

Do we want to leave this open as a someday/maybe, or close it?