sksamuel / aedile

Kotlin Wrapper for Caffeine
Apache License 2.0
164 stars 15 forks source link

ThreadContextElements are lost in compute function of Caches #11

Closed JordanBerginSkedulo closed 11 months ago

JordanBerginSkedulo commented 1 year ago

Hello there, I've been migrating away from caffeine with blocking functions to this lovely library with non-blocking ones, but ran into an issue around ThreadContextElements.

We use some libraries that rely on thread locals (boo) but also use Kotlin Coroutines with ThreadContextElements (yay) to keep everything working. The issue I've run into is when using an instance of com.sksamuel.aedile.core.Cache, the ThreadContextElement is lost and so the compute function that worked with blocking functions don't work (without the workaround shown below).

Hopefully the following illustrates the situation sufficiently:

package com.example

import com.sksamuel.aedile.core.caffeineBuilder
import kotlinx.coroutines.ThreadContextElement
import kotlinx.coroutines.withContext
import kotlin.coroutines.CoroutineContext

val helloThreadLocal: ThreadLocal<String?> = ThreadLocal.withInitial { null }

class Hello: ThreadContextElement<String?> {
    companion object HelloKey : CoroutineContext.Key<Hello>
    override val key : CoroutineContext.Key<Hello> = HelloKey

    override fun updateThreadContext(context: CoroutineContext): String? {
        return helloThreadLocal.get().also {
            helloThreadLocal.set("hello")
        }
    }

    override fun restoreThreadContext(context: CoroutineContext, oldState: String?) {
        helloThreadLocal.set(oldState)
    }
}

suspend fun main() {
    val cache = caffeineBuilder<String, String?>().build()

    //begin coroutine with thread local context "hello" set
    withContext(Hello()) {

        //try to retrieve the hello thread local using cache
        val result = cache.get("Hello") {
            helloThreadLocal.get()
        }

        //this prints null because context is lost inside compute function :(
        println(result)

        //however, if we capture the context now and use it inside the cache's compute function...
        val callingContext = coroutineContext
        val resultUsingCallingContext = cache.get("Hello") {
            withContext(callingContext) {
                helloThreadLocal.get()
            }
        }

        //hello there!
        println(resultUsingCallingContext)
    }
}

The result printed is null, then hello. As you can see we are able to get around this by capturing the context of the caller and then use it inside the compute function, which seems to work!

I've noticed scope and dispatcher are configurable, I wanted to ask if it would be possible to either configure the cache to use the caller context (e.g. useCallingContext = true or something), or do it by default?

As a further note - I don't know if such a suggestion would make sense for compute functions provided when the cache is built (i.e when loading cache is used)? I am specifically asking about when you supply the compute function at the call site.

Thanks!

Edit: Oh and to clarify, if this is something you'd agree to add into this library I'm more than happy to submit a PR for it!

sksamuel commented 1 year ago

Hi.

The underlying library expects the compute function to return a CompletableFuture. So inside Aedile, we do this:

   suspend fun get(key: K, compute: suspend (K) -> V): V {
      return cache.get(key) { k, _ -> scope.async { compute(k) }.asCompletableFuture() }.await()
   }

I'm not sure how we can have the compute function use the same context as the caller. Perhaps something like this would work:

   suspend fun get(key: K, compute: suspend (K) -> V): V {
      val context = coroutineContext
      return cache.get(key) { k, _ -> CoroutineScope(context).async { compute(k) }.asCompletableFuture() }.await()
   }
JordanBerginSkedulo commented 1 year ago

Yes I did see a scope is created at build time of the cache, so it would require deferring scope creation to the get in order to work I believe!

I think it does add some complexity in that the scope used for loading caches would necessarily be different from the one in these get functions? And I'm not sure if I'm tunnel visioning my use that is preventing me from seeing how that might trip up how someone else would expect this library to behave? Like does it have any impact on structured concurrency? I'm a user of coroutines by no means an expert!

At the very least I imagine a middle ground might be a section in the readme, or an example which just highlights these ThreadContextElements get lost, and here's how you get around this?

Or just thinking out loud, an extension function that is explicit getInUsingCurrentContext? Actually thinking about it might do that for our code anyway 😂 .

Anyway thank you for your response!

sksamuel commented 1 year ago

I think we can add a config parameter to the cache builder to allow for the use of the calling coroutine context, using the code I pasted up.

I think that would solve your problem.

JordanBerginSkedulo commented 1 year ago

So I've had a stab at it in https://github.com/sksamuel/aedile/pull/12, went for any function where compute is passed at the call site and made use of the + operator with the scope created from the configuration (so I think this preserves things like the choice of dispatcher. Give me a shout either way, and if you think a section in the readme would be wanted? Have a nice weekend!

sksamuel commented 11 months ago

I took your PR and adapted it, and released 1.3.0-RC1 if you want to give it a go.!

JordanBerginSkedulo commented 11 months ago

I took your PR and adapted it, and released 1.3.0-RC1 if you want to give it a go.!

Oh wow, I've been so busy lately this dropped right off my radar, I am very sorry for the radio silence.

I'll give it a go this week hopefully! You're a star!

JordanBerginSkedulo commented 11 months ago

Gave this a go and it does seem to work. The snippet of code above now works as expected. I'll give an update once I've tried it out in the various places we currently have workarounds in place :-)

JordanBerginSkedulo commented 11 months ago

Managed to try this out, with 1.3.0-RC1:

So all good as far as my initial problem is concerned.

One thing worth mentioning is the upgrade from kotlinx.coroutines from 1.6 -> 1.7 that 1.3.0-RC1 comes with did cause my tests to fail unexpectedly. I think this is because we use the Gradle test suite plugin which only adds the main classpath to the testRuntimeClasspath. Upgrading to 1.3.0-RC1 on the main source set (but not on my test source set as it doesn't use aedile) caused my test to use 1.6 coroutines to compile, but 1.7 when running.

This resulted in the runTest from the coroutines-test library to report tests had failed as there was "still an active scope after 0ms", when the default is 60000ms, just a little bit confusing!

I'm aware this isn't anything that's wrong with 1.3.0-RC1, just mentioning is more in case anyone else experiences this and reaches out.

sksamuel commented 11 months ago

Ok so I'll release 1.3.0 final then !