quarkusio / quarkus

Quarkus: Supersonic Subatomic Java.
https://quarkus.io
Apache License 2.0
13.36k stars 2.56k forks source link

`ContextNotActiveException` in `SecurityIdentityAugmentor` since Quarkus 3.10 #41081

Closed mschorsch closed 1 day ago

mschorsch commented 3 weeks ago

Describe the bug

Since Quarkus 3.10 we get a ContextNotActiveException in our own SecurityIdentityAugmentor.

2024-06-01 15:59:51,328 ERROR [com.example.fre.ser.ExceptionMappers] (vert.x-eventloop-thread-3) Es ist ein unbehandelter Fehler aufgetreten.: jakarta.enterprise.context.ContextNotActiveException: RequestScoped context was not active when trying to obtain a bean instance for a client proxy of CLASS bean [class=io.quarkus.vertx.http.runtime.CurrentVertxRequest, id=0_6n6EmChCiiDdd8HelptG_A0AE]
    - you can activate the request context for a specific method using the @ActivateRequestContext interceptor binding
    at io.quarkus.arc.impl.ClientProxies.notActive(ClientProxies.java:70)
    at io.quarkus.arc.impl.ClientProxies.getSingleContextDelegate(ClientProxies.java:30)
    at io.quarkus.vertx.http.runtime.CurrentVertxRequest_ClientProxy.arc$delegate(Unknown Source)
    at io.quarkus.vertx.http.runtime.CurrentVertxRequest_ClientProxy.setCurrent(Unknown Source)
    at io.quarkus.resteasy.reactive.server.runtime.QuarkusCurrentRequest.set(QuarkusCurrentRequest.java:33)
    at org.jboss.resteasy.reactive.server.core.CurrentRequestManager.set(CurrentRequestManager.java:12)
    at org.jboss.resteasy.reactive.server.core.ResteasyReactiveRequestContext.handleRequestScopeActivation(ResteasyReactiveRequestContext.java:654)
    at io.quarkus.resteasy.reactive.server.runtime.QuarkusResteasyReactiveRequestContext.handleRequestScopeActivation(QuarkusResteasyReactiveRequestContext.java:39)
    at org.jboss.resteasy.reactive.common.core.AbstractResteasyReactiveContext.requireCDIRequestScope(AbstractResteasyReactiveContext.java:264)
    at org.jboss.resteasy.reactive.server.handlers.ResponseWriterHandler.handle(ResponseWriterHandler.java:27)
    at io.quarkus.resteasy.reactive.server.runtime.QuarkusResteasyReactiveRequestContext.invokeHandler(QuarkusResteasyReactiveRequestContext.java:147)
    at org.jboss.resteasy.reactive.common.core.AbstractResteasyReactiveContext.run(AbstractResteasyReactiveContext.java:147)
    at org.jboss.resteasy.reactive.server.vertx.VertxResteasyReactiveRequestContext$1$1.handle(VertxResteasyReactiveRequestContext.java:78)
    at org.jboss.resteasy.reactive.server.vertx.VertxResteasyReactiveRequestContext$1$1.handle(VertxResteasyReactiveRequestContext.java:75)
    at io.vertx.core.impl.ContextInternal.dispatch(ContextInternal.java:279)
    at io.vertx.core.impl.ContextInternal.dispatch(ContextInternal.java:261)
    at io.vertx.core.impl.ContextInternal.lambda$runOnContext$0(ContextInternal.java:59)
    at io.netty.util.concurrent.AbstractEventExecutor.runTask(AbstractEventExecutor.java:173)
    at io.netty.util.concurrent.AbstractEventExecutor.safeExecute(AbstractEventExecutor.java:166)
    at io.netty.util.concurrent.SingleThreadEventExecutor.runAllTasks(SingleThreadEventExecutor.java:470)
    at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:569)
    at io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:997)
    at io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74)
    at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
    at java.base/java.lang.Thread.run(Thread.java:1583)

We have not yet been able to create a reproducer, but we hope that a Quarkus expert can help with the following sample code. We were able to narrow down the problem, the exception always occurs when the cache is updated.

import com.github.benmanes.caffeine.cache.Caffeine
import com.github.benmanes.caffeine.cache.LoadingCache
import io.quarkus.oidc.client.filter.OidcClientFilter
import io.quarkus.security.identity.AuthenticationRequestContext
import io.quarkus.security.identity.SecurityIdentity
import io.quarkus.security.identity.SecurityIdentityAugmentor
import io.quarkus.security.runtime.QuarkusSecurityIdentity
import io.smallrye.jwt.auth.principal.JWTCallerPrincipal
import io.smallrye.mutiny.Uni
import jakarta.enterprise.context.ApplicationScoped
import jakarta.inject.Inject
import jakarta.ws.rs.GET
import jakarta.ws.rs.Path
import jakarta.ws.rs.PathParam
import jakarta.ws.rs.Produces
import jakarta.ws.rs.core.MediaType
import org.eclipse.microprofile.jwt.Claims
import org.eclipse.microprofile.rest.client.inject.RegisterRestClient
import org.eclipse.microprofile.rest.client.inject.RestClient
import java.time.Duration

data class ExampleRolesResponse(val roles: List<String>)

@RegisterRestClient
@OidcClientFilter
@ApplicationScoped
interface ExampleRestClient {

    @GET
    @Produces(MediaType.APPLICATION_JSON)
    @Path("auth/{email}")
    fun getRoles(@PathParam("email") email: String): Uni<ExampleRolesResponse>
}

@ApplicationScoped
class ExampleAuthorizationProvider @Inject constructor(
    @RestClient private val client: ExampleRestClient,
) {

    private val cache: LoadingCache<String, Uni<ExampleRolesResponse>> = Caffeine.newBuilder()
        .expireAfterWrite(Duration.ofMinutes(1L))
        .build { email -> client.getRoles(email).memoize().indefinitely() } // the exception is thrown here

    fun getRoles(email: String): Uni<ExampleRolesResponse> {
        return cache.get(email)
    }
}

@ApplicationScoped
class ExampleSecurityIdentityAugmentor @Inject constructor(
    private val authProvider: ExampleAuthorizationProvider,
) : SecurityIdentityAugmentor {

    override fun augment(identity: SecurityIdentity, context: AuthenticationRequestContext): Uni<SecurityIdentity> {
        val email = identity.email()
        if (identity.isAnonymous || email == null) {
            return Uni.createFrom().item(identity)
        }

        return authProvider.getRoles(email)
            .map { response -> identity.createFrom(response) }
    }

    private fun SecurityIdentity.email(): String? {
        return (principal as? JWTCallerPrincipal)
            ?.getClaim<String>(Claims.email)
            ?.lowercase()
    }

    private fun SecurityIdentity.createFrom(response: ExampleRolesResponse): SecurityIdentity {
        val newAttributes = this.attributes + ("example.response" to response)

        return QuarkusSecurityIdentity.builder()
            .addAttributes(newAttributes)
            .addCredentials(this.credentials)
            .addRoles(response.roles.toSet())
            .addPermissionChecker { permission -> this.checkPermission(permission) } // sustain previous permission checks
            .setPrincipal(this.principal)
            .setAnonymous(this.isAnonymous)
            .build()
    }

    override fun priority(): Int = 1000
}

Expected behavior

No response

Actual behavior

No response

How to Reproduce?

No response

Output of uname -a or ver

Linux

Output of java -version

Java 21

Quarkus version or git rev

Quarkus 3.11.1

Build tool (ie. output of mvnw --version or gradlew --version)

No response

Additional information

No response

quarkus-bot[bot] commented 3 weeks ago

/cc @geoand (kotlin), @sberyozkin (security)

geoand commented 3 weeks ago

Hi,

By just eyeballing the code it's extremely hard to spot what's going on. If you do come up with a reproducer, that would be immensely helpful.

sberyozkin commented 3 weeks ago

@mschorsch It does look like that accessing cache now requires a request context, which is not available in SecurityIdentityAugmentor, which is not really a bug.

Can you try @ActivateRequestContext, as shown in https://quarkus.io/guides/security-customization#security-identity-customization, 3rd example there ?

mschorsch commented 3 weeks ago

Hi @sberyozkin , thanks for your reply.

The exception is only thrown on the second access of the cache, i.e. when the cache is updated.

Can you try @ActivateRequestContext, as shown in https://quarkus.io/guides/security-customization#security-identity-customization, 3rd example there ?

Yes sure, thanks for the hint. Can you give me a hint which instance should be injected? In the example the SecurityIdentity is mentioned, what would I have to change in my case? Which method should I set @ActivateRequestContext to?

sberyozkin commented 3 weeks ago

Hi @mschorsch Refactor the code above as shown in the example, have security augmentation done in a supplier, etc

mschorsch commented 3 weeks ago

@sberyozkin Thank you for your help. I understand the documentation and what it is supposed to achieve. However, I don't understand how this would help in my use case, nor how to implement it. Contrary to my intuition, I still tried to adapt the example in the documentation to my use case, but unfortunately this did not help.

Fortunately, I managed to create a simple reproducer (cc @geoand). I hope this helps with the analysis. Thanks again.

Steps to reproduce

  1. Download reproducer.zip
  2. ./mvnw clean install
  3. ./mvnw quarkus:dev
  4. Open a Browser http://localhost:8080/hello
  5. Refresh the page permanently (F5)
  6. seq 1 200 | xargs -n1 -P10 curl "http://localhost:8080/hello"
geoand commented 3 weeks ago

I could not reproduce this on my machine, but I have a suggestion:

Try using the programmatic API Quarkus provides for accessing the cache instead of using Caffeine directly. The former should properly handle the context switches which the latter knows nothing about.

mschorsch commented 3 weeks ago

@geoand You are right, I had only tested under Windows. The case could not be reproduced under Linux. For this reason, I have adjusted the ‘Steps to reproduce’. With the change, the case can be reproduced without any problems.

Can you please test it again?

geoand commented 3 weeks ago

I tried and still could not get it to fail. Regardless however, please take a look at my suggestion above.

mschorsch commented 3 weeks ago

@geoand I tried your suggestion with the following change in my reproducer an unfortunaly it doesnt't change anything, i still get the Exception:

@ApplicationScoped
class MySecurityIdentityAugmentor @Inject constructor(
    @RestClient private val exampleRestClient: ExampleRestClient,
) : SecurityIdentityAugmentor {

    @Inject
    @CacheName("my-cache")
    lateinit var cache: Cache

    @Inject
    lateinit var identitySupplierInstance: Instance<SecurityIdentitySupplier>

    override fun augment(identity: SecurityIdentity, context: AuthenticationRequestContext): Uni<SecurityIdentity> {
        val identitySupplier = identitySupplierInstance.get()
        identitySupplier.setIdentity(identity)

        return getNonBlockingExpensiveValue("test").map { _ ->
            identitySupplier.get()
        }
    }

    private fun getNonBlockingExpensiveValue(key: String): Uni<String> {
        return cache.getAsync(key) { _ ->
            exampleRestClient.getHtml()
        }
    }
}

@Dependent
class SecurityIdentitySupplier : Supplier<SecurityIdentity> {

    private var identity: SecurityIdentity? = null

    @ActivateRequestContext
    override fun get(): SecurityIdentity {
        return QuarkusSecurityIdentity.builder(identity)
            .addRole("user")
            .build()
    }

    fun setIdentity(identity: SecurityIdentity) {
        this.identity = identity
    }
}
# application.yaml
quarkus.rest-client."example-client".url=https://www.google.com
quarkus.cache.caffeine."my-cache".expire-after-write=PT0.1S

With the reproducer described in https://github.com/quarkusio/quarkus/issues/41081#issuecomment-2161248917 i can reliable reproduce the problem.

geoand commented 3 weeks ago

Thanks for checking.

I'll have a look tomorrow.

mschorsch commented 3 weeks ago

@geoand Thanks

I've updated the reproducer:

  1. Download reproducer2.zip
  2. ./mvnw clean install
  3. ./mvnw quarkus:dev
  4. seq 1 200 | xargs -n1 -P10 -i curl "http://localhost:8080/hello"
geoand commented 3 weeks ago

I was able to trigger the issue (some of the time) using the steps above.

geoand commented 3 weeks ago

It seems like the problem starts occuring in 3.10.1 - 3.10.0 works correctly.

That means we need to do a git bisect between those two version and figure out which commit introduces the problem. Unfortunately I won't have time for this for at least the next 10 days, so if someone wants to continue the investigation, they are are welcome to.

P.S. I made one small change to the reproducer:

@ApplicationScoped
class MySecurityIdentityAugmentor(
    @RestClient private val exampleRestClient: ExampleRestClient,
) : SecurityIdentityAugmentor {

    @Inject
    @CacheName("my-cache")
    lateinit var cache: Cache

    @Inject
    lateinit var identitySupplierInstance: Instance<SecurityIdentitySupplier>

//    private val cache: LoadingCache<String, Uni<String>> = Caffeine.newBuilder()
//        .expireAfterWrite(Duration.ofMillis(100L))
//        .build { _ ->
//            exampleRestClient.getHtml().memoize().indefinitely()
//        }

    override fun augment(identity: SecurityIdentity, context: AuthenticationRequestContext): Uni<SecurityIdentity> {
        val identitySupplier = identitySupplierInstance.get()
        identitySupplier.setIdentity(identity)

        return getNonBlockingExpensiveValue("test").map { _ ->
            identitySupplier.get()
        }
    }

    private fun getNonBlockingExpensiveValue(key: String): Uni<String> {
        return cache.getAsync(key) { _ ->
            exampleRestClient.getHtml()
        }
        // return cache.get(key)
    }
}

@Dependent
class SecurityIdentitySupplier : Supplier<SecurityIdentity> {

    private var identity: SecurityIdentity? = null

//    @ActivateRequestContext
    override fun get(): SecurityIdentity {
        return QuarkusSecurityIdentity.builder(identity)
            .addRole("user")
            .build()
    }

    fun setIdentity(identity: SecurityIdentity) {
        this.identity = identity
    }
}
mschorsch commented 3 weeks ago

Thanks @geoand for checking.

This issue is a blocker for us and prevents us from upgrading to quarkus > 3.10.0. It would be great If someone could take a look at this.

mschorsch commented 3 weeks ago

Inspired from https://github.com/quarkusio/quarkus/blob/59bad99a294b35e00d7e115e38bf67c4509f60de/extensions/cache/runtime/src/main/java/io/quarkus/cache/runtime/CacheResultInterceptor.java#L79-L121 i've implemented the following solution:

import io.quarkus.cache.Cache
import io.quarkus.cache.CacheName
import io.quarkus.security.identity.AuthenticationRequestContext
import io.quarkus.security.identity.SecurityIdentity
import io.quarkus.security.identity.SecurityIdentityAugmentor
import io.quarkus.security.runtime.QuarkusSecurityIdentity
import io.smallrye.mutiny.Uni
import io.vertx.core.Context
import io.vertx.core.Vertx
import io.vertx.core.impl.ContextInternal
import jakarta.enterprise.context.ApplicationScoped
import jakarta.enterprise.context.Dependent
import jakarta.inject.Inject
import org.eclipse.microprofile.rest.client.inject.RestClient

@ApplicationScoped
internal class MySecurityIdentityAugmentor @Inject constructor(
    private val examexampleService: ExampleService,
) : SecurityIdentityAugmentor {

    override fun augment(identity: SecurityIdentity, context: AuthenticationRequestContext): Uni<SecurityIdentity> {
        return examexampleService.getHtml("test").map { _ ->
            QuarkusSecurityIdentity.builder(identity)
                .addRole("user")
                .build()
        }
    }
}

@Dependent
internal class ExampleService(
    @RestClient private val exampleRestClient: ExampleRestClient,
    @CacheName("my-cache") private val cache: Cache,
) {

    fun getHtml(key: String): Uni<String> {
        val context: Context? = Vertx.currentContext()
        return cache.getAsync(key) { exampleRestClient.getHtml() }
            // Stolen from https://github.com/quarkusio/quarkus/blob/59bad99a294b35e00d7e115e38bf67c4509f60de/extensions/cache/runtime/src/main/java/io/quarkus/cache/runtime/CacheResultInterceptor.java#L79-L121
            .emitOn { command ->
                // We need make sure we go back to the original context when the cache value is computed.
                // Otherwise, we would always emit on the context having computed the value, which could
                // break the duplicated context isolation.

                val ctx: Context? = Vertx.currentContext()
                if (context == null) {
                    // We didn't capture a context
                    if (ctx == null) {
                        // We are not on a context => we can execute immediately.
                        command.run()
                    } else {
                        // We are on a context.
                        // We cannot continue on the current context as we may share a duplicated context.
                        // We need a new one. Note that duplicate() does not duplicate the duplicated context,
                        // but the root context.
                        (ctx as ContextInternal).duplicate().runOnContext { command.run() }
                    }
                } else {
                    // We captured a context.
                    if (ctx == context) {
                        // We are on the same context => we can execute immediately
                        command.run()
                    } else {
                        // 1) We are not on a context (ctx == null) => we need to switch to the captured context.
                        // 2) We are on a different context (ctx != null) => we need to switch to the captured context.
                        context.runOnContext { command.run() }
                    }
                }
            }
    }
}

The declarative approach with @CacheResult (https://quarkus.io/guides/cache#annotations-api) works too.

mschorsch commented 3 weeks ago

Don't know anymore if this is a bug not it is not easy to understand.

geoand commented 3 weeks ago

Thanks for looking into it!

I still think it's worth finding out which commits between 3.10.0 and 3.10.1 causes the original problem

sberyozkin commented 3 weeks ago

Thanks @geoand @mschorsch. @mschorsch, I don't really know much about how Cache works, I browsed through https://github.com/quarkusio/quarkus/commits/3.10.1/, I think if something does impact it, it should be somewhere here:

https://github.com/quarkusio/quarkus/commit/b96e9cdcd17f320c3249bdddc425e7f2229c3340.

It may be nothing to do with Cache, I wonder if an update to ArcThreadsetupAction might be related. Let me CC CDI experts, @mkouba, @manovotn

mschorsch commented 2 weeks ago

The problem is that the quarkus cache with annotations behaves differently than the programmatic approach. Using the annotation based approach restores the context correctly while in the case of the programmatic approach the user (developer) has to take care of it. Ideally, both approaches behave in the same way.

Before quarkus 3.10.1 both approaches showed the same behavior.

geoand commented 2 weeks ago

Yes, we should definitely fix the behavior of the programmatic approach

geoand commented 2 weeks ago

@gwenneg I think we should move the handling of the context from the interceptor to the cache itself - WDYT?

manovotn commented 2 weeks ago

That means we need to do a git bisect between those two version and figure out which commit introduces the problem. Unfortunately I won't have time for this for at least the next 10 days, so if someone wants to continue the investigation, they are are welcome to.

@geoand This is git bisect result of the two versions.

af878781098307b2116938332e11e8c6d1d125b6 is the first bad commit
commit af878781098307b2116938332e11e8c6d1d125b6
Author: Matej Novotny <manovotn@redhat.com>
Date:   Mon Apr 29 18:52:14 2024 +0200

    Quarkus REST - reuse CDI request context if it exists

While it pinpoints the commit, it doesn't really tell me much about the underlying cause as I am unfamiliar with how these related extensions juggle with contexts :-/

geoand commented 2 weeks ago

Thanks @manovotn !

It might just have surfaced the existing problem.

In any case, the proper course of action seems to be what I proposed above to @gwenneg.

geoand commented 1 week ago

@gwenneg can you comment on this?

Thanks

gwenneg commented 4 days ago

Sorry, I only noticed this now. I'm reading all comments.

gwenneg commented 4 days ago

IIUC, we're talking about moving from CacheResultInterceptor to CaffeineCacheImpl the logic that determines which Vert.x context is used to compute a cache value. That would make sense to me, although I'm far from being an expert of the Vert.x context.

geoand commented 3 days ago

Yes, that's the idea