quarkusio / quarkus

Quarkus: Supersonic Subatomic Java.
https://quarkus.io
Apache License 2.0
13.85k stars 2.7k forks source link

Since 3.10.1 RequestContext handling is broken on request cancle #40810

Closed HerrDerb closed 5 months ago

HerrDerb commented 6 months ago

Describe the bug

Issue was introduced with Quarkus REST - reuse CDI request context if it exists https://github.com/quarkusio/quarkus/issues/40307#issuecomment-2088016466

What I think can become problematic is how we handle the end of the request processing in Quarkus REST.

Well I think we found a constellation:

There is a Vertx RouteFilter and a ContainerRequestFilter which both access a @RequestScoped bean. The logic of the reproducer application executes an uninterruptable process which takes a while (in this case a db operation). The test provokes a request timeout hence a request cancle.

On a request cancle, the verx request context gets destroyed immediately but the uninterruptable code keeps executing and finally triggers a ContextNotActiveException.

Before 3.10.1 this wasn't an issue as the ContainerRequestFilter created a new request context, and therefor was not affected by a request cancle.

Expected behavior

On a request cancle the ActiveRequestContext gets destroyed while the request code is still running

Actual behavior

Wait for code completion before destroying the context

How to Reproduce?

Reproducer: https://github.com/HerrDerb/quarkus-issue/tree/context-issue Steps:

  1. Execute GreetingServiceTest

Output of uname -a or ver

No response

Output of java -version

No response

Quarkus version or git rev

No response

Build tool (ie. output of mvnw --version or gradlew --version)

No response

Additional information

No response

quarkus-bot[bot] commented 6 months ago

/cc @Ladicek (arc), @manovotn (arc), @mkouba (arc)

gsmet commented 6 months ago

The PR you're talking about wasn't merged but I think your problem might be due to https://github.com/quarkusio/quarkus/pull/40408 .

I will let the CDI experts chime in.

/cc @rsvoboda let's keep this one in mind to decide if we drop this patch from the backports.

HerrDerb commented 6 months ago

@gsmet Yes, sorry I've mixed it up, that is the one I was talking about.

manovotn commented 6 months ago

@HerrDerb Maybe I don't fully understand it but the use case seems a bit weird. My thoughts on it are below; feel free to express disagreement and/or clarify the use case.

On a request cancle, the verx request context gets destroyed immediately but the uninterruptable code keeps executing and finally triggers a ContextNotActiveException.

On a req. timeout, it seems sensible to destroy req. context as past that point we don't know when or if at all it will get destroyed. IIUIC, this behavior didn't change from past versions.

Before 3.10.1 this wasn't an issue as the ContainerRequestFilter created a new request context, and therefor was not affected by a request cancle.

And more importantly this part you wrote - even if the req. context was re-created, it was a fresh one and therefore any such beans lost their state in the meantime. That is IMO an indicator that your bean shouldn't be req. scoped to begin with. Maybe app scoped or dependent? That way you aren't linking it to lifecycle of a req. that you suspect might end prematurely.

mkouba commented 6 months ago

@gsmet When we talked about this issue I missed the "request timeout" part. I think that I agree with Matej that the request context should be destroyed after the timeout and that the ContextNotActiveException is valid and expected.

HerrDerb commented 6 months ago

Man, that is hard to describe 🙂 @manovotn @mkouba In this case, I think it is definitely not to expect, as the running request (that part that gets dispatched from Vert.x to the Jakarta framework) needs to be either successfully interrupted or finished first before the RequestContext can be destroyed by Vert.x.

I understand that right now, when Vert.x receives a request cancel or timeout, it destroys its RequestContext immediately without waiting for the async task to be completed or to be interrupted (I don't know exactly if it gets interrupted in this case at all, I could't exactly locate the specific code). This leads to a still running process in Jakarta with a destroyed RequestContext.

Before 3.10.1, the dispatched process had its own RequestContext, therefore was not affected by this and could finish in an expected manner.

So what I am suggesting is: While reusing the RequestContext is not the general issue, it should not be destroyed immediately on request cancel by Vert.x. In fact, it should only be destroyed after the dispatched Jakarta process has come to an end (either finished or interrupted).

With the change of 3.10.1 this leads to unexected behaviour.

Does this seem understandable?

joggeli34 commented 6 months ago

For me the interesting part is, that adding a simple RouteFilter to the project will lead to the described exception. When the filter is removed, no exception happens. This can be checked in the reproducer from @HerrDerb really easy by just commenting out the @RouterFilter line.

In my opinion, the behavior should be the same with and without an empty vertx filter and I tend to the plain behavior currently happening without the filter.

mkouba commented 6 months ago

In this case, I think it is definitely not to expect, as the running request (that part that gets dispatched from Vert.x to the Jakarta framework) needs to be either successfully interrupted or finished first before the RequestContext can be destroyed by Vert.x.

I'm not an expert in this area but AFAIK you cannot easily interrupt a running thread unless the code checks Thread.isInterrupted() and quits itself. And I believe that the Vert.x behavior is correct otherwise the timeout would be useless.

Before 3.10.1, the dispatched process had its own RequestContext, therefore was not affected by this and could finish in an expected manner.

Yes, and we identified this as an incorrect behavior :shrug:.

So what I am suggesting is: While reusing the RequestContext is not the general issue, it should not be destroyed immediately on request cancel by Vert.x. In fact, it should only be destroyed after the dispatched Jakarta process has come to an end (either finished or interrupted).

I tend to disagree but we should probably discuss this more broadly.

CC @geoand

Does this seem understandable?

I do understand your expectations but I think that your use case worked just by accident.

For me the interesting part is, that adding a simple RouteFilter to the project will lead to the described exception. When the filter is removed, no exception happens. This can be checked in the reproducer from @HerrDerb really easy by just commenting out the @RouterFilter line. In my opinion, the behavior should be the same with and without an empty vertx filter and I tend to the plain behavior currently happening without the filter.

It's interesting but not surprising. Because when a @RouterFilter is in place the CDI request context is bound to the Vert.x HTTP request (and is reused by the ContainerRequestFilter). If it's not present then it's bound to the ContainerRequestFilter (and created later in the chain).

geoand commented 6 months ago

I do understand your expectations but I think that your use case worked just by accident.

I agree with this

HerrDerb commented 6 months ago

With this reasoning, following thoughts occur to me:

manovotn commented 6 months ago

I tried debugging it some more to better see what's going on. I still think the change made in https://github.com/quarkusio/quarkus/pull/40408 makes sense as before we'd spin up a different req. context forcibly which potentially overrides any context state that already existed. IIRC, that's exactly the problem that we identified in https://github.com/quarkusio/quarkus/issues/40307 (CC @michalvavrik). However, the change also implies that whoever created that context is now responsible for its destruction which seems to be the culprit here.

I cannot distinct anymore about incorrect scope handling or expected occurrence.

I need to expect this exception every time I access a @RequestScoped bean. It turns it theoretically to a checked exception and I would need to wrap every call to a @RequestScoped bean with try catch.

You are expecting a @RequestScoped bean to work in a scenario where you know an HTTP req. will timeout. That's not a good use for request scoped bean to begin with. In the previous version, your use case run but does it actually work? Try to store some state in the bean and look at that from both filters and from the greetings service. You should see different states. Or just look at how many times you instantiate the RequestScopedBean within one request.

When not using a @RouterFilter and a request gets cancelled, the behaviour still seems incorrect as it will not produce a NoActiveContextException (see my reproducer). Currently, this is certainly an inconsistency we have.

I agree this does seem weird but I am not sure what we can do. It looks like context created for ContainerRequestFilter doesn't attempt to destroy it in case of timeout?

HerrDerb commented 6 months ago

However, the change also implies that whoever created that context is now responsible for its destruction which seems to be the culprit here.

Yes, and that the vert.x owned request context does ignore depending processes which are still running. (by ignoring I mean it does not wait nor cancel any depending running processes)

Well I don't really expect a timeout. The reproducer is just build like this to provoke the issue. A http request can be cancelled out of different reasons: timeout by the client, closing web site, cancle request with angular switchMap etc, etc. All this leads to the described issue. As backend, I expect the framework to handle stuff like this for me gracefully. This means, whenever I am in a still running process of a request I do expect to be able to access a @RequestScoped bean.

To clarify our productive use case: We receive a http request that runs through a @RouteFilter then a ContainerRequestFilter and runs a jpa request to the database to return a result. The mapping of the result requires knowledge of the current tenant context (request scoped bean). So it's rather basic. We do not expect this to throw an exception if the request gets cancelled by the client.

mschorsch commented 5 months ago

I am not sure if it is the same problem but we also get a ContextNotActiveException since Quarkus 3.10. Quarkus 3.9.5 works as expected.

Since Quarkus 3.10 we get the following exception (Note: The second error is a follow-up error of the ContextNotActiveException):

2024-06-01 15:59:51,328 ERROR [com.example.fre.ser.ExceptionMappers] (vert.x-eventloop-thread-3) Es ist ein unbehandelter Fehler aufgetreten.: jakarta.enterprise.context.ContextNotActiveException: RequestScoped context was not active when trying to obtain a bean instance for a client proxy of CLASS bean [class=io.quarkus.vertx.http.runtime.CurrentVertxRequest, id=0_6n6EmChCiiDdd8HelptG_A0AE]
    - you can activate the request context for a specific method using the @ActivateRequestContext interceptor binding
    at io.quarkus.arc.impl.ClientProxies.notActive(ClientProxies.java:70)
    at io.quarkus.arc.impl.ClientProxies.getSingleContextDelegate(ClientProxies.java:30)
    at io.quarkus.vertx.http.runtime.CurrentVertxRequest_ClientProxy.arc$delegate(Unknown Source)
    at io.quarkus.vertx.http.runtime.CurrentVertxRequest_ClientProxy.setCurrent(Unknown Source)
    at io.quarkus.resteasy.reactive.server.runtime.QuarkusCurrentRequest.set(QuarkusCurrentRequest.java:33)
    at org.jboss.resteasy.reactive.server.core.CurrentRequestManager.set(CurrentRequestManager.java:12)
    at org.jboss.resteasy.reactive.server.core.ResteasyReactiveRequestContext.handleRequestScopeActivation(ResteasyReactiveRequestContext.java:654)
    at io.quarkus.resteasy.reactive.server.runtime.QuarkusResteasyReactiveRequestContext.handleRequestScopeActivation(QuarkusResteasyReactiveRequestContext.java:39)
    at org.jboss.resteasy.reactive.common.core.AbstractResteasyReactiveContext.requireCDIRequestScope(AbstractResteasyReactiveContext.java:264)
    at org.jboss.resteasy.reactive.server.handlers.ResponseWriterHandler.handle(ResponseWriterHandler.java:27)
    at io.quarkus.resteasy.reactive.server.runtime.QuarkusResteasyReactiveRequestContext.invokeHandler(QuarkusResteasyReactiveRequestContext.java:147)
    at org.jboss.resteasy.reactive.common.core.AbstractResteasyReactiveContext.run(AbstractResteasyReactiveContext.java:147)
    at org.jboss.resteasy.reactive.server.vertx.VertxResteasyReactiveRequestContext$1$1.handle(VertxResteasyReactiveRequestContext.java:78)
    at org.jboss.resteasy.reactive.server.vertx.VertxResteasyReactiveRequestContext$1$1.handle(VertxResteasyReactiveRequestContext.java:75)
    at io.vertx.core.impl.ContextInternal.dispatch(ContextInternal.java:279)
    at io.vertx.core.impl.ContextInternal.dispatch(ContextInternal.java:261)
    at io.vertx.core.impl.ContextInternal.lambda$runOnContext$0(ContextInternal.java:59)
    at io.netty.util.concurrent.AbstractEventExecutor.runTask(AbstractEventExecutor.java:173)
    at io.netty.util.concurrent.AbstractEventExecutor.safeExecute(AbstractEventExecutor.java:166)
    at io.netty.util.concurrent.SingleThreadEventExecutor.runAllTasks(SingleThreadEventExecutor.java:470)
    at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:569)
    at io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:997)
    at io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74)
    at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
    at java.base/java.lang.Thread.run(Thread.java:1583)

2024-06-01 15:59:51,338 ERROR [io.qua.ver.htt.run.QuarkusErrorHandler] (vert.x-eventloop-thread-3) HTTP Request to /api/karte/1092/netz_10_537_350.mvt failed, error id: e00d78cb-73c0-4f58-a8af-f23c5c41831a-1: java.lang.ClassCastException: class com.example.app.api.error.ApiUnexpectedError cannot be cast to class [B (com.example.app.api.error.ApiUnexpectedError is in unnamed module of loader io.quarkus.bootstrap.runner.RunnerClassLoader @3581c5f3; [B is in module java.base of loader 'bootstrap')
    at org.jboss.resteasy.reactive.server.providers.serialisers.ServerByteArrayMessageBodyHandler.writeResponse(ServerByteArrayMessageBodyHandler.java:16)
    at org.jboss.resteasy.reactive.server.core.ServerSerialisers.invokeWriter(ServerSerialisers.java:217)
    at org.jboss.resteasy.reactive.server.core.ServerSerialisers.invokeWriter(ServerSerialisers.java:185)
    at org.jboss.resteasy.reactive.server.core.serialization.FixedEntityWriter.write(FixedEntityWriter.java:28)
    at org.jboss.resteasy.reactive.server.handlers.ResponseWriterHandler.handle(ResponseWriterHandler.java:34)
    at io.quarkus.resteasy.reactive.server.runtime.QuarkusResteasyReactiveRequestContext.invokeHandler(QuarkusResteasyReactiveRequestContext.java:147)
    at org.jboss.resteasy.reactive.common.core.AbstractResteasyReactiveContext.run(AbstractResteasyReactiveContext.java:147)
    at org.jboss.resteasy.reactive.server.vertx.VertxResteasyReactiveRequestContext$1$1.handle(VertxResteasyReactiveRequestContext.java:78)
    at org.jboss.resteasy.reactive.server.vertx.VertxResteasyReactiveRequestContext$1$1.handle(VertxResteasyReactiveRequestContext.java:75)
    at io.vertx.core.impl.ContextInternal.dispatch(ContextInternal.java:279)
    at io.vertx.core.impl.ContextInternal.dispatch(ContextInternal.java:261)
    at io.vertx.core.impl.ContextInternal.lambda$runOnContext$0(ContextInternal.java:59)
    at io.netty.util.concurrent.AbstractEventExecutor.runTask(AbstractEventExecutor.java:173)
    at io.netty.util.concurrent.AbstractEventExecutor.safeExecute(AbstractEventExecutor.java:166)
    at io.netty.util.concurrent.SingleThreadEventExecutor.runAllTasks(SingleThreadEventExecutor.java:470)
    at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:569)
    at io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:997)
    at io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74)
    at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
    at java.base/java.lang.Thread.run(Thread.java:1583)

We do not explicity use RequestScoped, RouteFilter or ContainerRequestFilter.

I am pretty sure that there is a bug in Quarkus because the endpoint is quite simple. The endpoint only generates vector tiles for display on a map, it does not access a database or an external service, but there are a lot of requests.

I have not yet managed to create a reproducer.

HerrDerb commented 5 months ago

@mschorsch can you specify the exact version you are using? The issue discussed here requires exactly 3.10.1.

mkouba commented 5 months ago

With this reasoning, following thoughts occur to me:

When not using a @RouterFilter and a request gets cancelled, the behaviour still seems incorrect as it will not produce a NoActiveContextException (see my reproducer). Currently, this is certainly an inconsistency we have.

Well, it's not that surprising since you mix multiple technologies here. The @RouterFilter comes from the reactive-routes extension and ContainerRequestFilter comes from quarkus-rest. These technologies have a slightly different lifecycle.

Now if you really need a "plain" filter that controls the CDI request context activation, you can use the io.quarkus.vertx.http.runtime.filters.Filters class that comes from the quarkus-vertx-http extension, i.e. something like:

public void init(@Observes Filters filters, RequestScopedBean requestScopedBean, RequestContextController controller) {
     filters.register(rc -> {
         try {
            controller.activate();
            requestScopedBean.doSomething();
         } finally {
            controller.deactivate();
         }
         rc.next();
     }, 10); }
 }

But obviously you'll have two separate request contexts activated during a specific HTTP request. Which is IMO not a good idea.

I cannot distinct anymore about incorrect scope handling or expected occurrence.

I need to expect this exception every time I access a @RequestScoped bean. It turns it theoretically to a checked exception and I would need to wrap every call to a @RequestScoped bean with try catch.

HerrDerb commented 5 months ago

As I understand quarkus-rest is built on quarkus-vertx-http, so I would have expected they work together.

A odd discovery we just made with your input by not mixing RouteFilter and ContainerRequestFilter: We deleted and/or modified the ContainerRequestFilter from my reproducer. There were indeed different outcomes. Oddly enough, context reuse only happend in one case, which is when a VertX @RouteFilter and a ContainterRequestFilter with @PreMatching as available. Which is our case.

Here are the logs of the different scenarios generated with my reproducer, feel free to try yourself. The uuid of the RequestScopedBean is per instance, hinting thecurrent managed context.

As it is, with VertX RouteFilter and ContainerRequestFilter(witht @PreMatching): Context gets reused -> Test failes Slightly race condition biased, maybe give it another go if it does not fail on the first run.

2024-06-03 14:52:51,007 INFO  [org.acm.VertxTenantRequestFilter] (vert.x-eventloop-thread-0) Invoking VertxTenantRequestFilter
2024-06-03 14:52:51,007 INFO  [org.acm.RequestScopedBean] (vert.x-eventloop-thread-0) Doing something in RequestScopedBean 5e416106-a282-4579-8aee-fe1dbed33955
2024-06-03 14:52:51,011 INFO  [org.acm.RequestContextFilter] (vert.x-eventloop-thread-0) ContainerRequestFilter is invoked
2024-06-03 14:52:51,011 INFO  [org.acm.RequestScopedBean] (vert.x-eventloop-thread-0) Doing something in RequestScopedBean 5e416106-a282-4579-8aee-fe1dbed33955
2024-06-03 14:52:51,014 INFO  [org.acm.GreetingService] (executor-thread-1) Entering GreetingService.getGreeting
2024-06-03 14:52:51,015 INFO  [org.acm.RequestScopedBean] (executor-thread-1) Doing something in RequestScopedBean 5e416106-a282-4579-8aee-fe1dbed33955
2024-06-03 14:52:51,066 INFO  [org.acm.GreetingServiceTest] (main) Request timed out and therefor canceled
2024-06-03 14:52:51,076 INFO  [org.acm.GreetingService] (executor-thread-1) Finished waiting in GreetingService.getGreeting
2024-06-03 14:52:51,079 ERROR [io.qua.ver.htt.run.QuarkusErrorHandler] (executor-thread-1) HTTP Request to /api/greeting failed, error id: 52f66b6f-c4bd-4bb6-b1c7-8fcfefbccc25-1: jakarta.enterprise.context.ContextNotActiveException: RequestScoped context was not active when trying to obtain a bean instance for a client proxy of CLASS bean [class=org.acme.RequestScopedBean, id=Ce8P_Ak6Lxcg_dzCBwL19qf0jzA]
    - you can activate the request context for a specific method using the @ActivateRequestContext interceptor binding

With VertX RouteFilter and ContainerRequestFilter(without @PreMatching): Context is not reused -> Test succeeds

2024-06-03 14:53:42,560 INFO  [org.acm.VertxTenantRequestFilter] (vert.x-eventloop-thread-1) Invoking VertxTenantRequestFilter
2024-06-03 14:53:42,560 INFO  [org.acm.RequestScopedBean] (vert.x-eventloop-thread-1) Doing something in RequestScopedBean 3638b720-08c8-4d6c-a75a-6b26fe2af15f
2024-06-03 14:53:42,565 INFO  [org.acm.RequestContextFilter] (executor-thread-1) ContainerRequestFilter is invoked
2024-06-03 14:53:42,565 INFO  [org.acm.RequestScopedBean] (executor-thread-1) Doing something in RequestScopedBean cd097303-f6c6-449c-955b-94012124ec1d
2024-06-03 14:53:42,568 INFO  [org.acm.GreetingService] (executor-thread-1) Entering GreetingService.getGreeting
2024-06-03 14:53:42,568 INFO  [org.acm.RequestScopedBean] (executor-thread-1) Doing something in RequestScopedBean cd097303-f6c6-449c-955b-94012124ec1d
2024-06-03 14:53:42,619 INFO  [org.acm.GreetingServiceTest] (main) Request timed out and therefor canceled
2024-06-03 14:53:42,646 INFO  [org.acm.GreetingService] (executor-thread-1) Finished waiting in GreetingService.getGreeting
2024-06-03 14:53:42,646 INFO  [org.acm.RequestScopedBean] (executor-thread-1) Doing something in RequestScopedBean cd097303-f6c6-449c-955b-94012124ec1d

Only With VertX RouteFilter: Context is not reused -> Test succeeds

2024-06-03 14:54:40,762 INFO  [org.acm.VertxTenantRequestFilter] (vert.x-eventloop-thread-1) Invoking VertxTenantRequestFilter
2024-06-03 14:54:40,762 INFO  [org.acm.RequestScopedBean] (vert.x-eventloop-thread-1) Doing something in RequestScopedBean cfa7bf55-0487-491f-a977-2f40051ed839
2024-06-03 14:54:40,769 INFO  [org.acm.GreetingService] (executor-thread-1) Entering GreetingService.getGreeting
2024-06-03 14:54:40,769 INFO  [org.acm.RequestScopedBean] (executor-thread-1) Doing something in RequestScopedBean a1a316ae-d783-4e5c-ac38-966a2d2a26e9
2024-06-03 14:54:40,811 INFO  [org.acm.GreetingServiceTest] (main) Request timed out and therefor canceled
2024-06-03 14:54:40,846 INFO  [org.acm.GreetingService] (executor-thread-1) Finished waiting in GreetingService.getGreeting
2024-06-03 14:54:40,846 INFO  [org.acm.RequestScopedBean] (executor-thread-1) Doing something in RequestScopedBean a1a316ae-d783-4e5c-ac38-966a2d2a26e9
mkouba commented 5 months ago

As I understand quarkus-rest is built on quarkus-vertx-http, so I would have expected they work together.

Keep in mind that @RouteFilter comes from quarkus-reactive-routes and not from quarkus-vertx-http.

mschorsch commented 5 months ago

@mschorsch can you specify the exact version you are using? The issue discussed here requires exactly 3.10.1.

Quarkus 3.11 but after reading the new comments i am pretty sure that my issue is different from this one. I will open a new Issue when i have a reproducer.

HerrDerb commented 5 months ago

@mkouba I wasn't aware that combining those two is an issue, thanks. With a plain VertX filter it works as expected and it is also one dependency less. Thanks a lot for your effort👍🙂

Final thoughts: While this solves the issue for us, an inconsistency or/and confusion remains regarding the usage of @RouteFilter and ContainerRequestFilter. Currently, there's no documentation or indication suggesting why @RouteFilter and ContainerRequestFilter should not be used in combination, which seems somewhat counterintuitive. According to the available documentation, it appears that @RouteFilter can be used for this purpose. However, the system does not function as expected nor consistent when both are used and does not give you any hint. This leads to confusion or unexpected behaviour and might point to a general issue handling contexts between quarkus-rest and VertX. Obviously it is your choice to investigate further or not.