spring-projects / spring-framework

Spring Framework
https://spring.io/projects/spring-framework
Apache License 2.0
56.37k stars 38.04k forks source link

ServerHttpObservationFilter does not register against new async operations #33451

Closed hanson76 closed 1 month ago

hanson76 commented 1 month ago

Affects: Spring Framework 6.1.9 - current

The change introduced to ServerHttpObservationFilter in 6.1.9 causes requests that executes multiple asynchronous cycles to not get any observations and the observation will never be closed, the listener will not receive any callbacks.

The listener in ServerHttpObservationFilter need to add itself to the AsyncContext provided in the AsyncEvent provide when onStartAsync(AsyncEvent event) is being called. The listener will not receive any other callbacks unless this is done.

This happens if there is a servlet filter with lower priority in the filter chain that do asynchronous work (calls startAsync() on the request) before dispatching the request to further filters or Spring handlers, that in turn calls startAsync() on the request.

bclozel commented 1 month ago

This sounds very specific. We're going to need a minimal sample application to reproduce the case and properly test it.

hanson76 commented 1 month ago

We did some further investigation while creating a minimal Spring Boot demo application showing this error. We can reproduce it with Spring Boot 3.3.1 but not with Spring Boot 3.3.2 or 3.3.3 We have no other dependencies.

The difference is that ServerHttpObservationFilter.doFilterInternal is not called after dispatch() with Spring Boot 3.3.1, but is with 3.3.2 and 3.3.3. Not sure if this is the correct behavour yet,

The first call will create an observation that is never "stopped", the listener added here will not be called. The second call to doFilterInternal creates a new observation and the listener added is being called correctly.

Attached is a small zip file containing a very simple example. We verify it by calling the only endpoint exposed, and then calling actuator/metrics/http.server.requests afterwards.

time curl http://localhost:8080/ : should take ~2seconds (2 times 1 second async delays)

curl http://localhost:8080/actuator/metrics/http.server.requests|jq should output json document

The outcome when this works is that we get json data from the actuator endpoint. The outcome when this is not working is that we do not get any data from the actuator endpoint.

demo.zip

bclozel commented 1 month ago

Project seems empty with no controller endpoint nor custom filter. Am I missing something?

hanson76 commented 1 month ago

THe zip file indeed look to be broken! attached a new one.

The application.properties is bit different, ran it on a custom port 10000 locally not 8080

demo.zip

bclozel commented 1 month ago

Thanks for the update.

public class AsyncFilter extends OncePerRequestFilter {

  private final TaskScheduler scheduler;

  public AsyncFilter(TaskScheduler scheduler) {
    this.scheduler = scheduler;
  }

  @Override
  protected void doFilterInternal(HttpServletRequest request, HttpServletResponse response, FilterChain filterChain) {

    final AsyncContext asyncContext = request.startAsync(request, response);

    scheduler.schedule(() -> later(asyncContext), Instant.now().plusSeconds(1));
  }

  private void later(AsyncContext asyncContext) {
    asyncContext.dispatch();
  }
}

I know that coming up with a minimal sample can be hard, but here the Async filter looks invalid for several reasons:

  1. the doFilterInternal never calls the filter chain so filters are not executed or not in the order they should
  2. the scheduled work calls asyncContext.dispatch(); which... dispatches the exchange back to the container but the async context is never completed. Maybe explain what is the goal of this filter?
  3. it seems the filter is competing with the actual controller for the response. Trying to fix one of the problems listed above results in exceptions thrown by the Servlet container or an invalid HTTP response.
hanson76 commented 1 month ago

Our understanding is that this is the way you have to write filter classes where you want to do async work that has to happen before next filter in the chain. The actual filter that we have is doing a async / non-blocking HTTP request to another system, and taking action based on the result.

The "later" method actually does .complete() or dispatch() depending on the outcome of that request (in our real production code).

The OncePerRequestFilterwill do the filterChain.doFilter(request, response); call when dispatch happens, based on that the filter already has been called once already. That is why we do not need to call it from within the doFilterInternal method.

bclozel commented 1 month ago

Thanks, I understand better now.

I do see that our StandardServletAsyncWebRequest does not detect that async has already been started, as it should. Since we dispatched back into the container, I think this is expected.

I also see that adding a listener in the custom filter is also failing, because completion is never called:

final AsyncContext asyncContext = request.startAsync(request, response);
asyncContext.addListener(new AsyncListener() {

      @Override
      public void onComplete(AsyncEvent event) throws IOException {
        // this is never called
        System.out.println("AsyncFilter onComplete");
      }

      @Override
      public void onTimeout(AsyncEvent event) throws IOException {

      }

      @Override
      public void onError(AsyncEvent event) throws IOException {
        System.out.println("AsyncFilter onError");
      }

      @Override
      public void onStartAsync(AsyncEvent event) throws IOException {
        System.out.println("AsyncFilter onStartAsync");
      }
    });

This only happens when we have a Spring controller returning a CompletableFuture or a DeferredResult. If we call a synchronous controller, there is no issue with the custom async listener nor the observation filter.

I think that we might need to call asyncContext.complete() from the StandardServletAsyncWebRequest to properly call all registered async listeners.

I'm renaming this issue to reflect that, as I don't think there is a bug in the ServerHttpObservationFilter.

hanson76 commented 1 month ago

One thing that I do not understand, is the different behavior between Spring Boot 3.3.1 and 3.3.(2|3) . Changing 3.3.1 to 3.3.2 in the pom.xml file will change the outcome for some reason that I have not yet understood.

hanson76 commented 1 month ago

The documentation on AsyncListener onStartAsync(AsyncEvent event) differ if you look at the javadoc for the class in tomcat-embed-core that is used by Spring Boot and the documentation from the Jakarta EE.

The Jakarta documentation states:

This AsyncListener will not receive any events related to the new asynchronous cycle unless it registers itself (via a call to AsyncContext.addListener(jakarta.servlet.AsyncListener)) with the AsyncContext that is delivered as part of the given AsyncEvent.

That is why we think this was the problem. We do have couple of more filters we use in the filter chain that do add themself on calls to onStartAsync(AsyncEvent event) , they all work as expected.

hanson76 commented 1 month ago

Found out why there is a different behavior between Spring Boot 3.3.1 and 3.3.2, It is this change done in Spring Framework 6.1.11 that make it look like it works.

Spring Framework 6.1.10 causes the original observation created then the first execution of the ServerHttpObservationFilter to leak and not "stop" because the listener is removed in the second startAsync call and ServerHttpObservationFilter is not readding itself.

My guess is that the change above introduced in 6.1.11 tried to fix this, but the side effect is that there now is 2 Observations created in this demo app, the first one is lost and a new one is created on every filterChain pass during async handling. The last of them will have it's onComplete, onError or onTimeout method executed, while the other are leaked.

bclozel commented 1 month ago

Thanks for bearing with me.

I think this is due to the change we introduced in #32730 to avoid leaking observations when async requests are completed directly, without any dispatching back to the container.

the side effect is that there now is 2 Observations created in this demo app, the first one is lost and a new one is created on every filterChain pass during async handling.

This is inconsistent with what I'm seeing in your repro. The filter is only called once, and a single observation is created. Because there are several async starts in the same exchange, our observation async listener indeed needs to add itself to subsequent starts in onStartAsync(AsyncEvent event). I'm working on a fix for that for 6.1.13 and you should be able to test SNAPSHOTs as soon as it's in.

Is this consistent with your tests or am I missing something here?

hanson76 commented 1 month ago

Is this consistent with your tests or am I missing something here?

You are correct, I missed the call to createOrFetchObservation I did set a breakpoint on all on* methods of ObservationAsyncListener and noticed that the two different calls where for two different instances of ObservationAsyncListener for the same request.

It might be that using shouldNotFilterAsyncDispatch = false or re-adding the listener gives the same result in the end. Unless you have found another case that actually requires that the listener readds itself to the context when onStartAsync(AsyncEvent event) is called.

bclozel commented 1 month ago

@hanson76 I've pushed some changes and they should be available soon as 6.1.13-SNAPSHOT. Thanks a lot for your report and repro project.