spring-projects / spring-framework

Spring Framework
https://spring.io/projects/spring-framework
Apache License 2.0
55.18k stars 37.57k forks source link

Performance degradation in Web layer after upgrade to Spring 6 #31098

Closed alexey-anufriev closed 3 months ago

alexey-anufriev commented 8 months ago

Affects: 6.0.10 (Spring Boot 3.1.1)

Summary: After upgrade to Spring Boot 3.x application started to take more time for web requests processing.

Description: Initially, the application was based on Spring Boot 2.7.14 (Java 17). But since we upgraded to Spring Boot 3.1.1 we noticed performance degradation in web layer. Requests processing took extra time in comparison with Boot 2.x. Upgrade to Spring Boot 3.1.2 made it ever worse. Tomcat's MessageBytes serialization started to perform even slower. So we reverted to Spring Boot 3.1.1.

One more thing that was discovered in the profiler is slow performance of AntMatcher. So we followed the recommendation from upgrade guide of Spring Boot 2.6 and witched to PathParser. As it turns out, this did not make much of an improvement (see details below).

CPU

Here are some details captured using DataDog profiler.

Here is a request processing of the URL that has path variables: image

And here is a request processing of the URL that has no path variables: image

Memory

One more thing that was discovered is memory consumption, most of the allocations are happening here: image

This is the size per minute image


How can this be improved? One thought is around some kind of caching for PathParser to avoid extra calculations while request matching.

bclozel commented 8 months ago

Thanks for the report @alexey-anufriev , but there not so much actionable here, or at least conflicting messages.

You're first mentioning Tomcat's MessageBytes performance, which is out of our control. You are also pointing to AntPathMatcher and PathPattern as areas of improvements. At this point, I'm not sure where the performance loss comes from and whether this is related to those classes. Did you compare the profiling output of 2.7.x and 3.1.x for your application? Did the ratio of the request matching phase grow in the overall execution with this upgrade?

Again, without much context, it's hard to understand why this part of the execution is slow. Request matching is of course an important part and we've optimized it with PathPatternParser and PathPattern. Depending on the request and if the path matches, some other conditions are processed (content negotiation, request parameters and others). It's not surprising some requests require less processing for the matching phase as optimizations happen along the way.

Discussing about performance is often tricky as it's easy to waste time setting up an imperfect benchmark test harness and draw the wrong conclusions. Comparing the same application with two different versions and pointing to the difference should be a good first step. Pointing to parts of the request processing and asking for performance improvements is not a good way to start; we've got multiple micro-benchmarks in the Framework codebase and spending time there is not always the most efficient way to optimize for your case.

How can this be improved? One thought is around some kind of caching for PathParser to avoid extra calculations while request matching.

The PathPatternParser is only called once during startup and create efficient PathPattern instances for matching against incoming requests. We optimized this quite a bit already in the past using a JMH benchmark. I don't understand the suggestion here. What should be cached?

alexey-anufriev commented 8 months ago

Fair enough, let me prepare benchmarks.

bclozel commented 8 months ago

@alexey-anufriev I'm not suggesting you to prepare benchmarks, quite the opposite. It takes a lot of time to design and implement a benchmark and consider its biases.

Here, I'm really requesting a full picture of the CPU time (your datadog flame graph) for the 2.7.x and 3.1.x versions of the same application. Comparing CPU time there should point to possible regressions that I could look into.

alexey-anufriev commented 8 months ago

Ok, let me try to do fast comparison.

This is before the upgrade: image

And this is after: image

bclozel commented 8 months ago

Thanks for the update!

If I understand correctly, both applications are Spring MVC, one using the AntPathMatcher and the other the PathPatternParser variant. It would be nice to check whether the same behavior can be seen in Spring Boot 2.7. By default there, the PathPatternParser strategy is chosen as well. Did you opt-out explicitly in 2.7.x? If so, can you try with the same matching strategy in 2.7.x?

it looks like the MvcRequestMatcher is triggering the matching operation here, as you've probably migrated your Security configuration as requested by Spring Security. Still, the matching phase should not take longer than usual so this bit is interesting.

It would be interesting to get more information about the particular endpoints involved here:

With that information I can start building a sample application involving Spring Security, a few controller endpoints and find out if there is a performance issue with the matching phase.

alexey-anufriev commented 8 months ago
  1. Yes, we explicitly opted out and kept AntMatcher in 2.7.
  2. Sample APIs are the following:
    
    GET /api/org/{orgId}/usr/{userId}/membership
    GET /api/org/toggle/{toggleKey}/enabled
    GET /api/org/toggles
    GET /api/orgs
    GET /api/orgs/{id}

POST /api/auth/key POST /api/auth/token


3. As you see, these are mostly GET requests, so the content of those is pretty common: auth header, maybe CSP, and so on. In general, not that much of the content.
4. You are right, Security confguration has been migrated according to the upgrade guide.
5. I will also provide the behaviour on 2.7 with PathParser.
alexey-anufriev commented 8 months ago

And this is how 2.7 works with PathParser configured

image

alexey-anufriev commented 8 months ago

One more interesting observation that I mentioned before: Tomcat regression.

So here we have Spring boot 3.1.1


And here we have Spring boot 3.1.2

alexey-anufriev commented 8 months ago

The fastest combination seems to be the following:

image

alexey-anufriev commented 8 months ago

hi @bclozel, are there any updates here? or maybe some additional information required?

bclozel commented 8 months ago

I'm still working on it. So far nothing conclusive. I do see that the mvc request matcher is taking longer, which is not unexpected since it's doing more in the first place. The number of req/s is still comparable, and the ratio of time taken by request matching is still nowhere near where your samples are.

I've profiled locally with Yourkit and wrk on a sample app that should be comparable.

With ant matcher:

antrequestmatcher

With mvc request matcher:

mvcrequestmatcher

I'll continue investigating this in the next days and keep you posted. I might share the sample with you at some point or write JMH benchmarks to see if I can optimize a few spikes I'm seeing.

alexey-anufriev commented 8 months ago

Thank you for the update. Looking forward to the next steps.

smotron commented 6 months ago

I can share some observations... We have a time-critical web application including Spring Security. Updating Spring Boot 3.0.5 to 3.1.2 increased our average execution time from ~10ms to 12ms. Nothing was changed beside the version update. The additional time is added somewhere before entering the controller.

akhilbrao commented 6 months ago

Hi @bclozel - Is there any suggestion for this observation. @alexey-anufriev the tomcat latency around toString upon checking their repo there isn’t any updates made on that code. Can you share how did you mitigate this issue.

bclozel commented 6 months ago

@smotron @akhilbrao I'm afraid this issue does not intend to cover all performance aspects of Spring web frameworks with Security. We are still investigating (although I had to prioritize 6.1 work as the release is approaching) and we'll share our findings here.

Our initial findings do not point to a particular place nor show a huge difference in latency nor throughput. We might find some optimization opportunities along the way but as you can see Spring is only involved in some parts of the flame charts. Spring Boot upgrades usually involve lots of other components as well.

akhilbrao commented 6 months ago

@bclozel thanks for the update. We have recently upgraded to spring boot 3.1.2 lets see how the performance would be. Though we don’t have many url patterns. @alexey-anufriev when you find some time let us know the solution for tomcat if you had done anything.

alexey-anufriev commented 6 months ago

What we have done so far to mitigate this problem is use of ant-matchers in every place. But we did not do anything particular for the Tomcat. And we have the recent Boot, thus I suspect Tomcat is also newest and might be having some fixes for this as well.

akhilbrao commented 6 months ago

@alexey-anufriev thanks for sharing the info. So, you are using boot 3.1.2?

alexey-anufriev commented 6 months ago

@akhilbrao, actually 3.1.5

bclozel commented 5 months ago

I had a deeper look to this issue. Thanks for your patience.

Here, Spring Security uses a Spring MVC mechanism, the HandlerMappingIntrospector to look up the matching handler and check authorizations. It's critical to use the same matching algorithm in both places (the security layer and the web framework layer) to avoid potential attacks. The problem is, this matching is more costly than a pattern match and this should be done once per request. Here, I'm seeing that part of the process is done multiple times and this is inefficient. This is even worse if there are multiple matcher configurations.

We've discussed this with the Security team and as a first step, we're introducing a caching mechanism, see #31588. Once that new filter is introduced, it should be configured by Spring Security in its own filter chain. I'll ping you here when SNAPSHOTs are available.

From that point, we will consider the following steps:

  1. it would be interesting to get new numbers from your side (of course, I'll profile things on my end too) and validate that this change improves things
  2. I can run micro-benchmarks to possibly improve PathPattern matching algorithms, especially the "match and extract" phase for URI patterns with path variables. This part has been implemented with efficiency in mind and performs better than AntPathMatcher, but we can always zoom in and check.
  3. We already have a plan to phase out this HandlerMappingIntrospector in Spring Framework 6.2, and we'll use the PathPatternParser directly because there won't be any chance of disconnect between Spring Security and Spring MVC. This is already the case for Spring WebFlux, because WebFlux never had AntPathMatcher support in the first place.
janickr commented 5 months ago

@alexey-anufriev We have the exact same issue, after some digging I found that the MessageBytes.toString was a Tomcat regression, and that it will be fixed in 11.0.0-M1, 10.1.16, 9.0.83 and 8.5.96 (see https://bz.apache.org/bugzilla/show_bug.cgi?id=68026 )

alexey-anufriev commented 5 months ago

@bclozel, thank you for the update! Happy to assist with further experiments.

@janickr, yes, this we also noticed, unfortunately 10.1.16 has just been recently released, but hope it will be available in the next Spring Boot.

bclozel commented 4 months ago

https://github.com/spring-projects/spring-security/issues/14128 is now fixed and available in Spring Security 6.1.6-SNAPSHOT and 6.2.1-SNAPSHOT. Please test your applications with the current Spring Boot 3.2.1-SNAPSHOT version to check the performance changes. Release is scheduled next week for Spring Boot.

alexey-anufriev commented 4 months ago

@bclozel, thank you for your effort, I will test the new build and will come back with the results soon.

spring-projects-issues commented 4 months ago

If you would like us to look at this issue, please provide the requested information. If the information is not provided within the next 7 days this issue will be closed.

spring-projects-issues commented 3 months ago

Closing due to lack of requested feedback. If you would like us to look at this issue, please provide the requested information and we will re-open the issue.

alexey-anufriev commented 3 months ago

It took a bit of time to validate the fix completely (also considering NY holidays), but we are done now, and I can confirm, that we not only got rid of performance degradation, but also got performance boost with path-parser vs ant-matcher.

@bclozel, and the rest of spring team, thank you for your effort and help!

bclozel commented 3 months ago

@alexey-anufriev Awesome, thanks for letting us know!

SoundaryaDassC commented 3 months ago

@bclozel Thanks for considering the issue and getting it fixed. @alexey-anufriev just wanted to confirm ant-matcher performance is better.

tushar-meesho commented 1 week ago

@bclozel @alexey-anufriev we have recently planning to migrate to spring-boot-3.0.1 version from spring-boot-2.7.10, Do we have to upgrade to Spring Boot 3.2.1-SNAPSHOT or later, so that we don't face any latency issues on production because of this caching isssue. Is this fix live on release version spring-boot - 3.2.5 ?

alexey-anufriev commented 1 week ago

hi @tushar-meesho, yes, Spring Boot 3.2.1 contains a fix (note that normally you should not use a SNAPSHOT version, it is just for testing). Also, 3.2.5 has it included, so don't see a reason to not upgrade to the latest one.