spring-projects / spring-boot

Spring Boot
https://spring.io/projects/spring-boot
Apache License 2.0
74.39k stars 40.51k forks source link

Optimize startup time: PropertySourcesPropertyResolver #7572

Closed dsyer closed 6 years ago

dsyer commented 7 years ago

There seems to be an opportunity to cache or otherwise improve the performance of PropertySourcesPropertyResolver (see https://github.com/dsyer/spring-boot-startup-bench/blob/master/startup.adoc).

philwebb commented 7 years ago

I think we should wait until 2.0 for this one since we want to revisit many aspects of property resolution.

tkvangorder commented 6 years ago

We have a large, monolithic Spring Boot/Cloud application that I recently upgraded to 1.5.7 and that application is taking upwards of 57 seconds to startup. I decided to run our application through Java Mission Control and generated a flight recording via starting the applications with these additional arguments:

-XX:+UnlockCommercialFeatures -XX:+FlightRecorder -XX:StartFlightRecording=delay=1s,duration=120s,name=myrecording,filename=/tmp/flightrecording.jfr,settings=profile

I know this is not a perfect technique, as it is using profile sampling, but one class floats to the top as being the "hottest" method during startup: DefaultPropertyNamePatternsMatcher.matches().

Accounting for the 29.85% lost in Character.toLowerCase(int) and the 12.43% in matches(), that is roughly 24 seconds of our total time. I have attached a screen shot of the flight recording and I am hoping we might see some relief in the 1.5.x branch (as well as 2.0.x)?

screen shot 2017-09-27 at 3 30 49 pm

wilkinsona commented 6 years ago

@tkvangorder Thanks for sharing your findings. The data is both interesting and surprising. Property binding as changed considerably in 2.0 so this may have already been resolved in master. I realise this may be no small undertaking given the size of your app, but it would be useful to know how your app's startup performs with Boot 2.0. The latest Spring Cloud Finchley milestone is compatible with Boot 2.0 M3 if you have the time to collect some more data.

tkvangorder commented 6 years ago

OK, I have gone back and re-run the application several times to try to normalize my results, they are all generally the same on my machine (MacBook Pro 2.8GHz, Intel Core i7, 16GB).

Running the application on Spring Boot 1.4.6 & Camden SR7:

Started WebServiceApplication in 43.212 seconds (JVM running for 49.599) Character.toLowerCase() == 28.79% DefaultPropertyNamePatternsMatcher.matches() == 13.86%

boot146camdensr7

Running the application on Spring Boot 1.5.7 & Dalston SR3:

Started WebServiceApplication in 48.624 seconds (JVM running for 55.204) Character.toLowerCase() == 28.20% DefaultPropertyNamePatternsMatcher.matches() == 12.94%

boot157dalstonsr3

I am afraid it will likely be several months before we move up to Spring Boot 2.0. I am hoping for some guidance on how we might reduce our startup time on the 1.5.x branch?

dsyer commented 6 years ago

I actually doubt the property resolution is really hitting you that hard, and 2.0 isn't any faster in my experience (nor is it really much slower). Profiling startup is pretty difficult, and I never reach the same conclusion if I repeatedly to run and analyse an app that way. However, I have found just measuring the startup time carefully to be useful, especially comparing that to other apps of the same size and weight (number of beans or classes loaded). I would be more than happy to try your app if you can make it available. If not, then please run the benchmarks linked above (especially the "static" ones: https://github.com/dsyer/spring-boot-startup-bench/blob/master/static/README.adoc) on your hardware and then run your app in the same test harness and let's look at the difference. If you can't even do that, at least use this listener (https://github.com/dsyer/spring-boot-startup-bench/blob/master/static/src/main/java/com/example/config/BeanCountingApplicationListener.java) and report back the data it creates along with the startup time.

tkvangorder commented 6 years ago

@dsyer I have to do this work outside my "normal" job, sorry for the delay.

Unfortunately, I cannot give you a copy of our application.

Our monolithic application :

1) I did put the bean counting listener into our application:

2017-10-01 08:53:33.857  INFO 25073 --- [           main] c.b.s.BeanCountingApplicationListener    : Bean count: build-webservices:localdev:8080=1129
2017-10-01 08:53:33.858  INFO 25073 --- [           main] c.b.s.BeanCountingApplicationListener    : Class count: build-webservices:localdev:8080=21858

2) I cloned your startup benchmarking project (very cool btw). I was able to get partial benchmarks from the static sub-module:

MainBenchmark.main      empt    ss   10  0.765 ± 0.023   s/op
MainBenchmark.main      demo    ss   10  1.569 ± 0.054   s/op
MainBenchmark.main      actr    ss   10  2.140 ± 0.530   s/op
MainBenchmark.main      jdbc    ss   10  1.827 ± 0.086   s/op
MainBenchmark.main      actj    ss   10  2.204 ± 0.061   s/op
MainBenchmark.main      jpae    ss   10  2.812 ± 0.515   s/op
MainBenchmark.main      conf    ss   10  2.459 ± 0.506   s/op
MainBenchmark.main      erka    ss   10  2.744 ± 0.140   s/op
MainBenchmark.main      busr    ss   10  3.486 ± 0.021   s/op
MainBenchmark.main      zuul    ss   10  6.067 ± 0.525   s/op
MainBenchmark.main      erkb    ss   10  4.370 ± 0.521   s/op
MainBenchmark.main      slth    ss   10  7.193 ± 0.693   s/op
StripBenchmark.strip      thin    ss   10  1.257 ± 0.011   s/op
StripBenchmark.strip      slim    ss   10  1.513 ± 0.056   s/op
StripBenchmark.strip      lite    ss   10  1.215 ± 0.500   s/op
StripBenchmark.strip      func    ss   10  1.008 ± 0.065   s/op
SnapBenchmark.snap      empt  avgt   10  0.760 ± 0.011   s/op
SnapBenchmark.snap      demo  avgt   10  1.608 ± 0.022   s/op
SnapBenchmark.snap      actr  avgt   10  2.132 ± 0.505   s/op
SnapBenchmark.snap      jdbc  avgt   10  1.886 ± 0.068   s/op
SnapBenchmark.snap      actj  avgt   10  2.300 ± 0.046   s/op
SnapBenchmark.snap      jpae  avgt   10  2.876 ± 0.514   s/op
SnapBenchmark.snap      conf  avgt   10  2.366 ± 0.055   s/op

I ran into a few problems building/running the benchmarks. (I will try to dive into these my next free weekend, just don't have the energy to research any further today). After some work, I managed to run the main benchmarks (a couple of the processes seemed to deadlock around the 95% complete mark, I just killed those forked processes, so those may skew the results:

Benchmark                                        Mode  Cnt   Score    Error  Units
ConfigServerBenchmark.devtoolsRestart            avgt   10   0.962 ±  0.070   s/op
ConfigServerBenchmark.explodedJarMain            avgt   10   3.061 ±  0.464   s/op
ConfigServerBenchmark.fatJar13x                  avgt   10   4.121 ±  0.516   s/op
ConfigServerBenchmark.fatJar14x                  avgt   10   5.701 ±  0.542   s/op
ConfigServerBenchmark.fatJar15x                  avgt   10   5.500 ±  0.485   s/op
JsaBenchmark.sharedClasses                       avgt   10   3.662 ±  0.077   s/op
JsaBenchmark.thinMain                            avgt   10   4.312 ±  0.460   s/op
MinimalBenchmark.explodedJarMain                 avgt   10   1.617 ±  0.040   s/op
MinimalBenchmark.fatJar                          avgt   10   3.412 ±  0.511   s/op
PetclinicBenchmark.devtoolsRestart               avgt   10   0.001 ±  0.001   s/op
PetclinicBenchmark.explodedJarMain               avgt   10   4.456 ±  0.558   s/op
PetclinicBenchmark.fatJar                        avgt   10  10.156 ±  0.657   s/op
PetclinicBenchmark.noverify                      avgt   10   8.902 ±  0.649   s/op
PetclinicLatestBenchmark.devtoolsRestart         avgt   10   1.405 ±  0.532   s/op
PetclinicLatestBenchmark.explodedJarMain         avgt   10   4.231 ±  0.078   s/op
PetclinicLatestBenchmark.fatJar                  avgt   10   9.891 ±  1.009   s/op
PetclinicLatestBenchmark.noverify                avgt   10   8.796 ±  0.690   s/op
ShadedBenchmark.explodedShadedMain               avgt   10   8.869 ±  0.964   s/op
ShadedBenchmark.shaded                           avgt   10   6.661 ±  0.435   s/op
SpringBoot13xBenchmark.explodedJarLauncher       avgt   10   3.080 ±  0.055   s/op
SpringBoot13xBenchmark.explodedJarMain           avgt   10   1.991 ±  0.505   s/op
SpringBoot13xBenchmark.fatJar                    avgt   10   3.022 ±  0.493   s/op
SpringBoot14xBenchmark.explodedJarLauncher       avgt   10   4.582 ±  0.552   s/op
SpringBoot14xBenchmark.explodedJarMain           avgt   10   2.105 ±  0.079   s/op
SpringBoot14xBenchmark.fatJar                    avgt   10   4.412 ±  0.495   s/op
SpringBoot15xBenchmark.explodedJarLauncher       avgt   10   4.251 ±  0.480   s/op
SpringBoot15xBenchmark.explodedJarMain           avgt   10   2.091 ±  0.490   s/op
SpringBoot15xBenchmark.fatJar                    avgt   10   4.073 ±  0.493   s/op
SpringBoot20xBenchmark.explodedJarLauncher       avgt   10   4.204 ±  0.520   s/op
SpringBoot20xBenchmark.explodedJarMain           avgt   10   1.985 ±  0.012   s/op
SpringBoot20xBenchmark.fatJar                    avgt   10   4.091 ±  0.496   s/op
SpringBootThinBenchmark.basic13xThin             avgt   10   2.915 ±  0.551   s/op
SpringBootThinBenchmark.basic14xThin             avgt   10   3.087 ±  0.173   s/op
SpringBootThinBenchmark.basic15xThin             avgt   10   3.072 ±  0.151   s/op
SpringBootThinBenchmark.petclinicPrecomputeThin  avgt   10   5.452 ±  0.126   s/op
SpringBootThinBenchmark.petclinicThin            avgt   10   5.539 ±  0.187   s/op

I am going to post an issue on the benchmarking project, as this forum doesn't seem like the correct place for those. https://github.com/dsyer/spring-boot-startup-bench/issues/4

Thanks

dsyer commented 6 years ago

Thanks, that's very interesting. It certainly does load a lot of classes on startup (more than twice the number of any of the benchmark samples). A really rough analysis, taking your benchmark data and extrapolating the classes loaded vs. startup time as a straight line, gives me an estimate of 18sec for starting your app. I'm not sure the line is really straight, so maybe the fact that you are off the line is not telling us anything. On the other hand it might be. Are you using Spring AOP, for instance (that was the problem with the sleuth sample, which we managed to fix in AspectJ)? The Netflix stack is also quite a big burden. You could probably remove that quickly, just to get a new measurement?

I wish we could get our hands on the actual code. Is there any chance you could write a fake version of your app and put it in github where we can measure it?

tkvangorder commented 6 years ago

I am afraid I cannot post a version of our application. I had a little time to follow some of the same methodology you were using in your testing harness. I ran the application at a baseline, then started removing things....circuit breaker, discovery client, etc.

My very un-scientific findings:

Baseline: 

2017-10-02 13:02:15.168  INFO 2214 --- [           main] c.b.s.BeanCountingApplicationListener    : Bean count: build-webservices:localdev:8080=1129
2017-10-02 13:02:15.168  INFO 2214 --- [           main] c.b.s.BeanCountingApplicationListener    : Class count: build-webservices:localdev:8080=21738

2017-10-02 12:37:35.699  INFO 863 ---  [           main] c.b.webservice.WebServiceApplication     : Started WebServiceApplication in 36.71 seconds (JVM running for 38.256)
2017-10-02 12:38:35.274  INFO 919 ---  [           main] c.b.webservice.WebServiceApplication     : Started WebServiceApplication in 37.506 seconds (JVM running for 39.037)
2017-10-02 12:39:38.863  INFO 971 ---  [           main] c.b.webservice.WebServiceApplication     : Started WebServiceApplication in 39.275 seconds (JVM running for 40.819)
2017-10-02 12:46:48.278  INFO 1372 --- [           main] c.b.webservice.WebServiceApplication     : Started WebServiceApplication in 39.015 seconds (JVM running for 39.469)
2017-10-02 12:48:09.158  INFO 1444 --- [           main] c.b.webservice.WebServiceApplication     : Started WebServiceApplication in 38.233 seconds (JVM running for 39.772)

Without @EnableCircuitBreaker

2017-10-02 13:00:39.319  INFO 2139 --- [           main] c.b.s.BeanCountingApplicationListener    : Bean count: build-webservices:localdev:8080=1120
2017-10-02 13:00:39.320  INFO 2139 --- [           main] c.b.s.BeanCountingApplicationListener    : Class count: build-webservices:localdev:8080=21312

2017-10-02 12:34:37.000  INFO 697 ---  [           main] c.b.webservice.WebServiceApplication     : Started WebServiceApplication in 32.302 seconds (JVM running for 32.759)
2017-10-02 12:41:20.735  INFO 1075 --- [           main] c.b.webservice.WebServiceApplication     : Started WebServiceApplication in 31.263 seconds (JVM running for 32.782)
2017-10-02 12:42:33.368  INFO 1139 --- [           main] c.b.webservice.WebServiceApplication     : Started WebServiceApplication in 31.488 seconds (JVM running for 32.946)
2017-10-02 12:43:31.180  INFO 1194 --- [           main] c.b.webservice.WebServiceApplication     : Started WebServiceApplication in 31.083 seconds (JVM running for 32.613)

Remove PostConstructs: (We had a lot of classes "doing things" in post construct, removed the @PostConstruct annotations to eliminate that noise.)

2017-10-02 13:42:32.809  INFO 5303 --- [           main] c.b.s.BeanCountingApplicationListener    : Bean count: build-webservices:localdev:8080=1120
2017-10-02 13:42:32.810  INFO 5303 --- [           main] c.b.s.BeanCountingApplicationListener    : Class count: build-webservices:localdev:8080=20639
2017-10-02 13:42:32.811  INFO 5303 --- [           main] c.b.webservice.WebServiceApplication     : Started WebServiceApplication in 28.044 seconds (JVM running for 28.462)
2017-10-02 13:41:40.860  INFO 5257 --- [           main] c.b.webservice.WebServiceApplication     : Started WebServiceApplication in 29.288 seconds (JVM running for 29.773)

Remove @EnableDiscoverClient (did not remove dependencies)

2017-10-02 15:47:19.402  INFO 11499 --- [           main] c.b.s.BeanCountingApplicationListener    : Bean count: build-webservices:localdev:8080=1095
2017-10-02 15:47:19.403  INFO 11499 --- [           main] c.b.s.BeanCountingApplicationListener    : Class count: build-webservices:localdev:8080=18990
2017-10-02 15:47:19.404  INFO 11499 --- [           main] c.b.webservice.WebServiceApplication     : Started WebServiceApplication in 26.84 seconds (JVM running for 27.256)
2017-10-03 07:31:22.873  INFO 14422 --- [           main] c.b.webservice.WebServiceApplication     : Started WebServiceApplication in 26.855 seconds (JVM running for 27.284)

Local config (no config server, just moved config from server into a local application.yml and then disabled the config client via boostrap property)

2017-10-03 08:28:07.893  INFO 17581 --- [           main] c.b.s.BeanCountingApplicationListener    : Bean count: build-webservices:localdev:8080=1094
2017-10-03 08:28:07.893  INFO 17581 --- [           main] c.b.s.BeanCountingApplicationListener    : Class count: build-webservices:localdev:8080=18873
2017-10-03 08:28:07.894  INFO 17581 --- [           main] c.b.webservice.WebServiceApplication     : Started WebServiceApplication in 25.069 seconds (JVM running for 26.615)
2017-10-03 08:30:17.268  INFO 17702 --- [           main] c.b.webservice.WebServiceApplication     : Started WebServiceApplication in 25.772 seconds (JVM running for 27.298)
2017-10-03 08:31:16.354  INFO 17759 --- [           main] c.b.webservice.WebServiceApplication     : Started WebServiceApplication in 25.439 seconds (JVM running for 26.952)

Getting closer to 18 seconds, given my profiling results from earlier.

dsyer commented 6 years ago

So @EnableCircuitBreaker and @PostConstruct had by far the biggest impact. I wonder if using AspectJ 1.8.11 would help with the former (I haven't measured it but it was a big performance release). And obviously, you should look at deferring or backgrounding the work in your initializers.

All interesting stuff, but not yet pointing any fingers at why you see different results than I do with your toy application.

tkvangorder commented 6 years ago

A small update on this, after upgrading to spring boot 1.5.9, the @EnabledCircuitBreaker is no longer a significant slow down in our code base. It looks like the changes in the AspectJ 1.8.13 may have played a roll in this fix. @dsyer I am at Spring One this week, while I can't give you the source code, we could sit down over my lap top. I suspect you have better things to do this week...but it's an option. 8)

dsyer commented 6 years ago

That's good to know. We definitely should find time to look at it while you are here.

dsyer commented 6 years ago

The discussion here has veered away from the topic of the original issue, so we should discuss it somewhere else (maybe in the benchmark repo). I think we can probably close this one (data from #10696 shows that 2.0 has already improved in this area).

wilkinsona commented 6 years ago

Thanks, @dsyer. Closing as suggested.