spring-projects / spring-framework

Spring Framework
https://spring.io/projects/spring-framework
Apache License 2.0

Parallel bean initialization during startup [SPR-8767] #13410

Closed spring-projects-issues closed 9 months ago

spring-projects-issues commented 13 years ago

Tomasz Nurkiewicz opened SPR-8767 and commented

Spring should provide a way (possibly a BeanFactory with a different ConfigurableListableBeanFactory#preInstantiateSingletons implementation) to initialize singleton non-lazy beans on startup in parallel using a thread pool. This could significantly reduce startup (and maybe shutdown) time by creating and initializing independent beans concurrently.

The algorithm is pretty simple in principle. Whereas the normal bean factory creates beans in a single thread in a more or less arbitrary order, this implementation should:

  1. Find all bean definitions that don't have any unresolved dependencies.
  2. Schedule creation of each bean found in 1. in a separate concurrent task to allow parallel creation
  3. When any of the tasks scheduled in 2. is completed go to 1.

The algorithm stops when all beans are created.
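As a rough illustration of the steps above, here is a minimal plain-JDK sketch. It is not Spring code: `ParallelInitSketch`, `initialize`, the `dependsOn` map and the `factories` map are all hypothetical names invented for this example. Instead of looping back to step 1 after each task, it chains a `CompletableFuture` per bean, which has the same effect: a bean is scheduled as soon as all of its dependencies have completed. It assumes the dependency graph is acyclic and that every dependency has a factory.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.Executor;
import java.util.function.Supplier;

/** Hypothetical sketch; none of these names are Spring API. */
final class ParallelInitSketch {

    /**
     * Creates every bean in 'factories' on the given executor. A bean's factory
     * only runs once all beans it lists in 'dependsOn' have been created.
     * Assumes an acyclic graph (a cycle would recurse forever here).
     */
    static Map<String, Object> initialize(Map<String, List<String>> dependsOn,
                                          Map<String, Supplier<Object>> factories,
                                          Executor pool) {
        Map<String, CompletableFuture<Object>> futures = new HashMap<>();
        for (String name : factories.keySet()) {
            schedule(name, dependsOn, factories, pool, futures);
        }
        // The algorithm stops when all beans are created.
        Map<String, Object> beans = new HashMap<>();
        futures.forEach((name, future) -> beans.put(name, future.join()));
        return beans;
    }

    private static CompletableFuture<Object> schedule(String name,
            Map<String, List<String>> dependsOn,
            Map<String, Supplier<Object>> factories,
            Executor pool,
            Map<String, CompletableFuture<Object>> futures) {
        CompletableFuture<Object> known = futures.get(name);
        if (known != null) {
            return known; // already scheduled
        }
        List<CompletableFuture<Object>> dependencies = new ArrayList<>();
        for (String dependency : dependsOn.getOrDefault(name, List.of())) {
            dependencies.add(schedule(dependency, dependsOn, factories, pool, futures));
        }
        // Beans without dependencies start immediately; others wait for allOf(...).
        CompletableFuture<Object> future = CompletableFuture
                .allOf(dependencies.toArray(new CompletableFuture<?>[0]))
                .thenApplyAsync(ignored -> factories.get(name).get(), pool);
        futures.put(name, future);
        return future;
    }
}
```

Because no task blocks while waiting for its dependencies, this sketch should not deadlock even on a small pool. It ignores everything that makes the real problem hard, such as dependencies that are only resolved dynamically at runtime.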

Implementation notes:


Affects: 3.1 RC1

Reference URL: http://forum.springsource.org/showthread.php?105896-Initialize-spring-beans-in-parallel-at-startup

Issue Links:

80 votes, 77 watchers

spring-projects-issues commented 13 years ago

Chris Beams commented

Hi Tomasz,

Reading through the linked forum thread, Marten's suggestion hits the nail on the head:

The problem here, imho, is that you are mixing bean construction and bean initialization.

The latter is something you (judging from your post) want to do in parallel. The easiest approach, I guess, is to create an ApplicationListener which listens for ContextRefreshedEvents (fired when the context is up) and which starts initializing the caches. You could plug in a TaskExecutor for this which utilizes the server's thread pool, which in turn should utilize the underlying hardware to its fullest...

Best of both worlds imho...

(http://forum.springsource.org/showthread.php?105896-Initialize-spring-beans-in-parallel-at-startup&p=352702#post352702)
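To make the distinction in the quoted suggestion concrete, here is a small plain-JDK sketch of "cheap construction, expensive initialization later". All names (`CacheWarmupSketch`, `ReferenceDataCache`, `warmUp`, `onContextReady`) are hypothetical. In a Spring application, the role of `onContextReady` would be played by an ApplicationListener reacting to the ContextRefreshedEvent, and the `Executor` by a TaskExecutor, as the quote describes.

```java
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.Executor;

/** Hypothetical sketch; none of these names are Spring API. */
final class CacheWarmupSketch {

    /** Construction is cheap; the slow part lives in warmUp(). */
    static final class ReferenceDataCache {
        private final Map<String, String> entries = new ConcurrentHashMap<>();
        private final CountDownLatch ready = new CountDownLatch(1);

        void warmUp() {
            try {
                entries.put("EUR", "Euro"); // placeholder for a slow load
            } finally {
                ready.countDown(); // release waiting readers even if loading failed
            }
        }

        /** Blocks until the warm-up has finished, then answers from the cache. */
        String lookup(String key) {
            try {
                ready.await();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("Interrupted while waiting for warm-up", e);
            }
            return entries.get(key);
        }
    }

    /** Called once the container is up: warm all caches in parallel. */
    static void onContextReady(List<ReferenceDataCache> caches, Executor executor) {
        for (ReferenceDataCache cache : caches) {
            executor.execute(cache::warmUp);
        }
    }
}
```

With this split, the container can finish wiring all beans quickly on one thread, and only callers that actually need cached data wait for the warm-up.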

While it is conceptually straightforward to imagine the container introducing concurrent bean initialization, in practice it would be anything but. We would need to see quite a bit of feedback and demand that Spring container initialization is fundamentally too slow to seriously consider this kind of change. Again, given the specific scenario in the forum post, it appears that the user can make changes that would both solve the startup time problem and probably improve the design of his program in the process.

Feel free to comment further if you think there is a compelling case that's been missed here, and we can leave this open for further comments and votes to that effect. Otherwise I'll close as won't fix for now.

spring-projects-issues commented 13 years ago

Niklas Schlimm commented

Hi Chris, we're currently working on a migration of over 50 web applications to Spring IoC. Startup time is a big issue because we need quick startup during development and (more importantly) in the production environment. Since Spring does not support concurrent singleton bean instantiation, the startup time of our complete production environment is ~2h 30m, which is not acceptable to our operations teams. We just migrated from a proprietary container solution that we developed. That proprietary container supported parallel instantiation. Now we have to argue why we migrated to Spring ... to us, this is not a minor issue and we would highly appreciate any progress on this.

With regard to the solution, it should be something like a "managed task executor", because ideally the concurrent threads have the container-managed thread context (JNDI and JPA resources must be accessible, etc.). Therefore the solution is not so straightforward imho.

Best regards, Niklas

spring-projects-issues commented 13 years ago

Chris Beams commented

Niklas,

2h 30m is a very long time indeed, but again, I would encourage a more pragmatic approach.

For the majority of Spring applications, container startup time is not an issue. We can be fairly certain of this simply because it is not a complaint we often encounter here in JIRA, in our forums, at conferences or with paying customers. When it does come up as a concern, we usually advise profiling the application to determine exactly which beans are causing the slowdown, and taking specific action to reduce the impact. Marten's suggestion that I quoted above would be perfectly adequate in many cases.

The upside of parallelizing bean initialization in the Spring container could be significant for a minority of applications using Spring, while the downsides - the inevitable bugs, added complexity and unintended side effects - would affect potentially every application using Spring. Not an attractive outlook, I'm afraid.

I'm resolving this issue as Won't Fix because indeed it is very unlikely that we would introduce a change of this magnitude into the core framework at this point without a very strong rationale.

Users are free to reopen this issue and add new comments, and continue to add votes if there are arguments that have not yet been heard.

spring-projects-issues commented 13 years ago

Niklas Schlimm commented

Hi Chris,

I understand your viewpoint. Thanks for the comprehensive reply.

Cheers, Niklas

spring-projects-issues commented 12 years ago

Adib Saikali commented

I think parallel startup of Spring is very important. In my application Spring takes up 50% of my startup time; reducing that would be a major time saver during development.

spring-projects-issues commented 12 years ago

Ragnar Rova commented

Please see https://github.com/gredler/spriths for an experimental example implementation. Seems like the spriths implementation goes far into making it work. The author of spriths mentions #10033 that needs to be solved in order to make parallel startup thread-safe.

It would be nice if the framework at least did not stand in the way of users wanting parallel startup.

spring-projects-issues commented 11 years ago

Jonatan Jönsson commented

I think a ConfigurableListableBeanFactory#setConcurrentInstantiationOfSingletons(boolean) would be great. Default to non-concurrent but make it super easy to get a parallel version.

spring-projects-issues commented 11 years ago

Scott Murphy commented

For a single-instance web application, the speed of Spring initialization is fine. However, when you have an application that uses 20+ instances, the slowness of Spring initialization begins to have a detrimental impact on the ability to scale dynamically. We are currently feeling major pain using Spring on App Engine. If Spring supported a multi-threaded startup, we would see a significant improvement in our ability to scale, as well as benefit from enormous cost savings.

spring-projects-issues commented 11 years ago

Scott Murphy commented

Is this worth revisiting now that it is 2 years later?

spring-projects-issues commented 11 years ago

Scott Murphy commented

"Users are free to reopen this issue"

Maybe keep this issue open so it can be voted upon?

spring-projects-issues commented 10 years ago

Adib Saikali commented

I think this problem has two distinct parts: parallel discovery of beans and parallel initialization of beans. The two can be implemented separately to improve performance, unless component scanning for a large application cannot be made any faster.

spring-projects-issues commented 10 years ago

Julien Dubois commented

Start-up performance is a very important issue to me, and I think a lot of beans could be initialized in parallel. So yes, this issue should stay open; it should even have a higher criticality level as far as I'm concerned.

spring-projects-issues commented 9 years ago

Attila Király commented

Our applications have really big Spring contexts with a lot of Spring beans. Our cache initializations are already running in parallel, but the applications still take a long time to start up simply because some of our beans require time in creation/initialization before we can say they are "done" (for example, because they load reference data from various sources, which we already need for the creation of other beans).

So we would definitely benefit from built-in parallelization of bean initialization.

spring-projects-issues commented 9 years ago

Deryl Spielman commented

We are struggling with this as well. Lazy loading beans helped a bit, but I believe @Controller beans end up lazy loading their autowired fields anyway, which doesn't improve the speed that much. Also, the entity manager and other data source beans in the config must be carefully set to be loaded eagerly, which is cumbersome. With all of this talk of microservices and Spring Boot, it is obvious that separating out the modules into separate projects may improve the speed, but this increases the overhead of managing independently deployed services, and it is a project of its own to figure out how to refactor the Spring bean autowiring dependencies.

I also attempted to define all beans in XML dynamically at startup using a custom implemented component scanning method found in some of these links. I found that it did improve the speed but not dramatically. The maintenance of ensuring the bean was dynamically created correctly was also cumbersome and did not work completely based on the complexities in having @Scope and @Qualifier.

In summary, we need monolithic applications to start up faster, and this ticket should be evaluated! Thanks.

spring-projects-issues commented 9 years ago

caiwei commented

I'm thinking of enhancing the following project: https://github.com/gredler/spriths The basic idea of this project is: analyze bean dependencies -> form a DAG -> initialize beans in parallel.

To analyze the dependencies, I would need to handle dependencies that are set via setters, constructors and annotations, including:

  1. @Resource: TYPE, FIELD, METHOD
  2. @Inject: METHOD, CONSTRUCTOR, FIELD
  3. @Autowired: METHOD, CONSTRUCTOR, FIELD

Dependencies introduced by ApplicationContextAware, lookup method injection, Class.forName, reflection... may be impossible to handle.

User is allowed to specify the parallel bean init in a context.xml as below:

A DAG is quite straightforward.

Users can always let Spring generate a DAG file, or specify a DAG file themselves if they wish (set alwaysGenerate="false"). We know it is almost impossible for users to create the DAG from scratch: they would have to know the bean dependencies and find all beans defined via annotations. The typical use case is that users first generate the DAG, and then adjust the sequences in the DAG if there are any server startup failures caused by concurrent bean initialization.

Just my 2 cents. Please share your expertise and let me know your concerns.

spring-projects-issues commented 8 years ago

Juergen Hoeller commented

Note that we put #18305 onto the 4.3 roadmap now, with a dedicated background initialization option in LocalContainerEntityManagerFactoryBean / LocalSessionFactoryBean that allows JPA / Hibernate initialization to run in parallel to all other beans in the context. We do not intend to let the container initialize beans in parallel there; we just allow those two FactoryBean implementations to internally delegate to a separate thread and lazily access the initialized result through a Future handle. Such specific background initialization options seem like a sweet spot, with configuration validation and dependency resolution happening as usual and just a well-known expensive bootstrap step delegated to a separate thread internally.
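The "delegate to a separate thread and lazily access the result through a Future handle" idea can be sketched in plain JDK terms as follows. This is only an illustration of the pattern, not the actual code of LocalContainerEntityManagerFactoryBean or LocalSessionFactoryBean; `BackgroundHolder` and its methods are hypothetical names.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Future;

/** Hypothetical sketch of a component that bootstraps itself in the background. */
final class BackgroundHolder<T> {

    private final Future<T> future;

    /**
     * Validation and dependency resolution would happen before this point, on
     * the calling thread. Only the expensive step is handed to the executor.
     */
    BackgroundHolder(Callable<T> expensiveBootstrap, ExecutorService executor) {
        this.future = executor.submit(expensiveBootstrap);
    }

    /** First real access: blocks until the background bootstrap is done. */
    T get() {
        try {
            return future.get();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("Interrupted while waiting for bootstrap", e);
        } catch (ExecutionException e) {
            // Surface the original failure to whoever touches the component first.
            throw new IllegalStateException("Background bootstrap failed", e.getCause());
        }
    }
}
```

The rest of the startup sequence keeps running on the main thread while the expensive step executes, and a caller only blocks if it needs the result before the background work has finished.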

A generalized solution based on a DAG analysis of a container's bean interdependencies is unfortunately absolutely non-trivial. We have plenty of components which dynamically resolve dependencies at runtime or make auto-configuration decisions based on the presence / non-presence of beans at runtime, etc. It seems like asking for a lot of trouble when trying to generalize such parallel bootstrapping as container-level guesswork, with hard-to-predict benefits in comparison to allowing well-known expensive components to use background initialization internally. And based on our experience, a large chunk of the startup time is occupied by very few components even in large Spring applications; tackling those components specifically and seeing how far we get with that seems like a very worthwhile effort.

In any case, the most important part: We are revisiting this topic in 4.3, and we may take it a few steps further in 5.0. If you have specific hotspot insight into the startup time of your applications, please let us know... in particular if it deviates from our assumptions above.

Juergen

spring-projects-issues commented 8 years ago

Julien Dubois commented

If we could just have a specific annotation, like @Background to load a bean in the background, it would be nice: a bit like the @Lazy annotation in fact. Anyway each project is different, and usually people know which beans should run in the background.

There should also be a specific thread pool for those, so we can manage how those background beans run -> like you have 4 CPU cores, launch 6 beans in "background", and run them on 2 threads...
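As a small plain-JDK illustration of that last point, the sketch below runs a list of initializers on a dedicated fixed-size pool and reports the highest number that ever ran at the same time. `BoundedBootstrapPoolSketch` and `runAll` are hypothetical names, and the one-minute timeout is an arbitrary choice for the example.

```java
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical sketch; none of these names are Spring API. */
final class BoundedBootstrapPoolSketch {

    /**
     * Runs all initializers on a dedicated pool with 'threads' workers and
     * returns the peak number of initializers observed running concurrently.
     */
    static int runAll(List<Runnable> initializers, int threads) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        AtomicInteger running = new AtomicInteger();
        AtomicInteger peak = new AtomicInteger();
        for (Runnable initializer : initializers) {
            pool.execute(() -> {
                int now = running.incrementAndGet();
                peak.accumulateAndGet(now, Math::max);
                try {
                    initializer.run();
                } finally {
                    running.decrementAndGet();
                }
            });
        }
        pool.shutdown(); // no new tasks; queued ones still run
        try {
            if (!pool.awaitTermination(1, TimeUnit.MINUTES)) {
                throw new IllegalStateException("Background initialization timed out");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("Interrupted during background initialization", e);
        }
        return peak.get();
    }
}
```

Calling `runAll` with six initializers and `threads = 2` queues all six but never runs more than two at once, which leaves the remaining cores free for the main bootstrap thread.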

spring-projects-issues commented 8 years ago

Juergen Hoeller commented

For specifically demarcated beans, it's a much simpler problem for sure. You may still end up with deadlocks in case of dependency cycles etc but hey, why not let badly designed applications hang on startup right away ;-) Seriously, opting in for specific beans is certainly the way to go - in one way or the other.

Depending on the kind of bean, internal delegation to a separate thread can be quite beneficial, with dependency resolution and configuration validation happening first and then e.g. just the buildSessionFactory() call actually executing in a separate thread. This seems like a pretty sweet spot, as long as nobody tries to call the resulting proxy early.

Whereas for other kinds of beans, a more generic @Background marker could wrap the entire createBean step in a separate thread. The question is just what it would return to the immediate caller then, since the bean factory itself is not in the business of creating proxies. We had the same problem with @Lazy-triggered proxies for injection points, though, and solved it through an SPI call that the context package implements on a proxy basis; I guess we could do something similar here.

In any case, both options sound worth exploring. Let's keep this JIRA ticket open for an @Background-style model (probably rather 5.0) and #18305 for the specific Hibernate/JPA factory case (certainly 4.3).

Juergen

spring-projects-issues commented 8 years ago

Rohit Gupta commented

This functionality is really awesome if it comes with the framework as in our case, our architect kept on cursing Spring for delayed startup.

I just need to highlight one thing about parallel loading of beans: please figure out what happens if someone tries to use a bean early. In our case we have our own encryption mechanism, which gets initialized early with the ROOT application context through a component. Once initialized, that component initializes an AWS component that connects to S3, downloads properties, decrypts them and provides them to the application. They are resolved by PropertyPlaceholderConfigurer and later overridden at the instance level.

My only fear is that parallel initialization might provide wrong properties to @Value-resolved variables. It should also take care of the case where two beans are interdependent, such as two services that depend on each other during lazy initialization.

The DAG approach won't work: if you ask people to generate the DAG, in most cases they won't be able to do that perfectly and will later curse the framework. If the framework does that out of the box, then it will be great.

spring-projects-issues commented 8 years ago

Vladislav Kaverin commented

This issue is one of the most voted of unresolved ones in the Spring Framework JIRA project and is desired by various Spring-users for 5 years already (starting on the ticket creation date). Doesn't it deserve a priority higher than Minor? ;)

spring-projects-issues commented 8 years ago

Juergen Hoeller commented

Alright, bumping this one to "Major", keeping it in the 5.0 backlog.

Please note that we've been shipping 4.3 with #18305 included already. From our perspective, this addresses the most common case - expensive persistence provider bootstrapping - in a reasonably straightforward way.

spring-projects-issues commented 8 years ago

Vladislav Kaverin commented

@juergen.hoeller, please add performance and startup labels to the ticket, it would help tracking such issues.

spring-projects-issues commented 6 years ago

Abhijit Sarkar commented

With 61 votes, and 6.5 years after it was opened, doesn't this ticket deserve more attention than it is getting (which is zero; the last comment was a year ago)?

spring-projects-issues commented 6 years ago

Filip Panovski commented

Has there been any progress on this issue? Is this planned for any release's roadmap as of right now?

spring-projects-issues commented 5 years ago

Serdar Osman Onur commented

Voted & Watching

guanchao-yang commented 5 years ago

Too much expectation!

yiliaofan commented 4 years ago

Hi, has this problem been solved? @spring-issuemaster thanks~ It takes 2 minutes for our database resources to load; the startup time is too long. It takes 5 ~ 6 minutes to start the entire application.

Is there any new progress? Thank you!

fzyzcjy commented 4 years ago

Hi, are there any updates...? It has been 9 years :( Spring is sooooo powerful and widely used, so IMHO this feature can make the lives of thousands (if not millions) of developers and ops people better!

naakax commented 4 years ago

please do this issue, it's really important!!

guanchao-yang commented 3 years ago

The parallel bean initialization issue has been open for 10 years!!! Hoping this General Backlog item can be released as quickly as possible!!! 👍 💯

ofdata commented 3 years ago

In cloud-native environments, parallel bean initialization is important; we need fast startup.

maciekdragan commented 3 years ago

The issue is 10 years old now. Happy birthday!!! 🎂 🎂 🎂

mazingcai commented 2 years ago

Watching

abbothzhang commented 2 years ago

Happy 11th birthday!

hehuang139 commented 2 years ago

Oh! Is there any progress on this issue? It's so important.

mangusbrother commented 2 years ago

Something like this is what is causing us to move to frameworks like Quarkus, which give priority to startup time.

This would speed up the really slow startup of spring at a bare minimum.

alwinlin23 commented 2 years ago

Any happy news?

bclozel commented 2 years ago

This issue is not scheduled for the upcoming 6.0 version, as it remains in the general backlog for now.

While working on real startup cases, we've often found that time is spent in a handful of specific beans that depend on each other. Introducing parallelization there would not save much in many cases as this doesn't speed up the critical path. This is often related to ORM setup and database migrations.

You can collect more information about this for your own application using the application startup tracking: you should see where and how startup time is being spent and whether parallelization would improve the situation.

For Spring Framework 6.0, we are focusing on Ahead Of Time features for both native use cases as well as startup time improvements.

dikaewiwurae7 commented 2 years ago

today is 2022.11.18

singasong1995 commented 1 year ago

Marked.

biuabiu commented 1 year ago

I hope the spring team can take this issue seriously. My current project has an average startup time of over 5 minutes

bclozel commented 1 year ago

@biuabiu it's very unlikely that this feature would help in your case. A single component is probably responsible for most of the startup time.

Can you try measuring startup time with the tools we provide? See https://docs.spring.io/spring-framework/reference/core/beans/context-introduction.html#context-functionality-startup if your app is plain Spring or https://docs.spring.io/spring-boot/docs/current/reference/html/features.html#features.spring-application.startup-tracking if this is a Spring Boot app. You can send the data our way if you need help analyzing it.

biuabiu commented 1 year ago

Thanks for the reply. In our project, some beans run initialization tasks, but those beans have dependencies on other beans.

nicolasmafraintelipost commented 1 year ago

Could the bean initialization approach be delegated to a library? In that case, most Spring applications would continue to use the same initialization approach. Those who want parallel bean initialization could include the library to override the behavior.

jhoeller commented 9 months ago

Following up on the locking revision in #23501, we are introducing a backgroundInit flag on AbstractBeanDefinition and a corresponding @Bean(bootstrap=BACKGROUND) enum. This is still a bit hot from the oven, a first cut to be committed soon.

This new setting allows for singling out specific beans for background initialization, covering the entire getBean step for each such bean in preInstantiateSingletons. The corresponding Future is stored internally so that dependent beans automatically wait for the bean instance to be completed. Also, all regular background initializations are forced to complete at the end of preInstantiateSingletons; only for beans additionally marked as @Lazy, the completion is allowed to happen later (up until first actual access).

Note that this typically goes together with @Lazy (or ObjectProvider<...>) injection points, otherwise the main bootstrap thread is going to block when a background-initialized bean needs to be injected. Also, background initialization applies to individual beans: If such a bean depends on other beans, they need to have been initialized already, either simply through being declared earlier or through @DependsOn which is going to enforce initialization in the main bootstrap thread before background initialization for the affected bean is triggered.

Last but not least, a bootstrapExecutor needs to be specified on DefaultListableBeanFactory for this to be actually active. In an ApplicationContext, a bean of that name and of type Executor will be automatically detected. I suppose that Spring Boot will set a corresponding alias for its application task executor by default, but there is also room for a custom executor with a maximum concurrency setting for bootstrapping to be configured - independent from a larger-scale thread pool for other purposes in the same application.
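The mechanics described in this comment can be modelled with a few lines of plain JDK code. The sketch below is not the DefaultListableBeanFactory implementation: `BackgroundInitModel` and `register` are hypothetical, and only the names `getBean`, `preInstantiateSingletons` and `bootstrapExecutor` are borrowed from the comment. It assumes, as the comment requires, that anything a background bean depends on is declared earlier, and it leaves out the @Lazy case where completion may happen after startup.

```java
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;
import java.util.concurrent.Callable;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Future;
import java.util.function.Supplier;

/** Hypothetical model of opt-in background initialization; not Spring code. */
final class BackgroundInitModel {

    private final Map<String, Supplier<Object>> definitions = new LinkedHashMap<>();
    private final Set<String> backgroundInit = new HashSet<>();
    private final Map<String, Future<Object>> pending = new ConcurrentHashMap<>();
    private final Map<String, Object> singletons = new ConcurrentHashMap<>();
    private final ExecutorService bootstrapExecutor;

    BackgroundInitModel(ExecutorService bootstrapExecutor) {
        this.bootstrapExecutor = bootstrapExecutor;
    }

    void register(String name, boolean background, Supplier<Object> factory) {
        definitions.put(name, factory);
        if (background) {
            backgroundInit.add(name);
        }
    }

    /** Dependents calling this wait automatically for a background bean. */
    Object getBean(String name) {
        Object existing = singletons.get(name);
        if (existing != null) {
            return existing;
        }
        Future<Object> future = pending.get(name);
        return (future != null) ? await(name, future) : create(name);
    }

    void preInstantiateSingletons() {
        for (String name : definitions.keySet()) {
            if (backgroundInit.contains(name)) {
                Callable<Object> task = () -> create(name);
                pending.put(name, bootstrapExecutor.submit(task)); // keep the Future
            } else {
                getBean(name); // regular beans: main bootstrap thread
            }
        }
        // Force all background initializations to complete before returning.
        pending.forEach(BackgroundInitModel::await);
    }

    private Object create(String name) {
        Object bean = definitions.get(name).get();
        singletons.put(name, bean);
        return bean;
    }

    private static Object await(String name, Future<Object> future) {
        try {
            return future.get();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("Interrupted while waiting for " + name, e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("Background init failed for " + name, e.getCause());
        }
    }
}
```

In this model a regular bean that asks for a background bean during its own creation blocks the main thread until the Future completes, which is the situation the comment suggests avoiding with @Lazy or ObjectProvider injection points.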