xenit-eu / dynamic-extensions-for-alfresco

Rapid development of Alfresco repository extensions in Java. Deploy your code in seconds, not minutes. Life is too short for endless server restarts.
Apache License 2.0
76 stars 31 forks source link

Possible failure to start Alfresco node in a clustered environment #337

Open kpiot123 opened 2 years ago

kpiot123 commented 2 years ago

In a clustered environment it is possible for some Alfresco nodes to fail to start due to an exception related to dynamic-extensions. It happens when the osgi container module component is executed more than once while starting Alfresco. The reason why it is started multiple times is unknown.

This issue only seems to occur when all the cluster nodes are started at the same time. If the cluster nodes are start one by one with a short pause in between the issue does not occur.

I'm submitting a ... (check one with "x")

[x] bug report
[ ] feature request
[ ] question

Expected Behavior

The osgi container module component is only executed once on start-up of Alfresco.

Current Behavior

Sometimes an exception occurs on start-up of Alfresco:

2021-10-26 08:33:22,729  INFO  [repo.module.ModuleServiceImpl] [localhost-startStop-1] Starting module 'alfresco-dynamic-extensions-repo' version 2.0.5.
 2021-10-26 08:33:22,743  ERROR [web.context.ContextLoader] [localhost-startStop-1] Context initialization failed
 org.alfresco.error.AlfrescoRuntimeException: 09260034 Failed to start modules
    at org.alfresco.repo.module.ModuleComponentHelper$1.doWork(ModuleComponentHelper.java:326)
    at org.alfresco.repo.security.authentication.AuthenticationUtil.runAs(AuthenticationUtil.java:602)
    at org.alfresco.repo.module.ModuleComponentHelper.startModules(ModuleComponentHelper.java:213)
    at org.alfresco.repo.module.ModuleServiceImpl.startModules(ModuleServiceImpl.java:149)
    at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.base/java.lang.reflect.Method.invoke(Method.java:566)
    at org.springframework.aop.support.AopUtils.invokeJoinpointUsingReflection(AopUtils.java:343)
    at org.springframework.aop.framework.ReflectiveMethodInvocation.invokeJoinpoint(ReflectiveMethodInvocation.java:198)
    at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:163)
    at org.springframework.transaction.interceptor.TransactionAspectSupport.invokeWithinTransaction(TransactionAspectSupport.java:295)
    at org.springframework.transaction.interceptor.TransactionInterceptor.invoke(TransactionInterceptor.java:98)
    at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:186)
    at org.springframework.aop.framework.JdkDynamicAopProxy.invoke(JdkDynamicAopProxy.java:212)
    at com.sun.proxy.$Proxy151.startModules(Unknown Source)
    at org.alfresco.repo.module.ModuleStarter$1.execute(ModuleStarter.java:73)
    at org.alfresco.repo.transaction.RetryingTransactionHelper.doInTransaction(RetryingTransactionHelper.java:450)
    at org.alfresco.repo.transaction.RetryingTransactionHelper.doInTransaction(RetryingTransactionHelper.java:338)
    at org.alfresco.repo.module.ModuleStarter$2.doWork(ModuleStarter.java:83)
    at org.alfresco.repo.security.authentication.AuthenticationUtil.runAs(AuthenticationUtil.java:602)
    at org.alfresco.repo.module.ModuleStarter.onBootstrap(ModuleStarter.java:78)
    at org.springframework.extensions.surf.util.AbstractLifecycleBean.onApplicationEvent(AbstractLifecycleBean.java:56)
    at org.alfresco.repo.management.SafeApplicationEventMulticaster.multicastEventInternal(SafeApplicationEventMulticaster.java:221)
    at org.alfresco.repo.management.SafeApplicationEventMulticaster.multicastEvent(SafeApplicationEventMulticaster.java:186)
    at org.alfresco.repo.management.SafeApplicationEventMulticaster.multicastEvent(SafeApplicationEventMulticaster.java:206)
    at org.springframework.context.support.AbstractApplicationContext.publishEvent(AbstractApplicationContext.java:402)
    at org.springframework.context.support.AbstractApplicationContext.publishEvent(AbstractApplicationContext.java:359)
    at org.springframework.context.support.AbstractApplicationContext.finishRefresh(AbstractApplicationContext.java:896)
    at org.springframework.context.support.AbstractApplicationContext.refresh(AbstractApplicationContext.java:552)
    at org.springframework.web.context.ContextLoader.configureAndRefreshWebApplicationContext(ContextLoader.java:400)
    at org.springframework.web.context.ContextLoader.initWebApplicationContext(ContextLoader.java:291)
    at org.springframework.web.context.ContextLoaderListener.contextInitialized(ContextLoaderListener.java:103)
    at org.alfresco.web.app.ContextLoaderListener.contextInitialized(ContextLoaderListener.java:70)
    at org.apache.catalina.core.StandardContext.listenerStart(StandardContext.java:4792)
    at org.apache.catalina.core.StandardContext.startInternal(StandardContext.java:5256)
    at org.apache.catalina.util.LifecycleBase.start(LifecycleBase.java:150)
    at org.apache.catalina.core.ContainerBase.addChildInternal(ContainerBase.java:754)
    at org.apache.catalina.core.ContainerBase.addChild(ContainerBase.java:730)
    at org.apache.catalina.core.StandardHost.addChild(StandardHost.java:734)
    at org.apache.catalina.startup.HostConfig.deployWAR(HostConfig.java:985)
    at org.apache.catalina.startup.HostConfig$DeployWar.run(HostConfig.java:1857)
    at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515)
    at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
    at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
    at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
    at java.base/java.lang.Thread.run(Thread.java:834)
Caused by: org.alfresco.error.AlfrescoRuntimeException: 09260033 The module component has already been executed: alfresco-dynamic-extensions-repo.osgi.container.ModuleComponent
    at org.alfresco.error.AlfrescoRuntimeException.create(AlfrescoRuntimeException.java:52)
    at org.alfresco.repo.module.AbstractModuleComponent.execute(AbstractModuleComponent.java:354)
    at org.alfresco.repo.module.ModuleComponentHelper.executeComponent(ModuleComponentHelper.java:736)
    at org.alfresco.repo.module.ModuleComponentHelper.startModule(ModuleComponentHelper.java:647)
    at org.alfresco.repo.module.ModuleComponentHelper.access$5(ModuleComponentHelper.java:530)
    at org.alfresco.repo.module.ModuleComponentHelper$1$1.execute(ModuleComponentHelper.java:263)
    at org.alfresco.repo.transaction.RetryingTransactionHelper.doInTransaction(RetryingTransactionHelper.java:450)
    at org.alfresco.repo.transaction.RetryingTransactionHelper.doInTransaction(RetryingTransactionHelper.java:338)
    at org.alfresco.repo.module.ModuleComponentHelper$1.doWork(ModuleComponentHelper.java:284)

At that point the osgi container module component is about to be executed a second time.

Possible Solution

There is no solution yet. The only work-around is to start the cluster nodes one by one with a short pause in between.

Steps to Reproduce (for bugs)

The issue can't consistently be reproduced. It only happens sporadically

Your Environment

Alfresco log

catalina.log

todorinskiz commented 2 years ago

I don't believe this is a DE issue, rather a bug in Alfresco?

2022-01-27 22:02:12,331 INFO  [org.alfresco.repo.admin] [localhost-startStop-1] Using database URL 'jdbc:mysql://alf-db-nl/alfresco?useUnicode=yes&characterEncoding=UTF-8' with user 'alfresco'.
2022-01-27 22:02:13,521 INFO  [org.alfresco.repo.admin] [localhost-startStop-1] Connected to database MySQL version 5.6.25-log
2022-01-27 22:02:20,540 INFO  [org.alfresco.repo.management.subsystems.ChildApplicationContextFactory] [localhost-startStop-1] Starting 'sysAdmin' subsystem, ID: [sysAdmin, default]
2022-01-27 22:02:20,573 INFO  [org.alfresco.repo.management.subsystems.ChildApplicationContextFactory] [localhost-startStop-1] Startup of 'sysAdmin' subsystem, ID: [sysAdmin, default] complete
2022-01-27 22:02:30,801 INFO  [org.springframework.extensions.webscripts.TemplateProcessorRegistry] [localhost-startStop-1] Registered template processor Repository Template Processor for extension ftl
2022-01-27 22:02:30,810 INFO  [org.springframework.extensions.webscripts.ScriptProcessorRegistry] [localhost-startStop-1] Registered script processor Repository Script Processor for extension js
2022-01-27 22:02:37,674 INFO  [org.alfresco.repo.domain.schema.SchemaBootstrap] [localhost-startStop-1] Connecting to database: jdbc:mysql://database/alfresco?useUnicode=yes&characterEncoding=UTF-8, UserName=alfresco@someapi, MySQL Connector Java
2022-01-27 22:02:37,674 INFO  [org.alfresco.repo.domain.schema.SchemaBootstrap] [localhost-startStop-1] Schema managed by database dialect org.hibernate.dialect.MySQLInnoDBDialect.
2022-01-27 22:02:39,402 INFO  [org.alfresco.repo.domain.schema.SchemaBootstrap] [localhost-startStop-1] No changes were made to the schema.
2022-01-27 22:02:42,589 INFO  [org.alfresco.repo.management.subsystems.ChildApplicationContextFactory] [localhost-startStop-1] Starting 'Authentication' subsystem, ID: [Authentication, managed, kerberos1]
2022-01-27 22:02:43,387 INFO  [org.alfresco.repo.management.subsystems.ChildApplicationContextFactory] [localhost-startStop-1] Startup of 'Authentication' subsystem, ID: [Authentication, managed, kerberos1] complete
2022-01-27 22:02:43,387 INFO  [org.alfresco.repo.management.subsystems.ChildApplicationContextFactory] [localhost-startStop-1] Starting 'Authentication' subsystem, ID: [Authentication, managed, ldap1]
2022-01-27 22:02:43,459 INFO  [org.alfresco.repo.management.subsystems.ChildApplicationContextFactory] [localhost-startStop-1] Startup of 'Authentication' subsystem, ID: [Authentication, managed, ldap1] complete
2022-01-27 22:02:46,336 INFO  [org.alfresco.enterprise.repo.sync.SyncAdminServiceImpl] [localhost-startStop-1] A key is provided for cloud sync
2022-01-27 22:02:46,463 INFO  [org.alfresco.repo.management.subsystems.ChildApplicationContextFactory] [localhost-startStop-1] Starting 'thirdparty' subsystem, ID: [thirdparty, default]
2022-01-27 22:02:46,748 INFO  [org.alfresco.repo.management.subsystems.ChildApplicationContextFactory] [localhost-startStop-1] Startup of 'thirdparty' subsystem, ID: [thirdparty, default] complete
2022-01-27 22:02:46,748 INFO  [org.alfresco.repo.management.subsystems.ChildApplicationContextFactory] [localhost-startStop-1] Starting 'OOoDirect' subsystem, ID: [OOoDirect, default]
2022-01-27 22:02:46,789 WARN  [org.alfresco.util.AbstractTriggerBean] [localhost-startStop-1] Job openOfficeConnectionTesterTrigger is not enabled
2022-01-27 22:02:46,884 INFO  [org.alfresco.repo.management.subsystems.ChildApplicationContextFactory] [localhost-startStop-1] Startup of 'OOoDirect' subsystem, ID: [OOoDirect, default] complete
2022-01-27 22:02:46,884 INFO  [org.alfresco.repo.management.subsystems.ChildApplicationContextFactory] [localhost-startStop-1] Starting 'OOoJodconverter' subsystem, ID: [OOoJodconverter, default]
2022-01-27 22:02:46,919 INFO  [org.alfresco.repo.management.subsystems.ChildApplicationContextFactory] [localhost-startStop-1] Startup of 'OOoJodconverter' subsystem, ID: [OOoJodconverter, default] complete
2022-01-27 22:02:49,099 INFO  [org.alfresco.enterprise.repo.cluster.core.ClusteringBootstrap] [localhost-startStop-1] Cluster started, name: MainRepository-e13ab8cf-f9ae-44e6-9a5d-3ef978abc557
2022-01-27 22:02:49,118 INFO  [org.alfresco.enterprise.repo.cluster.core.ClusteringBootstrap] [localhost-startStop-1] Current cluster members:
  172.xx.xx.xx:5701 (hostname: pvx9zzzzz.xy.gz.be)

2022-01-27 22:02:49,125 INFO  [org.alfresco.repo.admin.ConfigurationChecker] [localhost-startStop-1] The Alfresco root data directory ('dir.root') is: /opt/pv/data/tomcat/alfresco
2022-01-27 22:02:49,125 INFO  [org.alfresco.repo.management.subsystems.ChildApplicationContextFactory] [localhost-startStop-1] Starting 'Search' subsystem, ID: [Search, managed, solr]
2022-01-27 22:02:49,340 INFO  [org.alfresco.repo.management.subsystems.ChildApplicationContextFactory] [localhost-startStop-1] Startup of 'Search' subsystem, ID: [Search, managed, solr] complete
2022-01-27 22:02:49,440 INFO  [org.alfresco.repo.admin.patch.PatchExecuter] [localhost-startStop-1] Checking for patches to apply ...
2022-01-27 22:03:06,475 INFO  [org.alfresco.repo.admin.patch.PatchExecuter] [localhost-startStop-1] No patches were required.
2022-01-27 22:03:06,529 INFO  [org.alfresco.repo.module.ModuleServiceImpl] [localhost-startStop-1] Found 11 module(s).
2022-01-27 22:03:06,784 INFO  [org.alfresco.repo.module.ModuleServiceImpl] [localhost-startStop-1] Starting module 'system-webscripts' version 1.0.2.
2022-01-27 22:03:06,944 INFO  [org.alfresco.repo.module.ModuleServiceImpl] [localhost-startStop-1] Starting module 'support-tools' version 1.11.3.
2022-01-27 22:03:07,131 INFO  [org.alfresco.repo.module.ModuleServiceImpl] [localhost-startStop-1] Starting module 'solr_14_hotfix' version 1.0.0.
2022-01-27 22:03:07,331 INFO  [org.alfresco.repo.module.ModuleServiceImpl] [localhost-startStop-1] Starting module 'alfresco-dynamic-extensions-repo' version 1.7.4.
2022-01-27 22:04:30,270 INFO  [org.alfresco.repo.module.ModuleServiceImpl] [localhost-startStop-1] Found 11 module(s).
2022-01-27 22:04:30,495 INFO  [org.alfresco.repo.module.ModuleServiceImpl] [localhost-startStop-1] Starting module 'system-webscripts' version 1.0.2.
2022-01-27 22:04:30,709 INFO  [org.alfresco.repo.module.ModuleServiceImpl] [localhost-startStop-1] Starting module 'support-tools' version 1.11.3.
2022-01-27 22:04:30,908 INFO  [org.alfresco.repo.module.ModuleServiceImpl] [localhost-startStop-1] Starting module 'solr_14_hotfix' version 1.0.0.
2022-01-27 22:04:31,117 INFO  [org.alfresco.repo.module.ModuleServiceImpl] [localhost-startStop-1] Starting module 'alfresco-dynamic-extensions-repo' version 1.7.4.