apache / dolphinscheduler

Apache DolphinScheduler is the modern data orchestration platform. Agile to create high performance workflow with low-code
https://dolphinscheduler.apache.org/
Apache License 2.0
12.73k stars 4.58k forks

The worker-server fails to start, and a zookeeper connection error is reported when the worker-server starts. #16306

Closed pencoo closed 2 months ago

pencoo commented 2 months ago

What happened

DolphinScheduler version: 3.1.8; JDK version: 11.0.15.1 2022-04-22 LTS; ZooKeeper version: 3.9.0; MySQL version: 8.0.0. Five workers started without problems, but the sixth fails on startup with this error:

[INFO] 2024-06-19 14:51:25.683 +0800 org.apache.zookeeper.ClientCnxnSocket:[239] - [WorkflowInstance-0][TaskInstance-0] - jute.maxbuffer value is 1048575 Bytes
[INFO] 2024-06-19 14:51:25.705 +0800 org.apache.zookeeper.ClientCnxn:[1732] - [WorkflowInstance-0][TaskInstance-0] - zookeeper.request.timeout value is 0. feature enabled=false
[INFO] 2024-06-19 14:51:25.723 +0800 org.apache.curator.framework.imps.CuratorFrameworkImpl:[386] - [WorkflowInstance-0][TaskInstance-0] - Default schema
[INFO] 2024-06-19 14:51:26.326 +0800 org.apache.curator.framework.imps.CuratorFrameworkImpl:[998] - [WorkflowInstance-0][TaskInstance-0] - backgroundOperationsLoop exiting
[INFO] 2024-06-19 14:51:45.746 +0800 org.apache.zookeeper.ClientCnxn:[1171] - [WorkflowInstance-0][TaskInstance-0] - Opening socket connection to server 10.9.4.172/10.9.4.172:2181.
[INFO] 2024-06-19 14:51:45.746 +0800 org.apache.zookeeper.ClientCnxn:[1173] - [WorkflowInstance-0][TaskInstance-0] - SASL config status: Will not attempt to authenticate using SASL (unknown error)
[INFO] 2024-06-19 14:51:45.838 +0800 org.apache.zookeeper.ClientCnxn:[1005] - [WorkflowInstance-0][TaskInstance-0] - Socket connection established, initiating session, client: /10.3.1.211:52730, server: 10.9.4.172/10.9.4.172:2181
[INFO] 2024-06-19 14:51:45.901 +0800 org.apache.zookeeper.ClientCnxn:[1444] - [WorkflowInstance-0][TaskInstance-0] - Session establishment complete on server 10.9.4.172/10.9.4.172:2181, session id = 0x1182d4a7dc00037, negotiated timeout = 30000
[INFO] 2024-06-19 14:51:46.059 +0800 org.apache.zookeeper.ClientCnxn:[568] - [WorkflowInstance-0][TaskInstance-0] - EventThread shut down for session: 0x1182d4a7dc00037
[INFO] 2024-06-19 14:51:46.059 +0800 org.apache.zookeeper.ZooKeeper:[1232] - [WorkflowInstance-0][TaskInstance-0] - Session: 0x1182d4a7dc00037 closed
[WARN] 2024-06-19 14:51:46.062 +0800 org.springframework.boot.web.servlet.context.AnnotationConfigServletWebServerApplicationContext:[591] - [WorkflowInstance-0][TaskInstance-0] - Exception encountered during context initialization - cancelling refresh attempt: org.springframework.beans.factory.UnsatisfiedDependencyException: Error creating bean with name 'workerServer': Unsatisfied dependency expressed through field 'workerRegistryClient'; nested exception is org.springframework.beans.factory.UnsatisfiedDependencyException: Error creating bean with name 'workerRegistryClient': Unsatisfied dependency expressed through field 'registryClient'; nested exception is org.springframework.beans.factory.UnsatisfiedDependencyException: Error creating bean with name 'registryClient' defined in URL [jar:file:/usr/local/dolphinscheduler/worker-server/libs/dolphinscheduler-service-3.1.8.jar!/org/apache/dolphinscheduler/service/registry/RegistryClient.class]: Unsatisfied dependency expressed through constructor parameter 0; nested exception is org.springframework.beans.factory.BeanCreationException: Error creating bean with name 'zookeeperRegistry': Invocation of init method failed; nested exception is org.apache.dolphinscheduler.registry.api.RegistryException: zookeeper connect timeout: 10.9.4.172:2181
[WARN] 2024-06-19 14:51:46.063 +0800 org.apache.dolphinscheduler.service.alert.AlertClientService:[76] - [WorkflowInstance-0][TaskInstance-0] - Alert client is already closed
[WARN] 2024-06-19 14:51:46.066 +0800 org.springframework.beans.factory.support.DisposableBeanAdapter:[248] - [WorkflowInstance-0][TaskInstance-0] - Invocation of close method failed on bean with name 'springApplicationContext': org.springframework.beans.factory.BeanCreationNotAllowedException: Error creating bean with name 'applicationAvailability': Singleton bean creation not allowed while singletons of this factory are in destruction (Do not request a bean from a BeanFactory in a destroy method implementation!)
[INFO] 2024-06-19 14:51:46.073 +0800 org.eclipse.jetty.server.session:[149] - [WorkflowInstance-0][TaskInstance-0] - node0 Stopped scavenging
[INFO] 2024-06-19 14:51:46.077 +0800 org.eclipse.jetty.server.handler.ContextHandler:[1159] - [WorkflowInstance-0][TaskInstance-0] - Stopped o.s.b.w.e.j.JettyEmbeddedWebAppContext@4930213b{application,/,[file:///tmp/jetty-docbase.1235.11639236838567744432/],STOPPED}
[INFO] 2024-06-19 14:51:46.112 +0800 org.springframework.boot.autoconfigure.logging.ConditionEvaluationReportLoggingListener:[136] - [WorkflowInstance-0][TaskInstance-0] -

Error starting ApplicationContext. To display the conditions report re-run your application with 'debug' enabled.
[ERROR] 2024-06-19 14:51:46.166 +0800 org.springframework.boot.SpringApplication:[824] - [WorkflowInstance-0][TaskInstance-0] - Application run failed
org.springframework.beans.factory.UnsatisfiedDependencyException: Error creating bean with name 'workerServer': Unsatisfied dependency expressed through field 'workerRegistryClient'; nested exception is org.springframework.beans.factory.UnsatisfiedDependencyException: Error creating bean with name 'workerRegistryClient': Unsatisfied dependency expressed through field 'registryClient'; nested exception is org.springframework.beans.factory.UnsatisfiedDependencyException: Error creating bean with name 'registryClient' defined in URL [jar:file:/usr/local/dolphinscheduler/worker-server/libs/dolphinscheduler-service-3.1.8.jar!/org/apache/dolphinscheduler/service/registry/RegistryClient.class]: Unsatisfied dependency expressed through constructor parameter 0; nested exception is org.springframework.beans.factory.BeanCreationException: Error creating bean with name 'zookeeperRegistry': Invocation of init method failed; nested exception is org.apache.dolphinscheduler.registry.api.RegistryException: zookeeper connect timeout: 10.9.4.172:2181
    at org.springframework.beans.factory.annotation.AutowiredAnnotationBeanPostProcessor$AutowiredFieldElement.resolveFieldValue(AutowiredAnnotationBeanPostProcessor.java:659)
    at org.springframework.beans.factory.annotation.AutowiredAnnotationBeanPostProcessor$AutowiredFieldElement.inject(AutowiredAnnotationBeanPostProcessor.java:639)
    at org.springframework.beans.factory.annotation.InjectionMetadata.inject(InjectionMetadata.java:119)
    at org.springframework.beans.factory.annotation.AutowiredAnnotationBeanPostProcessor.postProcessProperties(AutowiredAnnotationBeanPostProcessor.java:399)
    at org.springframework.beans.factory.support.AbstractAutowireCapableBeanFactory.populateBean(AbstractAutowireCapableBeanFactory.java:1431)
    at org.springframework.beans.factory.support.AbstractAutowireCapableBeanFactory.doCreateBean(AbstractAutowireCapableBeanFactory.java:619)
    at org.springframework.beans.factory.support.AbstractAutowireCapableBeanFactory.createBean(AbstractAutowireCapableBeanFactory.java:542)
    at org.springframework.beans.factory.support.AbstractBeanFactory.lambda$doGetBean$0(AbstractBeanFactory.java:335)
    at org.springframework.beans.factory.support.DefaultSingletonBeanRegistry.getSingleton(DefaultSingletonBeanRegistry.java:234)
    at org.springframework.beans.factory.support.AbstractBeanFactory.doGetBean(AbstractBeanFactory.java:333)
    at org.springframework.beans.factory.support.AbstractBeanFactory.getBean(AbstractBeanFactory.java:208)
    at org.springframework.beans.factory.support.DefaultListableBeanFactory.preInstantiateSingletons(DefaultListableBeanFactory.java:953)
    at org.springframework.context.support.AbstractApplicationContext.finishBeanFactoryInitialization(AbstractApplicationContext.java:918)
    at org.springframework.context.support.AbstractApplicationContext.refresh(AbstractApplicationContext.java:583)
    at org.springframework.boot.web.servlet.context.ServletWebServerApplicationContext.refresh(ServletWebServerApplicationContext.java:147)
    at org.springframework.boot.SpringApplication.refresh(SpringApplication.java:734)
    at org.springframework.boot.SpringApplication.refreshContext(SpringApplication.java:408)
    at org.springframework.boot.SpringApplication.run(SpringApplication.java:308)
    at org.springframework.boot.SpringApplication.run(SpringApplication.java:1306)
    at org.springframework.boot.SpringApplication.run(SpringApplication.java:1295)
    at org.apache.dolphinscheduler.server.worker.WorkerServer.main(WorkerServer.java:110)
Caused by: org.springframework.beans.factory.UnsatisfiedDependencyException: Error creating bean with name 'workerRegistryClient': Unsatisfied dependency expressed through field 'registryClient'; nested exception is org.springframework.beans.factory.UnsatisfiedDependencyException: Error creating bean with name 'registryClient' defined in URL [jar:file:/usr/local/dolphinscheduler/worker-server/libs/dolphinscheduler-service-3.1.8.jar!/org/apache/dolphinscheduler/service/registry/RegistryClient.class]: Unsatisfied dependency expressed through constructor parameter 0; nested exception is org.springframework.beans.factory.BeanCreationException: Error creating bean with name 'zookeeperRegistry': Invocation of init method failed; nested exception is org.apache.dolphinsched

What you expected to happen

I am not sure whether this is caused by the ZooKeeper version being too new or by cross-network latency.
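
One way to check the latency hypothesis against the log above is to look at the pauses between consecutive entries. The following is only a rough sketch and not part of DolphinScheduler: `STAMP`, `SAMPLE`, and `largest_gap` are hypothetical names, and the four sample lines are copied from the log in this report.

```python
import re
from datetime import datetime

# Matches the "[LEVEL] yyyy-MM-dd HH:mm:ss.SSS" prefix used in the worker log.
STAMP = re.compile(r"^\[\w+\] (\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d{3}) ")

# Four entries copied from the log in this report.
SAMPLE = [
    "[INFO] 2024-06-19 14:51:25.723 +0800 org.apache.curator.framework.imps.CuratorFrameworkImpl:[386] - [WorkflowInstance-0][TaskInstance-0] - Default schema",
    "[INFO] 2024-06-19 14:51:26.326 +0800 org.apache.curator.framework.imps.CuratorFrameworkImpl:[998] - [WorkflowInstance-0][TaskInstance-0] - backgroundOperationsLoop exiting",
    "[INFO] 2024-06-19 14:51:45.746 +0800 org.apache.zookeeper.ClientCnxn:[1171] - [WorkflowInstance-0][TaskInstance-0] - Opening socket connection to server 10.9.4.172/10.9.4.172:2181.",
    "[INFO] 2024-06-19 14:51:45.901 +0800 org.apache.zookeeper.ClientCnxn:[1444] - [WorkflowInstance-0][TaskInstance-0] - Session establishment complete on server 10.9.4.172/10.9.4.172:2181, session id = 0x1182d4a7dc00037, negotiated timeout = 30000",
]


def largest_gap(lines):
    """Return (seconds, line) for the entry preceded by the longest pause."""
    best = (0.0, None)
    prev = None
    for line in lines:
        match = STAMP.match(line)
        if match is None:
            continue  # stack-trace lines and blanks carry no timestamp
        stamp = datetime.strptime(match.group(1), "%Y-%m-%d %H:%M:%S.%f")
        if prev is not None:
            gap = (stamp - prev).total_seconds()
            if gap > best[0]:
                best = (gap, line)
        prev = stamp
    return best


seconds, line = largest_gap(SAMPLE)
print(f"{seconds:.3f}s before: {line}")
```

On these four lines the longest pause is 19.420 s, between Curator's `backgroundOperationsLoop exiting` entry and the first `Opening socket connection` entry. A pause of that size is worth comparing against whatever registry timeouts are configured on the worker.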

How to reproduce

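The newly added node is not on the same network as the original cluster; the two networks are connected over a dedicated line.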
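
Since the new node reaches ZooKeeper across a dedicated line, it may help to time a plain TCP connect from that node before tuning anything. This is a minimal sketch using only the Python standard library; `tcp_connect_seconds` is a hypothetical helper, and it measures only the TCP handshake, not ZooKeeper session setup.

```python
import socket
import time


def tcp_connect_seconds(host, port, timeout=30.0):
    """Return how long a TCP connect to host:port takes, in seconds.

    Raises OSError (including socket.timeout) if the connect fails.
    """
    start = time.monotonic()
    with socket.create_connection((host, port), timeout=timeout):
        return time.monotonic() - start


# Example, to be run on the new worker node (address taken from the log):
#   print(tcp_connect_seconds("10.9.4.172", 2181))
```

Running the same probe from one of the five working nodes gives a baseline to compare against.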

Anything else

No response

Version

3.1.x

ruanwenjun commented 2 months ago

You need to increase the connection-timeout setting:

registry:
  type: zookeeper
  zookeeper:
    namespace: dolphinscheduler
    connect-string: localhost:2181
    retry-policy:
      base-sleep-time: 1s
      max-sleep: 3s
      max-retries: 5
    session-timeout: 60s
    connection-timeout: 15s
    block-until-connected: 15s
    digest: ~
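
To get a feel for what the retry-policy values above allow, here is a small sketch. It assumes the three values feed a capped, randomized exponential backoff of the kind Curator's ExponentialBackoffRetry uses, where the sleep before retry n is at most base-sleep-time × (2^(n+1) − 1), capped at max-sleep. That mapping is an assumption, and `backoff_upper_bounds` is a hypothetical helper, not a DolphinScheduler or Curator function.

```python
def backoff_upper_bounds(base_sleep_s, max_sleep_s, max_retries):
    """Worst-case sleep (seconds) before each retry under the assumed model."""
    bounds = []
    for retry in range(max_retries):
        # Assumed model: the random multiplier is at most 2**(retry + 1) - 1.
        worst = base_sleep_s * (2 ** (retry + 1) - 1)
        # The sleep is never allowed to exceed max-sleep.
        bounds.append(min(worst, max_sleep_s))
    return bounds


# Values from the YAML above: base-sleep-time 1s, max-sleep 3s, max-retries 5.
bounds = backoff_upper_bounds(1.0, 3.0, 5)
print(bounds, sum(bounds))
```

Under that assumption the five retries sleep for at most 13 s in total with the values shown; actual sleeps are randomized and can be shorter.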