Closed by BigeYoung 4 years ago
@BigeYoung thanks for reporting this. Can you please try something very simple: don't call the release "zeebe"; call it something like "my-zeebe". I have a feeling that naming it "zeebe" causes an issue; we can check that once you confirm that it works.
Also check that you are not out of resources in your cluster.
Can you please check those two things and get back to us?
Thank you very much for your prompt reply. I changed the installation name to "myzeebe" according to your request, but unfortunately, the problem still exists. I installed zeebe with:
➜ ~ helm install myzeebe zeebe/zeebe-full
NAME: myzeebe
LAST DEPLOYED: Wed Oct 21 16:40:21 2020
NAMESPACE: default
STATUS: deployed
REVISION: 1
NOTES:
______ ______ ______ ______ ______
/\___ \ /\ ___\ /\ ___\ /\ == \ /\ ___\
\/_/ /__ \ \ __\ \ \ __\ \ \ __< \ \ __\
/\_____\ \ \_____\ \ \_____\ \ \_____\ \ \_____\
\/_____/ \/_____/ \/_____/ \/_____/ \/_____/
(zeebe-full - 0.0.107)
- Cluster Name: myzeebe-zeebe
As shown in the figure below, my nodes have plenty of free resources.
@BigeYoung thanks for trying that out. I will need to investigate. Can you please check the pod description to see why the pod is being killed? I am more interested in the Zeebe broker than in Operate.
Running kubectl describe pod ... should show you the events and why the pod was restarted.
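When the describe output is long, the fields that explain a restart are easy to miss. The following is a minimal sketch, not part of kubectl or the Helm chart, that pulls those fields out of pasted describe text. The function name and the choice of fields are assumptions made for illustration; the sample values are taken from the output posted in this thread.

```python
def summarize_container_state(describe_text: str) -> dict:
    """Collect the restart-related fields from `kubectl describe pod` text."""
    wanted = ("State", "Last State", "Reason", "Exit Code", "Restart Count")
    summary = {}
    for line in describe_text.splitlines():
        # Split on the first colon only, so timestamps in values stay intact.
        key, sep, value = line.strip().partition(":")
        if sep and key in wanted:
            # "Reason" appears under both State and Last State,
            # so keep every value in the order it was seen.
            summary.setdefault(key, []).append(value.strip())
    return summary


sample = """\
State: Waiting
Reason: CrashLoopBackOff
Last State: Terminated
Reason: Error
Exit Code: 137
Started: Thu, 22 Oct 2020 14:25:06 +0800
Restart Count: 222
"""

for field, values in summarize_container_state(sample).items():
    print(f"{field}: {', '.join(values)}")
```

The Events table at the bottom of the describe output is not parsed here; it still needs to be read by hand.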
I think what I posted at the beginning was exactly the broker log. But since you asked, I am happy to post the log again.
➜ ~ kubectl describe pod myzeebe-zeebe-2
Name: myzeebe-zeebe-2
Namespace: default
Priority: 0
Node: server4/192.168.137.124
Start Time: Wed, 21 Oct 2020 16:40:27 +0800
Labels: app.kubernetes.io/component=broker
app.kubernetes.io/instance=myzeebe
app.kubernetes.io/managed-by=Helm
app.kubernetes.io/name=zeebe
controller-revision-hash=myzeebe-zeebe-b8f64bbb8
statefulset.kubernetes.io/pod-name=myzeebe-zeebe-2
Annotations: <none>
Status: Running
IP: 10.244.1.139
IPs:
IP: 10.244.1.139
Controlled By: StatefulSet/myzeebe-zeebe
Containers:
zeebe:
Container ID: docker://140d8d7c2770d8942480ff25a2d099ad8933a562cb389822b019529df0ed8e45
Image: camunda/zeebe:0.24.2
Image ID: docker-pullable://camunda/zeebe@sha256:795ace31c498ad4bc37b7b0fab612307c34852f4187766e3f777a509821c9fb3
Ports: 9600/TCP, 26501/TCP, 26502/TCP
Host Ports: 0/TCP, 0/TCP, 0/TCP
State: Waiting
Reason: CrashLoopBackOff
Last State: Terminated
Reason: Error
Exit Code: 137
Started: Thu, 22 Oct 2020 14:25:06 +0800
Finished: Thu, 22 Oct 2020 14:26:01 +0800
Ready: False
Restart Count: 222
Limits:
cpu: 1
memory: 4Gi
Requests:
cpu: 500m
memory: 2Gi
Readiness: http-get http://:9600/ready delay=0s timeout=1s period=10s #success=1 #failure=3
Environment:
ZEEBE_BROKER_CLUSTER_CLUSTERNAME: myzeebe-zeebe
ZEEBE_LOG_LEVEL:
ZEEBE_BROKER_CLUSTER_PARTITIONSCOUNT: 3
ZEEBE_BROKER_CLUSTER_CLUSTERSIZE: 3
ZEEBE_BROKER_CLUSTER_REPLICATIONFACTOR: 3
ZEEBE_BROKER_THREADS_CPUTHREADCOUNT: 2
ZEEBE_BROKER_THREADS_IOTHREADCOUNT: 2
ZEEBE_BROKER_GATEWAY_ENABLE: false
ZEEBE_BROKER_EXPORTERS_ELASTICSEARCH_CLASSNAME: io.zeebe.exporter.ElasticsearchExporter
ZEEBE_BROKER_EXPORTERS_ELASTICSEARCH_ARGS_URL: http://elasticsearch-master:9200
ZEEBE_BROKER_NETWORK_COMMANDAPI_PORT: 26501
ZEEBE_BROKER_NETWORK_INTERNALAPI_PORT: 26502
ZEEBE_BROKER_NETWORK_MONITORINGAPI_PORT: 9600
K8S_POD_NAME: myzeebe-zeebe-2 (v1:metadata.name)
JAVA_TOOL_OPTIONS: -XX:MaxRAMPercentage=25.0 -XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=/usr/local/zeebe/data -XX:ErrorFile=/usr/local/zeebe/data/zeebe_error%p.log -XX:+ExitOnOutOfMemoryError
Mounts:
/exporters from exporters (rw)
/usr/local/bin/startup.sh from config (rw,path="startup.sh")
/usr/local/zeebe/config/application.yaml from config (rw,path="application.yaml")
/usr/local/zeebe/data from data (rw)
/var/run/secrets/kubernetes.io/serviceaccount from default-token-mt74h (ro)
Conditions:
Type Status
Initialized True
Ready False
ContainersReady False
PodScheduled True
Volumes:
data:
Type: PersistentVolumeClaim (a reference to a PersistentVolumeClaim in the same namespace)
ClaimName: data-myzeebe-zeebe-2
ReadOnly: false
config:
Type: ConfigMap (a volume populated by a ConfigMap)
Name: myzeebe
Optional: false
exporters:
Type: EmptyDir (a temporary directory that shares a pod's lifetime)
Medium:
SizeLimit: <unset>
default-token-mt74h:
Type: Secret (a volume populated by a Secret)
SecretName: default-token-mt74h
Optional: false
QoS Class: Burstable
Node-Selectors: <none>
Tolerations: node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Warning BackOff 2m54s (x5111 over 21h) kubelet Back-off restarting failed container
➜ ~ kubectl logs myzeebe-zeebe-2
++ hostname -f
+ export ZEEBE_BROKER_NETWORK_ADVERTISEDHOST=myzeebe-zeebe-2.myzeebe-zeebe.default.svc.cluster.local
+ ZEEBE_BROKER_NETWORK_ADVERTISEDHOST=myzeebe-zeebe-2.myzeebe-zeebe.default.svc.cluster.local
+ export ZEEBE_BROKER_CLUSTER_NODEID=2
+ ZEEBE_BROKER_CLUSTER_NODEID=2
+ export ZEEBE_BROKER_CLUSTER_CLUSTERSIZE=3
+ ZEEBE_BROKER_CLUSTER_CLUSTERSIZE=3
+ contactPointPrefix=myzeebe-zeebe
+ contactPoints=
+ [[ -z '' ]]
+ (( i=0 ))
+ (( i<3 ))
++ hostname -d
+ contactPoints=,myzeebe-zeebe-0.myzeebe-zeebe.default.svc.cluster.local:26502
+ (( i++ ))
+ (( i<3 ))
++ hostname -d
+ contactPoints=,myzeebe-zeebe-0.myzeebe-zeebe.default.svc.cluster.local:26502,myzeebe-zeebe-1.myzeebe-zeebe.default.svc.cluster.local:26502
+ (( i++ ))
+ (( i<3 ))
++ hostname -d
+ contactPoints=,myzeebe-zeebe-0.myzeebe-zeebe.default.svc.cluster.local:26502,myzeebe-zeebe-1.myzeebe-zeebe.default.svc.cluster.local:26502,myzeebe-zeebe-2.myzeebe-zeebe.default.svc.cluster.local:26502
+ (( i++ ))
+ (( i<3 ))
+ export ZEEBE_BROKER_CLUSTER_INITIALCONTACTPOINTS=,myzeebe-zeebe-0.myzeebe-zeebe.default.svc.cluster.local:26502,myzeebe-zeebe-1.myzeebe-zeebe.default.svc.cluster.local:26502,myzeebe-zeebe-2.myzeebe-zeebe.default.svc.cluster.local:26502
+ ZEEBE_BROKER_CLUSTER_INITIALCONTACTPOINTS=,myzeebe-zeebe-0.myzeebe-zeebe.default.svc.cluster.local:26502,myzeebe-zeebe-1.myzeebe-zeebe.default.svc.cluster.local:26502,myzeebe-zeebe-2.myzeebe-zeebe.default.svc.cluster.local:26502
++ ls -A /exporters/
No exporters available.
+ '[' '' ']'
+ echo 'No exporters available.'
+ exec /usr/local/zeebe/bin/broker
Picked up JAVA_TOOL_OPTIONS: -XX:MaxRAMPercentage=25.0 -XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=/usr/local/zeebe/data -XX:ErrorFile=/usr/local/zeebe/data/zeebe_error%p.log -XX:+ExitOnOutOfMemoryError
2020-10-22 06:25:10,209 main WARN Error while converting string [] to type [class org.apache.logging.log4j.Level]. Using default value [null]. java.lang.IllegalArgumentException: Unknown level constant [].
at org.apache.logging.log4j.Level.valueOf(Level.java:320)
at org.apache.logging.log4j.core.config.plugins.convert.TypeConverters$LevelConverter.convert(TypeConverters.java:288)
at org.apache.logging.log4j.core.config.plugins.convert.TypeConverters$LevelConverter.convert(TypeConverters.java:284)
at org.apache.logging.log4j.core.config.plugins.convert.TypeConverters.convert(TypeConverters.java:419)
at org.apache.logging.log4j.core.config.plugins.visitors.AbstractPluginVisitor.convert(AbstractPluginVisitor.java:149)
at org.apache.logging.log4j.core.config.plugins.visitors.PluginAttributeVisitor.visit(PluginAttributeVisitor.java:45)
at org.apache.logging.log4j.core.config.plugins.util.PluginBuilder.generateParameters(PluginBuilder.java:258)
at org.apache.logging.log4j.core.config.plugins.util.PluginBuilder.build(PluginBuilder.java:135)
at org.apache.logging.log4j.core.config.AbstractConfiguration.createPluginObject(AbstractConfiguration.java:1002)
at org.apache.logging.log4j.core.config.AbstractConfiguration.createConfiguration(AbstractConfiguration.java:942)
at org.apache.logging.log4j.core.config.AbstractConfiguration.createConfiguration(AbstractConfiguration.java:934)
at org.apache.logging.log4j.core.config.AbstractConfiguration.doConfigure(AbstractConfiguration.java:552)
at org.apache.logging.log4j.core.config.AbstractConfiguration.initialize(AbstractConfiguration.java:241)
at org.apache.logging.log4j.core.config.AbstractConfiguration.start(AbstractConfiguration.java:288)
at org.apache.logging.log4j.core.LoggerContext.setConfiguration(LoggerContext.java:618)
at org.apache.logging.log4j.core.LoggerContext.reconfigure(LoggerContext.java:691)
at org.apache.logging.log4j.core.LoggerContext.reconfigure(LoggerContext.java:708)
at org.apache.logging.log4j.core.LoggerContext.start(LoggerContext.java:263)
at org.apache.logging.log4j.core.impl.Log4jContextFactory.getContext(Log4jContextFactory.java:153)
at org.apache.logging.log4j.core.impl.Log4jContextFactory.getContext(Log4jContextFactory.java:45)
at org.apache.logging.log4j.LogManager.getContext(LogManager.java:194)
at org.apache.commons.logging.LogAdapter$Log4jLog.<clinit>(LogAdapter.java:155)
at org.apache.commons.logging.LogAdapter$Log4jAdapter.createLog(LogAdapter.java:122)
at org.apache.commons.logging.LogAdapter.createLog(LogAdapter.java:89)
at org.apache.commons.logging.LogFactoryService.getInstance(LogFactoryService.java:46)
at org.apache.commons.logging.LogFactoryService.getInstance(LogFactoryService.java:41)
at org.apache.commons.logging.LogFactory.getLog(LogFactory.java:655)
at org.springframework.boot.SpringApplication.<clinit>(SpringApplication.java:196)
at io.zeebe.broker.StandaloneBroker.main(StandaloneBroker.java:52)
2020-10-22 06:25:12,225 main WARN Error while converting string [] to type [class org.apache.logging.log4j.Level]. Using default value [null]. java.lang.IllegalArgumentException: Unknown level constant [].
at org.apache.logging.log4j.Level.valueOf(Level.java:320)
at org.apache.logging.log4j.core.config.plugins.convert.TypeConverters$LevelConverter.convert(TypeConverters.java:288)
at org.apache.logging.log4j.core.config.plugins.convert.TypeConverters$LevelConverter.convert(TypeConverters.java:284)
at org.apache.logging.log4j.core.config.plugins.convert.TypeConverters.convert(TypeConverters.java:419)
at org.apache.logging.log4j.core.config.plugins.visitors.AbstractPluginVisitor.convert(AbstractPluginVisitor.java:149)
at org.apache.logging.log4j.core.config.plugins.visitors.PluginAttributeVisitor.visit(PluginAttributeVisitor.java:45)
at org.apache.logging.log4j.core.config.plugins.util.PluginBuilder.generateParameters(PluginBuilder.java:258)
at org.apache.logging.log4j.core.config.plugins.util.PluginBuilder.build(PluginBuilder.java:135)
at org.apache.logging.log4j.core.config.AbstractConfiguration.createPluginObject(AbstractConfiguration.java:1002)
at org.apache.logging.log4j.core.config.AbstractConfiguration.createConfiguration(AbstractConfiguration.java:942)
at org.apache.logging.log4j.core.config.AbstractConfiguration.createConfiguration(AbstractConfiguration.java:934)
at org.apache.logging.log4j.core.config.AbstractConfiguration.doConfigure(AbstractConfiguration.java:552)
at org.apache.logging.log4j.core.config.AbstractConfiguration.initialize(AbstractConfiguration.java:241)
at org.apache.logging.log4j.core.config.AbstractConfiguration.start(AbstractConfiguration.java:288)
at org.apache.logging.log4j.core.LoggerContext.setConfiguration(LoggerContext.java:618)
at org.apache.logging.log4j.core.LoggerContext.reconfigure(LoggerContext.java:691)
at org.apache.logging.log4j.core.LoggerContext.reconfigure(LoggerContext.java:708)
at org.springframework.boot.logging.log4j2.Log4J2LoggingSystem.reinitialize(Log4J2LoggingSystem.java:204)
at org.springframework.boot.logging.AbstractLoggingSystem.initializeWithConventions(AbstractLoggingSystem.java:73)
at org.springframework.boot.logging.AbstractLoggingSystem.initialize(AbstractLoggingSystem.java:60)
at org.springframework.boot.logging.log4j2.Log4J2LoggingSystem.initialize(Log4J2LoggingSystem.java:160)
at org.springframework.boot.context.logging.LoggingApplicationListener.initializeSystem(LoggingApplicationListener.java:306)
at org.springframework.boot.context.logging.LoggingApplicationListener.initialize(LoggingApplicationListener.java:281)
at org.springframework.boot.context.logging.LoggingApplicationListener.onApplicationEnvironmentPreparedEvent(LoggingApplicationListener.java:239)
at org.springframework.boot.context.logging.LoggingApplicationListener.onApplicationEvent(LoggingApplicationListener.java:216)
at org.springframework.context.event.SimpleApplicationEventMulticaster.doInvokeListener(SimpleApplicationEventMulticaster.java:172)
at org.springframework.context.event.SimpleApplicationEventMulticaster.invokeListener(SimpleApplicationEventMulticaster.java:165)
at org.springframework.context.event.SimpleApplicationEventMulticaster.multicastEvent(SimpleApplicationEventMulticaster.java:139)
at org.springframework.context.event.SimpleApplicationEventMulticaster.multicastEvent(SimpleApplicationEventMulticaster.java:127)
at org.springframework.boot.context.event.EventPublishingRunListener.environmentPrepared(EventPublishingRunListener.java:80)
at org.springframework.boot.SpringApplicationRunListeners.environmentPrepared(SpringApplicationRunListeners.java:53)
at org.springframework.boot.SpringApplication.prepareEnvironment(SpringApplication.java:345)
at org.springframework.boot.SpringApplication.run(SpringApplication.java:308)
at org.springframework.boot.SpringApplication.run(SpringApplication.java:1237)
at org.springframework.boot.SpringApplication.run(SpringApplication.java:1226)
at io.zeebe.broker.StandaloneBroker.main(StandaloneBroker.java:52)
______ ______ ______ ____ ______ ____ _____ ____ _ __ ______ _____
|___ / | ____| | ____| | _ \ | ____| | _ \ | __ \ / __ \ | |/ / | ____| | __ \
/ / | |__ | |__ | |_) | | |__ | |_) | | |__) | | | | | | ' / | |__ | |__) |
/ / | __| | __| | _ < | __| | _ < | _ / | | | | | < | __| | _ /
/ /__ | |____ | |____ | |_) | | |____ | |_) | | | \ \ | |__| | | . \ | |____ | | \ \
/_____| |______| |______| |____/ |______| |____/ |_| \_\ \____/ |_|\_\ |______| |_| \_\
2020-10-22 06:25:12.507 [] [main] INFO io.zeebe.broker.StandaloneBroker - Starting StandaloneBroker v0.24.2 on myzeebe-zeebe-2 with PID 6 (/usr/local/zeebe/lib/zeebe-distribution-0.24.2.jar started by root in /usr/local/zeebe)
2020-10-22 06:25:12.517 [] [main] INFO io.zeebe.broker.StandaloneBroker - No active profile set, falling back to default profiles: default
2020-10-22 06:25:17.616 [] [main] INFO org.springframework.boot.web.embedded.tomcat.TomcatWebServer - Tomcat initialized with port(s): 9600 (http)
2020-10-22 06:25:17.702 [] [main] INFO org.apache.coyote.http11.Http11NioProtocol - Initializing ProtocolHandler ["http-nio-0.0.0.0-9600"]
2020-10-22 06:25:17.703 [] [main] INFO org.apache.catalina.core.StandardService - Starting service [Tomcat]
2020-10-22 06:25:17.704 [] [main] INFO org.apache.catalina.core.StandardEngine - Starting Servlet engine: [Apache Tomcat/9.0.36]
2020-10-22 06:25:18.029 [] [main] INFO org.apache.catalina.core.ContainerBase.[Tomcat].[localhost].[/] - Initializing Spring embedded WebApplicationContext
2020-10-22 06:25:18.029 [] [main] INFO org.springframework.boot.web.servlet.context.ServletWebServerApplicationContext - Root WebApplicationContext: initialization completed in 5330 ms
2020-10-22 06:25:19.596 [] [main] INFO org.springframework.scheduling.concurrent.ThreadPoolTaskExecutor - Initializing ExecutorService 'applicationTaskExecutor'
2020-10-22 06:25:20.433 [] [main] INFO org.springframework.boot.actuate.endpoint.web.EndpointLinksResolver - Exposing 2 endpoint(s) beneath base path '/actuator'
2020-10-22 06:25:20.525 [] [main] INFO org.apache.coyote.http11.Http11NioProtocol - Starting ProtocolHandler ["http-nio-0.0.0.0-9600"]
2020-10-22 06:25:20.703 [] [main] INFO org.springframework.boot.web.embedded.tomcat.TomcatWebServer - Tomcat started on port(s): 9600 (http) with context path ''
2020-10-22 06:25:20.736 [] [main] INFO io.zeebe.broker.StandaloneBroker - Started StandaloneBroker in 9.739 seconds (JVM running for 13.814)
2020-10-22 06:25:20.903 [] [main] INFO io.zeebe.broker.system - Version: 0.24.2
2020-10-22 06:25:21.023 [] [main] INFO io.zeebe.broker.system - Starting broker 2 with configuration {
"network" : {
"host" : "0.0.0.0",
"portOffset" : 0,
"maxMessageSize" : "4MB",
"advertisedHost" : "myzeebe-zeebe-2.myzeebe-zeebe.default.svc.cluster.local",
"commandApi" : {
"host" : "0.0.0.0",
"port" : 26501,
"advertisedHost" : "myzeebe-zeebe-2.myzeebe-zeebe.default.svc.cluster.local",
"advertisedPort" : 26501,
"advertisedAddress" : "myzeebe-zeebe-2.myzeebe-zeebe.default.svc.cluster.local:26501",
"address" : "0.0.0.0:26501"
},
"internalApi" : {
"host" : "0.0.0.0",
"port" : 26502,
"advertisedHost" : "myzeebe-zeebe-2.myzeebe-zeebe.default.svc.cluster.local",
"advertisedPort" : 26502,
"advertisedAddress" : "myzeebe-zeebe-2.myzeebe-zeebe.default.svc.cluster.local:26502",
"address" : "0.0.0.0:26502"
},
"monitoringApi" : {
"host" : "0.0.0.0",
"port" : 9600,
"advertisedHost" : "myzeebe-zeebe-2.myzeebe-zeebe.default.svc.cluster.local",
"advertisedPort" : 9600,
"advertisedAddress" : "myzeebe-zeebe-2.myzeebe-zeebe.default.svc.cluster.local:9600",
"address" : "0.0.0.0:9600"
},
"maxMessageSizeInBytes" : 4194304
},
"cluster" : {
"initialContactPoints" : [ "myzeebe-zeebe-0.myzeebe-zeebe.default.svc.cluster.local:26502", "myzeebe-zeebe-1.myzeebe-zeebe.default.svc.cluster.local:26502", "myzeebe-zeebe-2.myzeebe-zeebe.default.svc.cluster.local:26502" ],
"partitionIds" : [ 1, 2, 3 ],
"nodeId" : 2,
"partitionsCount" : 3,
"replicationFactor" : 3,
"clusterSize" : 3,
"clusterName" : "myzeebe-zeebe",
"membership" : {
"broadcastUpdates" : false,
"broadcastDisputes" : true,
"notifySuspect" : false,
"gossipInterval" : "PT0.25S",
"gossipFanout" : 2,
"probeInterval" : "PT1S",
"probeTimeout" : "PT2S",
"suspectProbes" : 3,
"failureTimeout" : "PT10S",
"syncInterval" : "PT10S"
}
},
"threads" : {
"cpuThreadCount" : 2,
"ioThreadCount" : 2
},
"data" : {
"directories" : [ "/usr/local/zeebe/data" ],
"logSegmentSize" : "512MB",
"snapshotPeriod" : "PT15M",
"logIndexDensity" : 100,
"logSegmentSizeInBytes" : 536870912,
"atomixStorageLevel" : "DISK"
},
"exporters" : {
"elasticsearch" : {
"jarPath" : null,
"className" : "io.zeebe.exporter.ElasticsearchExporter",
"args" : {
"url" : "http://elasticsearch-master:9200"
},
"external" : false
}
},
"gateway" : {
"network" : {
"host" : "0.0.0.0",
"port" : 26500,
"minKeepAliveInterval" : "PT30S"
},
"cluster" : {
"contactPoint" : "0.0.0.0:26502",
"requestTimeout" : "PT15S",
"clusterName" : "zeebe-cluster",
"memberId" : "gateway",
"host" : "0.0.0.0",
"port" : 26502,
"membership" : {
"broadcastUpdates" : false,
"broadcastDisputes" : true,
"notifySuspect" : false,
"gossipInterval" : "PT0.25S",
"gossipFanout" : 2,
"probeInterval" : "PT1S",
"probeTimeout" : "PT2S",
"suspectProbes" : 3,
"failureTimeout" : "PT10S",
"syncInterval" : "PT10S"
}
},
"threads" : {
"managementThreads" : 1
},
"monitoring" : {
"enabled" : false,
"host" : "0.0.0.0",
"port" : 9600
},
"security" : {
"enabled" : false,
"certificateChainPath" : null,
"privateKeyPath" : null
},
"longPolling" : {
"enabled" : true
},
"initialized" : true,
"enable" : false
},
"backpressure" : {
"enabled" : true,
"algorithm" : "VEGAS",
"aimd" : {
"requestTimeout" : "PT1S",
"initialLimit" : 100,
"minLimit" : 1,
"maxLimit" : 1000,
"backoffRatio" : 0.9
},
"fixedLimit" : {
"limit" : 20
},
"vegas" : {
"alpha" : 3,
"beta" : 6,
"initialLimit" : 20
},
"gradient" : {
"minLimit" : 10,
"initialLimit" : 20,
"rttTolerance" : 2.0
},
"gradient2" : {
"minLimit" : 10,
"initialLimit" : 20,
"rttTolerance" : 2.0,
"longWindow" : 600
}
},
"stepTimeout" : "PT5M",
"executionMetricsExporterEnabled" : false
}
2020-10-22 06:25:21.106 [] [main] INFO io.zeebe.broker.system - Bootstrap Broker-2 [1/10]: actor scheduler
2020-10-22 06:25:21.118 [] [main] INFO io.zeebe.broker.system - Bootstrap Broker-2 [2/10]: membership and replication protocol
2020-10-22 06:25:27.800 [] [main] INFO io.zeebe.broker.system - Bootstrap Broker-2 [3/10]: command api transport
2020-10-22 06:25:28.122 [] [http-nio-0.0.0.0-9600-exec-1] INFO org.apache.catalina.core.ContainerBase.[Tomcat].[localhost].[/] - Initializing Spring DispatcherServlet 'dispatcherServlet'
2020-10-22 06:25:28.123 [] [http-nio-0.0.0.0-9600-exec-1] INFO org.springframework.web.servlet.DispatcherServlet - Initializing Servlet 'dispatcherServlet'
2020-10-22 06:25:28.195 [] [http-nio-0.0.0.0-9600-exec-1] INFO org.springframework.web.servlet.DispatcherServlet - Completed initialization in 72 ms
2020-10-22 06:25:28.523 [] [main] INFO io.zeebe.broker.system - Bootstrap Broker-2 [4/10]: command api handler
2020-10-22 06:25:29.107 [] [main] INFO io.zeebe.broker.system - Bootstrap Broker-2 [5/10]: subscription api
2020-10-22 06:25:29.138 [] [main] INFO io.zeebe.broker.system - Bootstrap Broker-2 [6/10]: cluster services
2020-10-22 06:25:31.141 [] [main] INFO io.zeebe.broker.system - Bootstrap Broker-2 [7/10]: topology manager
2020-10-22 06:25:31.146 [] [main] INFO io.zeebe.broker.system - Bootstrap Broker-2 [8/10]: monitoring services
2020-10-22 06:25:31.155 [] [main] INFO io.zeebe.broker.system - Bootstrap Broker-2 [9/10]: leader management request handler
2020-10-22 06:25:31.160 [] [main] INFO io.zeebe.broker.system - Bootstrap Broker-2 [10/10]: zeebe partitions
2020-10-22 06:25:31.203 [] [main] INFO io.zeebe.broker.system - Bootstrap Broker-2 partitions [1/3]: partition 3
2020-10-22 06:25:31.703 [] [main] INFO io.zeebe.broker.system - Bootstrap Broker-2 partitions [2/3]: partition 2
2020-10-22 06:25:31.726 [] [main] INFO io.zeebe.broker.system - Bootstrap Broker-2 partitions [3/3]: partition 1
2020-10-22 06:25:31.797 [] [main] INFO io.zeebe.broker.system - Bootstrap Broker-2 partitions succeeded. Started 3 steps in 594 ms.
2020-10-22 06:25:31.797 [] [main] INFO io.zeebe.broker.system - Bootstrap Broker-2 succeeded. Started 10 steps in 10692 ms.
@BigeYoung thanks for that. The describe command shows that the readiness probe is failing:
Readiness: http-get http://:9600/ready delay=0s timeout=1s period=10s #success=1 #failure=3
and I think that this is due to a lack of memory:
Last State: Terminated
Reason: Error
Exit Code: 137
https://sysdig.com/blog/troubleshoot-kubernetes-oom/
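The reasoning behind reading exit code 137 as a memory problem can be spelled out. By the usual shell convention, an exit status above 128 means the process was terminated by a signal whose number is the status minus 128, so 137 corresponds to signal 9, SIGKILL. The sketch below decodes such codes; the function name is illustrative and not part of any Kubernetes tooling. SIGKILL is what the kernel OOM killer sends, but other senders exist, so 137 on its own is a hint rather than proof of memory exhaustion.

```python
import signal


def describe_exit_code(exit_code: int) -> str:
    """Explain a container exit code using the 128 + signal-number convention."""
    if exit_code > 128:
        signum = exit_code - 128
        try:
            name = signal.Signals(signum).name
        except ValueError:
            # The signal number is not defined on this platform.
            name = f"signal {signum}"
        return f"terminated by {name}"
    return f"exited on its own with status {exit_code}"


print(describe_exit_code(137))
```

On Linux this reports SIGKILL for 137. To confirm an actual out-of-memory kill, the node's kernel log and the container's reported reason would still need to be checked.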
Can you try with this?
helm install test-core zeebe/zeebe-full --values https://raw.githubusercontent.com/zeebe-io/zeebe-helm-profiles/master/zeebe-core-team.yaml
Thank you again for your patience. I have followed your request, but the situation has not changed.
➜ ~ kubectl describe pod test-core-zeebe-1
Name: test-core-zeebe-1
Namespace: default
Priority: 0
Node: server4/192.168.137.124
Start Time: Thu, 22 Oct 2020 17:00:56 +0800
Labels: app.kubernetes.io/component=broker
app.kubernetes.io/instance=test-core
app.kubernetes.io/managed-by=Helm
app.kubernetes.io/name=zeebe
controller-revision-hash=test-core-zeebe-75d69d8d5
statefulset.kubernetes.io/pod-name=test-core-zeebe-1
Annotations: <none>
Status: Running
IP: 10.244.1.156
IPs:
IP: 10.244.1.156
Controlled By: StatefulSet/test-core-zeebe
Containers:
zeebe:
Container ID: docker://bd2262c3ebba9d0e3dcee3e70360c7c2be39ae4b002be7f1f0264d3025754822
Image: camunda/zeebe:0.24.2
Image ID: docker-pullable://camunda/zeebe@sha256:795ace31c498ad4bc37b7b0fab612307c34852f4187766e3f777a509821c9fb3
Ports: 9600/TCP, 26501/TCP, 26502/TCP
Host Ports: 0/TCP, 0/TCP, 0/TCP
State: Waiting
Reason: CrashLoopBackOff
Last State: Terminated
Reason: Error
Exit Code: 137
Started: Thu, 22 Oct 2020 17:04:53 +0800
Finished: Thu, 22 Oct 2020 17:05:02 +0800
Ready: False
Restart Count: 4
Limits:
cpu: 1
memory: 4Gi
Requests:
cpu: 500m
memory: 2Gi
Readiness: http-get http://:9600/ready delay=0s timeout=1s period=10s #success=1 #failure=3
Environment:
ZEEBE_BROKER_CLUSTER_CLUSTERNAME: test-core-zeebe
ZEEBE_LOG_LEVEL:
ZEEBE_BROKER_CLUSTER_PARTITIONSCOUNT: 3
ZEEBE_BROKER_CLUSTER_CLUSTERSIZE: 3
ZEEBE_BROKER_CLUSTER_REPLICATIONFACTOR: 3
ZEEBE_BROKER_THREADS_CPUTHREADCOUNT: 2
ZEEBE_BROKER_THREADS_IOTHREADCOUNT: 2
ZEEBE_BROKER_GATEWAY_ENABLE: false
ZEEBE_BROKER_EXPORTERS_ELASTICSEARCH_CLASSNAME: io.zeebe.exporter.ElasticsearchExporter
ZEEBE_BROKER_EXPORTERS_ELASTICSEARCH_ARGS_URL: http://elasticsearch-master:9200
ZEEBE_BROKER_NETWORK_COMMANDAPI_PORT: 26501
ZEEBE_BROKER_NETWORK_INTERNALAPI_PORT: 26502
ZEEBE_BROKER_NETWORK_MONITORINGAPI_PORT: 9600
K8S_POD_NAME: test-core-zeebe-1 (v1:metadata.name)
JAVA_TOOL_OPTIONS: -XX:MaxRAMPercentage=25.0 -XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=/usr/local/zeebe/data -XX:ErrorFile=/usr/local/zeebe/data/zeebe_error%p.log -XX:+ExitOnOutOfMemoryError
Mounts:
/exporters from exporters (rw)
/usr/local/bin/startup.sh from config (rw,path="startup.sh")
/usr/local/zeebe/config/application.yaml from config (rw,path="application.yaml")
/usr/local/zeebe/data from data (rw)
/var/run/secrets/kubernetes.io/serviceaccount from default-token-mt74h (ro)
Conditions:
Type Status
Initialized True
Ready False
ContainersReady False
PodScheduled True
Volumes:
data:
Type: PersistentVolumeClaim (a reference to a PersistentVolumeClaim in the same namespace)
ClaimName: data-test-core-zeebe-1
ReadOnly: false
config:
Type: ConfigMap (a volume populated by a ConfigMap)
Name: test-core-zeebe
Optional: false
exporters:
Type: EmptyDir (a temporary directory that shares a pod's lifetime)
Medium:
SizeLimit: <unset>
default-token-mt74h:
Type: Secret (a volume populated by a Secret)
SecretName: default-token-mt74h
Optional: false
QoS Class: Burstable
Node-Selectors: <none>
Tolerations: node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Warning FailedScheduling 5m16s (x2 over 5m17s) default-scheduler 0/4 nodes are available: 4 pod has unbound immediate PersistentVolumeClaims.
Normal Scheduled 5m14s default-scheduler Successfully assigned default/test-core-zeebe-1 to server4
Normal Pulled 2m45s (x4 over 5m12s) kubelet Container image "camunda/zeebe:0.24.2" already present on machine
Normal Created 2m45s (x4 over 5m12s) kubelet Created container zeebe
Normal Started 2m45s (x4 over 5m11s) kubelet Started container zeebe
Warning Unhealthy 2m40s (x5 over 5m10s) kubelet Readiness probe failed: Get "http://10.244.1.156:9600/ready": dial tcp 10.244.1.156:9600: connect: connection refused
Warning Unhealthy 2m29s (x2 over 3m29s) kubelet Readiness probe failed: Get "http://10.244.1.156:9600/ready": context deadline exceeded (Client.Timeout exceeded while awaiting headers)
Warning Unhealthy 2m20s (x3 over 4m49s) kubelet Readiness probe failed: HTTP probe failed with statuscode: 503
Warning BackOff 1s (x13 over 4m7s) kubelet Back-off restarting failed container
➜ ~ kubectl logs test-core-zeebe-1
++ hostname -f
+ export ZEEBE_BROKER_NETWORK_ADVERTISEDHOST=test-core-zeebe-1.test-core-zeebe.default.svc.cluster.local
+ ZEEBE_BROKER_NETWORK_ADVERTISEDHOST=test-core-zeebe-1.test-core-zeebe.default.svc.cluster.local
+ export ZEEBE_BROKER_CLUSTER_NODEID=1
+ ZEEBE_BROKER_CLUSTER_NODEID=1
+ export ZEEBE_BROKER_CLUSTER_CLUSTERSIZE=3
+ ZEEBE_BROKER_CLUSTER_CLUSTERSIZE=3
+ contactPointPrefix=test-core-zeebe
+ contactPoints=
+ [[ -z '' ]]
+ (( i=0 ))
+ (( i<3 ))
++ hostname -d
+ contactPoints=,test-core-zeebe-0.test-core-zeebe.default.svc.cluster.local:26502
+ (( i++ ))
+ (( i<3 ))
++ hostname -d
+ contactPoints=,test-core-zeebe-0.test-core-zeebe.default.svc.cluster.local:26502,test-core-zeebe-1.test-core-zeebe.default.svc.cluster.local:26502
+ (( i++ ))
+ (( i<3 ))
++ hostname -d
+ contactPoints=,test-core-zeebe-0.test-core-zeebe.default.svc.cluster.local:26502,test-core-zeebe-1.test-core-zeebe.default.svc.cluster.local:26502,test-core-zeebe-2.test-core-zeebe.default.svc.cluster.local:26502
+ (( i++ ))
+ (( i<3 ))
+ export ZEEBE_BROKER_CLUSTER_INITIALCONTACTPOINTS=,test-core-zeebe-0.test-core-zeebe.default.svc.cluster.local:26502,test-core-zeebe-1.test-core-zeebe.default.svc.cluster.local:26502,test-core-zeebe-2.test-core-zeebe.default.svc.cluster.local:26502
+ ZEEBE_BROKER_CLUSTER_INITIALCONTACTPOINTS=,test-core-zeebe-0.test-core-zeebe.default.svc.cluster.local:26502,test-core-zeebe-1.test-core-zeebe.default.svc.cluster.local:26502,test-core-zeebe-2.test-core-zeebe.default.svc.cluster.local:26502
++ ls -A /exporters/
+ '[' '' ']'
+ echo 'No exporters available.'
+ exec /usr/local/zeebe/bin/broker
No exporters available.
Picked up JAVA_TOOL_OPTIONS: -XX:MaxRAMPercentage=25.0 -XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=/usr/local/zeebe/data -XX:ErrorFile=/usr/local/zeebe/data/zeebe_error%p.log -XX:+ExitOnOutOfMemoryError
2020-10-22 09:06:27,029 main WARN Error while converting string [] to type [class org.apache.logging.log4j.Level]. Using default value [null]. java.lang.IllegalArgumentException: Unknown level constant [].
at org.apache.logging.log4j.Level.valueOf(Level.java:320)
at org.apache.logging.log4j.core.config.plugins.convert.TypeConverters$LevelConverter.convert(TypeConverters.java:288)
at org.apache.logging.log4j.core.config.plugins.convert.TypeConverters$LevelConverter.convert(TypeConverters.java:284)
at org.apache.logging.log4j.core.config.plugins.convert.TypeConverters.convert(TypeConverters.java:419)
at org.apache.logging.log4j.core.config.plugins.visitors.AbstractPluginVisitor.convert(AbstractPluginVisitor.java:149)
at org.apache.logging.log4j.core.config.plugins.visitors.PluginAttributeVisitor.visit(PluginAttributeVisitor.java:45)
at org.apache.logging.log4j.core.config.plugins.util.PluginBuilder.generateParameters(PluginBuilder.java:258)
at org.apache.logging.log4j.core.config.plugins.util.PluginBuilder.build(PluginBuilder.java:135)
at org.apache.logging.log4j.core.config.AbstractConfiguration.createPluginObject(AbstractConfiguration.java:1002)
at org.apache.logging.log4j.core.config.AbstractConfiguration.createConfiguration(AbstractConfiguration.java:942)
at org.apache.logging.log4j.core.config.AbstractConfiguration.createConfiguration(AbstractConfiguration.java:934)
at org.apache.logging.log4j.core.config.AbstractConfiguration.doConfigure(AbstractConfiguration.java:552)
at org.apache.logging.log4j.core.config.AbstractConfiguration.initialize(AbstractConfiguration.java:241)
at org.apache.logging.log4j.core.config.AbstractConfiguration.start(AbstractConfiguration.java:288)
at org.apache.logging.log4j.core.LoggerContext.setConfiguration(LoggerContext.java:618)
at org.apache.logging.log4j.core.LoggerContext.reconfigure(LoggerContext.java:691)
at org.apache.logging.log4j.core.LoggerContext.reconfigure(LoggerContext.java:708)
at org.apache.logging.log4j.core.LoggerContext.start(LoggerContext.java:263)
at org.apache.logging.log4j.core.impl.Log4jContextFactory.getContext(Log4jContextFactory.java:153)
at org.apache.logging.log4j.core.impl.Log4jContextFactory.getContext(Log4jContextFactory.java:45)
at org.apache.logging.log4j.LogManager.getContext(LogManager.java:194)
at org.apache.commons.logging.LogAdapter$Log4jLog.<clinit>(LogAdapter.java:155)
at org.apache.commons.logging.LogAdapter$Log4jAdapter.createLog(LogAdapter.java:122)
at org.apache.commons.logging.LogAdapter.createLog(LogAdapter.java:89)
at org.apache.commons.logging.LogFactoryService.getInstance(LogFactoryService.java:46)
at org.apache.commons.logging.LogFactoryService.getInstance(LogFactoryService.java:41)
at org.apache.commons.logging.LogFactory.getLog(LogFactory.java:655)
at org.springframework.boot.SpringApplication.<clinit>(SpringApplication.java:196)
at io.zeebe.broker.StandaloneBroker.main(StandaloneBroker.java:52)
2020-10-22 09:06:28,716 main WARN Error while converting string [] to type [class org.apache.logging.log4j.Level]. Using default value [null]. java.lang.IllegalArgumentException: Unknown level constant [].
at org.apache.logging.log4j.Level.valueOf(Level.java:320)
at org.apache.logging.log4j.core.config.plugins.convert.TypeConverters$LevelConverter.convert(TypeConverters.java:288)
at org.apache.logging.log4j.core.config.plugins.convert.TypeConverters$LevelConverter.convert(TypeConverters.java:284)
at org.apache.logging.log4j.core.config.plugins.convert.TypeConverters.convert(TypeConverters.java:419)
at org.apache.logging.log4j.core.config.plugins.visitors.AbstractPluginVisitor.convert(AbstractPluginVisitor.java:149)
at org.apache.logging.log4j.core.config.plugins.visitors.PluginAttributeVisitor.visit(PluginAttributeVisitor.java:45)
at org.apache.logging.log4j.core.config.plugins.util.PluginBuilder.generateParameters(PluginBuilder.java:258)
at org.apache.logging.log4j.core.config.plugins.util.PluginBuilder.build(PluginBuilder.java:135)
at org.apache.logging.log4j.core.config.AbstractConfiguration.createPluginObject(AbstractConfiguration.java:1002)
at org.apache.logging.log4j.core.config.AbstractConfiguration.createConfiguration(AbstractConfiguration.java:942)
at org.apache.logging.log4j.core.config.AbstractConfiguration.createConfiguration(AbstractConfiguration.java:934)
at org.apache.logging.log4j.core.config.AbstractConfiguration.doConfigure(AbstractConfiguration.java:552)
at org.apache.logging.log4j.core.config.AbstractConfiguration.initialize(AbstractConfiguration.java:241)
at org.apache.logging.log4j.core.config.AbstractConfiguration.start(AbstractConfiguration.java:288)
at org.apache.logging.log4j.core.LoggerContext.setConfiguration(LoggerContext.java:618)
at org.apache.logging.log4j.core.LoggerContext.reconfigure(LoggerContext.java:691)
at org.apache.logging.log4j.core.LoggerContext.reconfigure(LoggerContext.java:708)
at org.springframework.boot.logging.log4j2.Log4J2LoggingSystem.reinitialize(Log4J2LoggingSystem.java:204)
at org.springframework.boot.logging.AbstractLoggingSystem.initializeWithConventions(AbstractLoggingSystem.java:73)
at org.springframework.boot.logging.AbstractLoggingSystem.initialize(AbstractLoggingSystem.java:60)
at org.springframework.boot.logging.log4j2.Log4J2LoggingSystem.initialize(Log4J2LoggingSystem.java:160)
at org.springframework.boot.context.logging.LoggingApplicationListener.initializeSystem(LoggingApplicationListener.java:306)
at org.springframework.boot.context.logging.LoggingApplicationListener.initialize(LoggingApplicationListener.java:281)
at org.springframework.boot.context.logging.LoggingApplicationListener.onApplicationEnvironmentPreparedEvent(LoggingApplicationListener.java:239)
at org.springframework.boot.context.logging.LoggingApplicationListener.onApplicationEvent(LoggingApplicationListener.java:216)
at org.springframework.context.event.SimpleApplicationEventMulticaster.doInvokeListener(SimpleApplicationEventMulticaster.java:172)
at org.springframework.context.event.SimpleApplicationEventMulticaster.invokeListener(SimpleApplicationEventMulticaster.java:165)
at org.springframework.context.event.SimpleApplicationEventMulticaster.multicastEvent(SimpleApplicationEventMulticaster.java:139)
at org.springframework.context.event.SimpleApplicationEventMulticaster.multicastEvent(SimpleApplicationEventMulticaster.java:127)
at org.springframework.boot.context.event.EventPublishingRunListener.environmentPrepared(EventPublishingRunListener.java:80)
at org.springframework.boot.SpringApplicationRunListeners.environmentPrepared(SpringApplicationRunListeners.java:53)
at org.springframework.boot.SpringApplication.prepareEnvironment(SpringApplication.java:345)
at org.springframework.boot.SpringApplication.run(SpringApplication.java:308)
at org.springframework.boot.SpringApplication.run(SpringApplication.java:1237)
at org.springframework.boot.SpringApplication.run(SpringApplication.java:1226)
at io.zeebe.broker.StandaloneBroker.main(StandaloneBroker.java:52)
______ ______ ______ ____ ______ ____ _____ ____ _ __ ______ _____
|___ / | ____| | ____| | _ \ | ____| | _ \ | __ \ / __ \ | |/ / | ____| | __ \
/ / | |__ | |__ | |_) | | |__ | |_) | | |__) | | | | | | ' / | |__ | |__) |
/ / | __| | __| | _ < | __| | _ < | _ / | | | | | < | __| | _ /
/ /__ | |____ | |____ | |_) | | |____ | |_) | | | \ \ | |__| | | . \ | |____ | | \ \
/_____| |______| |______| |____/ |______| |____/ |_| \_\ \____/ |_|\_\ |______| |_| \_\
2020-10-22 09:06:28.996 [] [main] INFO io.zeebe.broker.StandaloneBroker - Starting StandaloneBroker v0.24.2 on test-core-zeebe-1 with PID 6 (/usr/local/zeebe/lib/zeebe-distribution-0.24.2.jar started by root in /usr/local/zeebe)
2020-10-22 09:06:29.008 [] [main] INFO io.zeebe.broker.StandaloneBroker - No active profile set, falling back to default profiles: default
2020-10-22 09:06:33.700 [] [main] INFO org.springframework.boot.web.embedded.tomcat.TomcatWebServer - Tomcat initialized with port(s): 9600 (http)
2020-10-22 09:06:33.733 [] [main] INFO org.apache.coyote.http11.Http11NioProtocol - Initializing ProtocolHandler ["http-nio-0.0.0.0-9600"]
2020-10-22 09:06:33.735 [] [main] INFO org.apache.catalina.core.StandardService - Starting service [Tomcat]
2020-10-22 09:06:33.737 [] [main] INFO org.apache.catalina.core.StandardEngine - Starting Servlet engine: [Apache Tomcat/9.0.36]
2020-10-22 09:06:34.100 [] [main] INFO org.apache.catalina.core.ContainerBase.[Tomcat].[localhost].[/] - Initializing Spring embedded WebApplicationContext
2020-10-22 09:06:34.100 [] [main] INFO org.springframework.boot.web.servlet.context.ServletWebServerApplicationContext - Root WebApplicationContext: initialization completed in 4903 ms
2020-10-22 09:06:35.526 [] [main] INFO org.springframework.scheduling.concurrent.ThreadPoolTaskExecutor - Initializing ExecutorService 'applicationTaskExecutor'
2020-10-22 09:06:36.531 [] [main] INFO org.springframework.boot.actuate.endpoint.web.EndpointLinksResolver - Exposing 2 endpoint(s) beneath base path '/actuator'
2020-10-22 09:06:36.623 [] [main] INFO org.apache.coyote.http11.Http11NioProtocol - Starting ProtocolHandler ["http-nio-0.0.0.0-9600"]
2020-10-22 09:06:36.802 [] [main] INFO org.springframework.boot.web.embedded.tomcat.TomcatWebServer - Tomcat started on port(s): 9600 (http) with context path ''
2020-10-22 09:06:36.831 [] [main] INFO io.zeebe.broker.StandaloneBroker - Started StandaloneBroker in 9.131 seconds (JVM running for 12.893)
2020-10-22 09:06:37.014 [] [main] INFO io.zeebe.broker.system - Version: 0.24.2
2020-10-22 09:06:37.211 [] [main] INFO io.zeebe.broker.system - Starting broker 1 with configuration {
"network" : {
"host" : "0.0.0.0",
"portOffset" : 0,
"maxMessageSize" : "4MB",
"advertisedHost" : "test-core-zeebe-1.test-core-zeebe.default.svc.cluster.local",
"commandApi" : {
"host" : "0.0.0.0",
"port" : 26501,
"advertisedHost" : "test-core-zeebe-1.test-core-zeebe.default.svc.cluster.local",
"advertisedPort" : 26501,
"advertisedAddress" : "test-core-zeebe-1.test-core-zeebe.default.svc.cluster.local:26501",
"address" : "0.0.0.0:26501"
},
"internalApi" : {
"host" : "0.0.0.0",
"port" : 26502,
"advertisedHost" : "test-core-zeebe-1.test-core-zeebe.default.svc.cluster.local",
"advertisedPort" : 26502,
"advertisedAddress" : "test-core-zeebe-1.test-core-zeebe.default.svc.cluster.local:26502",
"address" : "0.0.0.0:26502"
},
"monitoringApi" : {
"host" : "0.0.0.0",
"port" : 9600,
"advertisedHost" : "test-core-zeebe-1.test-core-zeebe.default.svc.cluster.local",
"advertisedPort" : 9600,
"advertisedAddress" : "test-core-zeebe-1.test-core-zeebe.default.svc.cluster.local:9600",
"address" : "0.0.0.0:9600"
},
"maxMessageSizeInBytes" : 4194304
},
"cluster" : {
"initialContactPoints" : [ "test-core-zeebe-0.test-core-zeebe.default.svc.cluster.local:26502", "test-core-zeebe-1.test-core-zeebe.default.svc.cluster.local:26502", "test-core-zeebe-2.test-core-zeebe.default.svc.cluster.local:26502" ],
"partitionIds" : [ 1, 2, 3 ],
"nodeId" : 1,
"partitionsCount" : 3,
"replicationFactor" : 3,
"clusterSize" : 3,
"clusterName" : "test-core-zeebe",
"membership" : {
"broadcastUpdates" : false,
"broadcastDisputes" : true,
"notifySuspect" : false,
"gossipInterval" : "PT0.25S",
"gossipFanout" : 2,
"probeInterval" : "PT1S",
"probeTimeout" : "PT2S",
"suspectProbes" : 3,
"failureTimeout" : "PT10S",
"syncInterval" : "PT10S"
}
},
"threads" : {
"cpuThreadCount" : 2,
"ioThreadCount" : 2
},
"data" : {
"directories" : [ "/usr/local/zeebe/data" ],
"logSegmentSize" : "512MB",
"snapshotPeriod" : "PT15M",
"logIndexDensity" : 100,
"logSegmentSizeInBytes" : 536870912,
"atomixStorageLevel" : "DISK"
},
"exporters" : {
"elasticsearch" : {
"jarPath" : null,
"className" : "io.zeebe.exporter.ElasticsearchExporter",
"args" : {
"url" : "http://elasticsearch-master:9200"
},
"external" : false
}
},
"gateway" : {
"network" : {
"host" : "0.0.0.0",
"port" : 26500,
"minKeepAliveInterval" : "PT30S"
},
"cluster" : {
"contactPoint" : "0.0.0.0:26502",
"requestTimeout" : "PT15S",
"clusterName" : "zeebe-cluster",
"memberId" : "gateway",
"host" : "0.0.0.0",
"port" : 26502,
"membership" : {
"broadcastUpdates" : false,
"broadcastDisputes" : true,
"notifySuspect" : false,
"gossipInterval" : "PT0.25S",
"gossipFanout" : 2,
"probeInterval" : "PT1S",
"probeTimeout" : "PT2S",
"suspectProbes" : 3,
"failureTimeout" : "PT10S",
"syncInterval" : "PT10S"
}
},
"threads" : {
"managementThreads" : 1
},
"monitoring" : {
"enabled" : false,
"host" : "0.0.0.0",
"port" : 9600
},
"security" : {
"enabled" : false,
"certificateChainPath" : null,
"privateKeyPath" : null
},
"longPolling" : {
"enabled" : true
},
"initialized" : true,
"enable" : false
},
"backpressure" : {
"enabled" : true,
"algorithm" : "VEGAS",
"aimd" : {
"requestTimeout" : "PT1S",
"initialLimit" : 100,
"minLimit" : 1,
"maxLimit" : 1000,
"backoffRatio" : 0.9
},
"fixedLimit" : {
"limit" : 20
},
"vegas" : {
"alpha" : 3,
"beta" : 6,
"initialLimit" : 20
},
"gradient" : {
"minLimit" : 10,
"initialLimit" : 20,
"rttTolerance" : 2.0
},
"gradient2" : {
"minLimit" : 10,
"initialLimit" : 20,
"rttTolerance" : 2.0,
"longWindow" : 600
}
},
"stepTimeout" : "PT5M",
"executionMetricsExporterEnabled" : false
}
2020-10-22 09:06:37.297 [] [main] INFO io.zeebe.broker.system - Bootstrap Broker-1 [1/10]: actor scheduler
2020-10-22 09:06:37.303 [] [main] INFO io.zeebe.broker.system - Bootstrap Broker-1 [2/10]: membership and replication protocol
2020-10-22 09:06:41.308 [] [http-nio-0.0.0.0-9600-exec-1] INFO org.apache.catalina.core.ContainerBase.[Tomcat].[localhost].[/] - Initializing Spring DispatcherServlet 'dispatcherServlet'
2020-10-22 09:06:41.309 [] [http-nio-0.0.0.0-9600-exec-1] INFO org.springframework.web.servlet.DispatcherServlet - Initializing Servlet 'dispatcherServlet'
2020-10-22 09:06:41.319 [] [http-nio-0.0.0.0-9600-exec-1] INFO org.springframework.web.servlet.DispatcherServlet - Completed initialization in 10 ms
2020-10-22 09:06:44.826 [] [main] INFO io.zeebe.broker.system - Bootstrap Broker-1 [3/10]: command api transport
2020-10-22 09:06:45.315 [] [main] INFO io.zeebe.broker.system - Bootstrap Broker-1 [4/10]: command api handler
2020-10-22 09:06:45.727 [] [main] INFO io.zeebe.broker.system - Bootstrap Broker-1 [5/10]: subscription api
2020-10-22 09:06:45.806 [] [main] INFO io.zeebe.broker.system - Bootstrap Broker-1 [6/10]: cluster services
2020-10-22 09:06:47.977 [] [main] INFO io.zeebe.broker.system - Bootstrap Broker-1 [7/10]: topology manager
2020-10-22 09:06:47.979 [] [main] INFO io.zeebe.broker.system - Bootstrap Broker-1 [8/10]: monitoring services
2020-10-22 09:06:47.984 [] [main] INFO io.zeebe.broker.system - Bootstrap Broker-1 [9/10]: leader management request handler
2020-10-22 09:06:47.990 [] [main] INFO io.zeebe.broker.system - Bootstrap Broker-1 [10/10]: zeebe partitions
2020-10-22 09:06:47.993 [] [main] INFO io.zeebe.broker.system - Bootstrap Broker-1 partitions [1/3]: partition 3
2020-10-22 09:06:48.402 [] [main] INFO io.zeebe.broker.system - Bootstrap Broker-1 partitions [2/3]: partition 2
2020-10-22 09:06:48.427 [] [main] INFO io.zeebe.broker.system - Bootstrap Broker-1 partitions [3/3]: partition 1
2020-10-22 09:06:48.444 [] [main] INFO io.zeebe.broker.system - Bootstrap Broker-1 partitions succeeded. Started 3 steps in 451 ms.
2020-10-22 09:06:48.447 [] [main] INFO io.zeebe.broker.system - Bootstrap Broker-1 succeeded. Started 10 steps in 11151 ms.
Also, since you mentioned the memory problem, I logged in to server4 and ran top.
@BigeYoung note that the node may have memory available, but if the container's requests and limits are lower than what the application inside the container needs, Kubernetes will kill it anyway:
Limits:
cpu: 1
memory: 4Gi
Requests:
cpu: 500m
memory: 2Gi
These need to be different if you run with:
helm install test-core zeebe/zeebe-full --values https://raw.githubusercontent.com/zeebe-io/zeebe-helm-profiles/master/zeebe-core-team.yaml
Note that you can tweak the values if you download that file from the GitHub repo:
https://raw.githubusercontent.com/zeebe-io/zeebe-helm-profiles/master/zeebe-core-team.yaml
Notice again the requests and limits sections.
Notice that the file I pointed to sets a lot of memory and CPU:
limits:
cpu: 5
memory: 12Gi
requests:
cpu: 5
memory: 12Gi
You might need to tweak that to fit your cluster's resources.
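As a rough sizing check, the profile's per-broker requests can be multiplied by the number of brokers. This is only a sketch: it assumes the three brokers seen in this thread (test-core-zeebe-0 to test-core-zeebe-2) each get the 5 CPU / 12Gi requests quoted above, and it ignores everything else the chart deploys.

```shell
# Per-broker requests quoted from the zeebe-core-team profile above.
brokers=3              # assumption: cluster size 3, as in the broker logs in this thread
cpu_request=5          # CPUs requested per broker
memory_request_gi=12   # Gi of memory requested per broker

# The scheduler must find room for the sum of all requests.
total_cpu=$(( brokers * cpu_request ))
total_memory_gi=$(( brokers * memory_request_gi ))

echo "total requests: ${total_cpu} CPUs, ${total_memory_gi}Gi memory"
# prints: total requests: 15 CPUs, 36Gi memory
```

If the nodes cannot offer that much between them, the requests in the downloaded file have to be lowered.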
Oh, this is very strange. I downloaded this YAML file, but no matter how I modify the value of resources, the pod limits do not change after helm install.
I guess the file is not correct then. You can also change those values by using helm --set. Can you try that?
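A minimal sketch of the --set form follows. The key paths are an assumption: they match the top-level resources block used with the zeebe/zeebe-cluster chart further down this thread, and zeebe/zeebe-full may expect different keys. The release name and values are the ones used elsewhere in this thread.

```shell
# Release and chart names as used in this thread.
release=test-core
chart=zeebe/zeebe-cluster

# Assumed key paths: a top-level "resources" block in the chart's values.
set_args="resources.limits.cpu=2,resources.limits.memory=6Gi"
set_args="${set_args},resources.requests.cpu=1,resources.requests.memory=2Gi"

# Only print the command here; run the printed line once it looks right.
helm_cmd="helm install ${release} ${chart} --set ${set_args}"
echo "${helm_cmd}"
```

Afterwards, kubectl describe pod shows whether the Limits and Requests of the broker container actually changed.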
I found that it was because zeebe/zeebe-full did not accept this values file, so I used the zeebe-cluster chart instead. My complete installation command is:
helm install test-core zeebe/zeebe-cluster --values zeebe-core-team.yaml
And I changed the resource limits to:
# RESOURCES
resources:
limits:
cpu: 2
memory: 6Gi
requests:
cpu: 1
memory: 2Gi
I am very happy to see that resource limits have changed, but unfortunately, the crash continues.
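For context on how much of the new limit the JVM heap may use: the pod description lists -XX:MaxRAMPercentage=25.0 in JAVA_TOOL_OPTIONS. The arithmetic below is a sketch that assumes the JVM in the image is container-aware and sizes its maximum heap from the 6Gi container limit rather than from the node's memory.

```shell
# Values from this thread: 6Gi memory limit, -XX:MaxRAMPercentage=25.0.
limit_gib=6
max_ram_percentage=25   # integer form of 25.0, since shell arithmetic is integer-only

limit_mib=$(( limit_gib * 1024 ))
max_heap_mib=$(( limit_mib * max_ram_percentage / 100 ))

echo "container limit: ${limit_mib} MiB, max heap: ${max_heap_mib} MiB"
# prints: container limit: 6144 MiB, max heap: 1536 MiB
```

Under that assumption the heap is capped at roughly a quarter of the limit, and the remainder is left for non-heap memory.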
➜ ~ kubectl describe pod test-core-zeebe-0
Name: test-core-zeebe-0
Namespace: default
Priority: 0
Node: server4/192.168.137.124
Start Time: Fri, 23 Oct 2020 14:25:38 +0800
Labels: app.kubernetes.io/component=broker
app.kubernetes.io/instance=test-core
app.kubernetes.io/managed-by=Helm
app.kubernetes.io/name=zeebe-cluster
controller-revision-hash=test-core-zeebe-87b8975cd
statefulset.kubernetes.io/pod-name=test-core-zeebe-0
Annotations: <none>
Status: Running
IP: 10.244.1.242
IPs:
IP: 10.244.1.242
Controlled By: StatefulSet/test-core-zeebe
Containers:
zeebe-cluster:
Container ID: docker://b75bb59b8c8b94509f954eb729b129a1bee65c458b3b0a77a0187502fc73660a
Image: camunda/zeebe:0.24.2
Image ID: docker-pullable://camunda/zeebe@sha256:795ace31c498ad4bc37b7b0fab612307c34852f4187766e3f777a509821c9fb3
Ports: 9600/TCP, 26501/TCP, 26502/TCP
Host Ports: 0/TCP, 0/TCP, 0/TCP
State: Waiting
Reason: CrashLoopBackOff
Last State: Terminated
Reason: Error
Exit Code: 137
Started: Fri, 23 Oct 2020 14:46:03 +0800
Finished: Fri, 23 Oct 2020 14:47:01 +0800
Ready: False
Restart Count: 8
Limits:
cpu: 2
memory: 6Gi
Requests:
cpu: 1
memory: 2Gi
Readiness: http-get http://:9600/ready delay=0s timeout=1s period=10s #success=1 #failure=3
Environment:
ZEEBE_BROKER_CLUSTER_CLUSTERNAME: test-core-zeebe
ZEEBE_LOG_LEVEL:
ZEEBE_BROKER_CLUSTER_PARTITIONSCOUNT: 3
ZEEBE_BROKER_CLUSTER_CLUSTERSIZE: 3
ZEEBE_BROKER_CLUSTER_REPLICATIONFACTOR: 3
ZEEBE_BROKER_THREADS_CPUTHREADCOUNT: 2
ZEEBE_BROKER_THREADS_IOTHREADCOUNT: 2
ZEEBE_BROKER_GATEWAY_ENABLE: false
ZEEBE_BROKER_EXPORTERS_ELASTICSEARCH_CLASSNAME: io.zeebe.exporter.ElasticsearchExporter
ZEEBE_BROKER_EXPORTERS_ELASTICSEARCH_ARGS_URL: http://elasticsearch-master:9200
ZEEBE_BROKER_NETWORK_COMMANDAPI_PORT: 26501
ZEEBE_BROKER_NETWORK_INTERNALAPI_PORT: 26502
ZEEBE_BROKER_NETWORK_MONITORINGAPI_PORT: 9600
K8S_POD_NAME: test-core-zeebe-0 (v1:metadata.name)
JAVA_TOOL_OPTIONS: -XX:MaxRAMPercentage=25.0 -XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=/usr/local/zeebe/data -XX:ErrorFile=/usr/local/zeebe/data/zeebe_error%p.log -XX:+ExitOnOutOfMemoryError
Mounts:
/exporters from exporters (rw)
/usr/local/bin/startup.sh from config (rw,path="startup.sh")
/usr/local/zeebe/config/application.yaml from config (rw,path="application.yaml")
/usr/local/zeebe/data from data (rw)
/var/run/secrets/kubernetes.io/serviceaccount from default-token-mt74h (ro)
Conditions:
Type Status
Initialized True
Ready False
ContainersReady False
PodScheduled True
Volumes:
data:
Type: PersistentVolumeClaim (a reference to a PersistentVolumeClaim in the same namespace)
ClaimName: data-test-core-zeebe-0
ReadOnly: false
config:
Type: ConfigMap (a volume populated by a ConfigMap)
Name: test-core-zeebe-cluster
Optional: false
exporters:
Type: EmptyDir (a temporary directory that shares a pod's lifetime)
Medium:
SizeLimit: <unset>
default-token-mt74h:
Type: Secret (a volume populated by a Secret)
SecretName: default-token-mt74h
Optional: false
QoS Class: Burstable
Node-Selectors: <none>
Tolerations: node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Scheduled 25m default-scheduler Successfully assigned default/test-core-zeebe-0 to server4
Normal Pulled 22m (x4 over 25m) kubelet Container image "camunda/zeebe:0.24.2" already present on machine
Normal Created 22m (x4 over 25m) kubelet Created container zeebe-cluster
Normal Started 22m (x4 over 25m) kubelet Started container zeebe-cluster
Warning Unhealthy 22m (x4 over 25m) kubelet Readiness probe failed: Get "http://10.244.1.240:9600/ready": dial tcp 10.244.1.240:9600: connect: connection refused
Warning Unhealthy 22m (x4 over 25m) kubelet Readiness probe failed: HTTP probe failed with statuscode: 503
Warning Unhealthy 21m kubelet Readiness probe failed: Get "http://10.244.1.240:9600/ready": read tcp 10.244.1.1:50878->10.244.1.240:9600: read: connection reset by peer
Warning BackOff 8s (x91 over 23m) kubelet Back-off restarting failed container
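The Exit Code: 137 in the description above can be decoded with the usual shell convention that an exit code above 128 means "terminated by signal (code - 128)". The helper name decode_exit_code is made up for this sketch. The output does not show who sent the signal. The Reason is reported as Error rather than OOMKilled, which hints that the container's own memory limit may not be the direct cause, but that is only a hint.

```shell
# Decode an exit code using the convention 128 + N = "killed by signal N".
decode_exit_code() {
  code=$1
  if [ "${code}" -gt 128 ]; then
    sig=$(( code - 128 ))
    # kill -l N prints the name of signal number N without the SIG prefix.
    echo "exit code ${code}: terminated by signal ${sig} (SIG$(kill -l "${sig}"))"
  else
    echo "exit code ${code}: process exited on its own"
  fi
}

decode_exit_code 137
# prints: exit code 137: terminated by signal 9 (SIGKILL)
```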
++ hostname -f
+ export ZEEBE_BROKER_NETWORK_ADVERTISEDHOST=test-core-zeebe-0.test-core-zeebe.default.svc.cluster.local
+ ZEEBE_BROKER_NETWORK_ADVERTISEDHOST=test-core-zeebe-0.test-core-zeebe.default.svc.cluster.local
+ export ZEEBE_BROKER_CLUSTER_NODEID=0
+ ZEEBE_BROKER_CLUSTER_NODEID=0
+ export ZEEBE_BROKER_CLUSTER_CLUSTERSIZE=3
+ ZEEBE_BROKER_CLUSTER_CLUSTERSIZE=3
+ contactPointPrefix=test-core-zeebe
+ contactPoints=
+ [[ -z '' ]]
+ (( i=0 ))
+ (( i<3 ))
++ hostname -d
+ contactPoints=,test-core-zeebe-0.test-core-zeebe.default.svc.cluster.local:26502
+ (( i++ ))
+ (( i<3 ))
++ hostname -d
+ contactPoints=,test-core-zeebe-0.test-core-zeebe.default.svc.cluster.local:26502,test-core-zeebe-1.test-core-zeebe.default.svc.cluster.local:26502
+ (( i++ ))
+ (( i<3 ))
++ hostname -d
+ contactPoints=,test-core-zeebe-0.test-core-zeebe.default.svc.cluster.local:26502,test-core-zeebe-1.test-core-zeebe.default.svc.cluster.local:26502,test-core-zeebe-2.test-core-zeebe.default.svc.cluster.local:26502
+ (( i++ ))
+ (( i<3 ))
+ export ZEEBE_BROKER_CLUSTER_INITIALCONTACTPOINTS=,test-core-zeebe-0.test-core-zeebe.default.svc.cluster.local:26502,test-core-zeebe-1.test-core-zeebe.default.svc.cluster.local:26502,test-core-zeebe-2.test-core-zeebe.default.svc.cluster.local:26502
+ ZEEBE_BROKER_CLUSTER_INITIALCONTACTPOINTS=,test-core-zeebe-0.test-core-zeebe.default.svc.cluster.local:26502,test-core-zeebe-1.test-core-zeebe.default.svc.cluster.local:26502,test-core-zeebe-2.test-core-zeebe.default.svc.cluster.local:26502
++ ls -A /exporters/
+ '[' '' ']'
No exporters available.
+ echo 'No exporters available.'
+ exec /usr/local/zeebe/bin/broker
Picked up JAVA_TOOL_OPTIONS: -XX:MaxRAMPercentage=25.0 -XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=/usr/local/zeebe/data -XX:ErrorFile=/usr/local/zeebe/data/zeebe_error%p.log -XX:+ExitOnOutOfMemoryError
2020-10-23 06:46:05,618 main WARN Error while converting string [] to type [class org.apache.logging.log4j.Level]. Using default value [null]. java.lang.IllegalArgumentException: Unknown level constant [].
at org.apache.logging.log4j.Level.valueOf(Level.java:320)
at org.apache.logging.log4j.core.config.plugins.convert.TypeConverters$LevelConverter.convert(TypeConverters.java:288)
at org.apache.logging.log4j.core.config.plugins.convert.TypeConverters$LevelConverter.convert(TypeConverters.java:284)
at org.apache.logging.log4j.core.config.plugins.convert.TypeConverters.convert(TypeConverters.java:419)
at org.apache.logging.log4j.core.config.plugins.visitors.AbstractPluginVisitor.convert(AbstractPluginVisitor.java:149)
at org.apache.logging.log4j.core.config.plugins.visitors.PluginAttributeVisitor.visit(PluginAttributeVisitor.java:45)
at org.apache.logging.log4j.core.config.plugins.util.PluginBuilder.generateParameters(PluginBuilder.java:258)
at org.apache.logging.log4j.core.config.plugins.util.PluginBuilder.build(PluginBuilder.java:135)
at org.apache.logging.log4j.core.config.AbstractConfiguration.createPluginObject(AbstractConfiguration.java:1002)
at org.apache.logging.log4j.core.config.AbstractConfiguration.createConfiguration(AbstractConfiguration.java:942)
at org.apache.logging.log4j.core.config.AbstractConfiguration.createConfiguration(AbstractConfiguration.java:934)
at org.apache.logging.log4j.core.config.AbstractConfiguration.doConfigure(AbstractConfiguration.java:552)
at org.apache.logging.log4j.core.config.AbstractConfiguration.initialize(AbstractConfiguration.java:241)
at org.apache.logging.log4j.core.config.AbstractConfiguration.start(AbstractConfiguration.java:288)
at org.apache.logging.log4j.core.LoggerContext.setConfiguration(LoggerContext.java:618)
at org.apache.logging.log4j.core.LoggerContext.reconfigure(LoggerContext.java:691)
at org.apache.logging.log4j.core.LoggerContext.reconfigure(LoggerContext.java:708)
at org.apache.logging.log4j.core.LoggerContext.start(LoggerContext.java:263)
at org.apache.logging.log4j.core.impl.Log4jContextFactory.getContext(Log4jContextFactory.java:153)
at org.apache.logging.log4j.core.impl.Log4jContextFactory.getContext(Log4jContextFactory.java:45)
at org.apache.logging.log4j.LogManager.getContext(LogManager.java:194)
at org.apache.commons.logging.LogAdapter$Log4jLog.<clinit>(LogAdapter.java:155)
at org.apache.commons.logging.LogAdapter$Log4jAdapter.createLog(LogAdapter.java:122)
at org.apache.commons.logging.LogAdapter.createLog(LogAdapter.java:89)
at org.apache.commons.logging.LogFactoryService.getInstance(LogFactoryService.java:46)
at org.apache.commons.logging.LogFactoryService.getInstance(LogFactoryService.java:41)
at org.apache.commons.logging.LogFactory.getLog(LogFactory.java:655)
at org.springframework.boot.SpringApplication.<clinit>(SpringApplication.java:196)
at io.zeebe.broker.StandaloneBroker.main(StandaloneBroker.java:52)
2020-10-23 06:46:06,540 main WARN Error while converting string [] to type [class org.apache.logging.log4j.Level]. Using default value [null]. java.lang.IllegalArgumentException: Unknown level constant [].
at org.apache.logging.log4j.Level.valueOf(Level.java:320)
at org.apache.logging.log4j.core.config.plugins.convert.TypeConverters$LevelConverter.convert(TypeConverters.java:288)
at org.apache.logging.log4j.core.config.plugins.convert.TypeConverters$LevelConverter.convert(TypeConverters.java:284)
at org.apache.logging.log4j.core.config.plugins.convert.TypeConverters.convert(TypeConverters.java:419)
at org.apache.logging.log4j.core.config.plugins.visitors.AbstractPluginVisitor.convert(AbstractPluginVisitor.java:149)
at org.apache.logging.log4j.core.config.plugins.visitors.PluginAttributeVisitor.visit(PluginAttributeVisitor.java:45)
at org.apache.logging.log4j.core.config.plugins.util.PluginBuilder.generateParameters(PluginBuilder.java:258)
at org.apache.logging.log4j.core.config.plugins.util.PluginBuilder.build(PluginBuilder.java:135)
at org.apache.logging.log4j.core.config.AbstractConfiguration.createPluginObject(AbstractConfiguration.java:1002)
at org.apache.logging.log4j.core.config.AbstractConfiguration.createConfiguration(AbstractConfiguration.java:942)
at org.apache.logging.log4j.core.config.AbstractConfiguration.createConfiguration(AbstractConfiguration.java:934)
at org.apache.logging.log4j.core.config.AbstractConfiguration.doConfigure(AbstractConfiguration.java:552)
at org.apache.logging.log4j.core.config.AbstractConfiguration.initialize(AbstractConfiguration.java:241)
at org.apache.logging.log4j.core.config.AbstractConfiguration.start(AbstractConfiguration.java:288)
at org.apache.logging.log4j.core.LoggerContext.setConfiguration(LoggerContext.java:618)
at org.apache.logging.log4j.core.LoggerContext.reconfigure(LoggerContext.java:691)
at org.apache.logging.log4j.core.LoggerContext.reconfigure(LoggerContext.java:708)
at org.springframework.boot.logging.log4j2.Log4J2LoggingSystem.reinitialize(Log4J2LoggingSystem.java:204)
at org.springframework.boot.logging.AbstractLoggingSystem.initializeWithConventions(AbstractLoggingSystem.java:73)
at org.springframework.boot.logging.AbstractLoggingSystem.initialize(AbstractLoggingSystem.java:60)
at org.springframework.boot.logging.log4j2.Log4J2LoggingSystem.initialize(Log4J2LoggingSystem.java:160)
at org.springframework.boot.context.logging.LoggingApplicationListener.initializeSystem(LoggingApplicationListener.java:306)
at org.springframework.boot.context.logging.LoggingApplicationListener.initialize(LoggingApplicationListener.java:281)
at org.springframework.boot.context.logging.LoggingApplicationListener.onApplicationEnvironmentPreparedEvent(LoggingApplicationListener.java:239)
at org.springframework.boot.context.logging.LoggingApplicationListener.onApplicationEvent(LoggingApplicationListener.java:216)
at org.springframework.context.event.SimpleApplicationEventMulticaster.doInvokeListener(SimpleApplicationEventMulticaster.java:172)
at org.springframework.context.event.SimpleApplicationEventMulticaster.invokeListener(SimpleApplicationEventMulticaster.java:165)
at org.springframework.context.event.SimpleApplicationEventMulticaster.multicastEvent(SimpleApplicationEventMulticaster.java:139)
at org.springframework.context.event.SimpleApplicationEventMulticaster.multicastEvent(SimpleApplicationEventMulticaster.java:127)
at org.springframework.boot.context.event.EventPublishingRunListener.environmentPrepared(EventPublishingRunListener.java:80)
at org.springframework.boot.SpringApplicationRunListeners.environmentPrepared(SpringApplicationRunListeners.java:53)
at org.springframework.boot.SpringApplication.prepareEnvironment(SpringApplication.java:345)
at org.springframework.boot.SpringApplication.run(SpringApplication.java:308)
at org.springframework.boot.SpringApplication.run(SpringApplication.java:1237)
at org.springframework.boot.SpringApplication.run(SpringApplication.java:1226)
at io.zeebe.broker.StandaloneBroker.main(StandaloneBroker.java:52)
______ ______ ______ ____ ______ ____ _____ ____ _ __ ______ _____
|___ / | ____| | ____| | _ \ | ____| | _ \ | __ \ / __ \ | |/ / | ____| | __ \
/ / | |__ | |__ | |_) | | |__ | |_) | | |__) | | | | | | ' / | |__ | |__) |
/ / | __| | __| | _ < | __| | _ < | _ / | | | | | < | __| | _ /
/ /__ | |____ | |____ | |_) | | |____ | |_) | | | \ \ | |__| | | . \ | |____ | | \ \
/_____| |______| |______| |____/ |______| |____/ |_| \_\ \____/ |_|\_\ |______| |_| \_\
2020-10-23 06:46:06.744 [] [main] INFO io.zeebe.broker.StandaloneBroker - Starting StandaloneBroker v0.24.2 on test-core-zeebe-0 with PID 6 (/usr/local/zeebe/lib/zeebe-distribution-0.24.2.jar started by root in /usr/local/zeebe)
2020-10-23 06:46:06.758 [] [main] INFO io.zeebe.broker.StandaloneBroker - No active profile set, falling back to default profiles: default
2020-10-23 06:46:09.460 [] [main] INFO org.springframework.boot.web.embedded.tomcat.TomcatWebServer - Tomcat initialized with port(s): 9600 (http)
2020-10-23 06:46:09.492 [] [main] INFO org.apache.coyote.http11.Http11NioProtocol - Initializing ProtocolHandler ["http-nio-0.0.0.0-9600"]
2020-10-23 06:46:09.494 [] [main] INFO org.apache.catalina.core.StandardService - Starting service [Tomcat]
2020-10-23 06:46:09.494 [] [main] INFO org.apache.catalina.core.StandardEngine - Starting Servlet engine: [Apache Tomcat/9.0.36]
2020-10-23 06:46:09.652 [] [main] INFO org.apache.catalina.core.ContainerBase.[Tomcat].[localhost].[/] - Initializing Spring embedded WebApplicationContext
2020-10-23 06:46:09.653 [] [main] INFO org.springframework.boot.web.servlet.context.ServletWebServerApplicationContext - Root WebApplicationContext: initialization completed in 2791 ms
2020-10-23 06:46:10.225 [] [main] INFO org.springframework.scheduling.concurrent.ThreadPoolTaskExecutor - Initializing ExecutorService 'applicationTaskExecutor'
2020-10-23 06:46:10.680 [] [main] INFO org.springframework.boot.actuate.endpoint.web.EndpointLinksResolver - Exposing 2 endpoint(s) beneath base path '/actuator'
2020-10-23 06:46:10.728 [] [main] INFO org.apache.coyote.http11.Http11NioProtocol - Starting ProtocolHandler ["http-nio-0.0.0.0-9600"]
2020-10-23 06:46:10.765 [] [main] INFO org.springframework.boot.web.embedded.tomcat.TomcatWebServer - Tomcat started on port(s): 9600 (http) with context path ''
2020-10-23 06:46:10.785 [] [main] INFO io.zeebe.broker.StandaloneBroker - Started StandaloneBroker in 4.854 seconds (JVM running for 6.941)
2020-10-23 06:46:10.912 [] [main] INFO io.zeebe.broker.system - Version: 0.24.2
2020-10-23 06:46:10.997 [] [main] INFO io.zeebe.broker.system - Starting broker 0 with configuration {
"network" : {
"host" : "0.0.0.0",
"portOffset" : 0,
"maxMessageSize" : "4MB",
"advertisedHost" : "test-core-zeebe-0.test-core-zeebe.default.svc.cluster.local",
"commandApi" : {
"host" : "0.0.0.0",
"port" : 26501,
"advertisedHost" : "test-core-zeebe-0.test-core-zeebe.default.svc.cluster.local",
"advertisedPort" : 26501,
"address" : "0.0.0.0:26501",
"advertisedAddress" : "test-core-zeebe-0.test-core-zeebe.default.svc.cluster.local:26501"
},
"internalApi" : {
"host" : "0.0.0.0",
"port" : 26502,
"advertisedHost" : "test-core-zeebe-0.test-core-zeebe.default.svc.cluster.local",
"advertisedPort" : 26502,
"address" : "0.0.0.0:26502",
"advertisedAddress" : "test-core-zeebe-0.test-core-zeebe.default.svc.cluster.local:26502"
},
"monitoringApi" : {
"host" : "0.0.0.0",
"port" : 9600,
"advertisedHost" : "test-core-zeebe-0.test-core-zeebe.default.svc.cluster.local",
"advertisedPort" : 9600,
"address" : "0.0.0.0:9600",
"advertisedAddress" : "test-core-zeebe-0.test-core-zeebe.default.svc.cluster.local:9600"
},
"maxMessageSizeInBytes" : 4194304
},
"cluster" : {
"initialContactPoints" : [ "test-core-zeebe-0.test-core-zeebe.default.svc.cluster.local:26502", "test-core-zeebe-1.test-core-zeebe.default.svc.cluster.local:26502", "test-core-zeebe-2.test-core-zeebe.default.svc.cluster.local:26502" ],
"partitionIds" : [ 1, 2, 3 ],
"nodeId" : 0,
"partitionsCount" : 3,
"replicationFactor" : 3,
"clusterSize" : 3,
"clusterName" : "test-core-zeebe",
"membership" : {
"broadcastUpdates" : false,
"broadcastDisputes" : true,
"notifySuspect" : false,
"gossipInterval" : "PT0.25S",
"gossipFanout" : 2,
"probeInterval" : "PT1S",
"probeTimeout" : "PT2S",
"suspectProbes" : 3,
"failureTimeout" : "PT10S",
"syncInterval" : "PT10S"
}
},
"threads" : {
"cpuThreadCount" : 2,
"ioThreadCount" : 2
},
"data" : {
"directories" : [ "/usr/local/zeebe/data" ],
"logSegmentSize" : "512MB",
"snapshotPeriod" : "PT15M",
"logIndexDensity" : 100,
"logSegmentSizeInBytes" : 536870912,
"atomixStorageLevel" : "DISK"
},
"exporters" : {
"elasticsearch" : {
"jarPath" : null,
"className" : "io.zeebe.exporter.ElasticsearchExporter",
"args" : {
"url" : "http://elasticsearch-master:9200"
},
"external" : false
}
},
"gateway" : {
"network" : {
"host" : "0.0.0.0",
"port" : 26500,
"minKeepAliveInterval" : "PT30S"
},
"cluster" : {
"contactPoint" : "0.0.0.0:26502",
"requestTimeout" : "PT15S",
"clusterName" : "zeebe-cluster",
"memberId" : "gateway",
"host" : "0.0.0.0",
"port" : 26502,
"membership" : {
"broadcastUpdates" : false,
"broadcastDisputes" : true,
"notifySuspect" : false,
"gossipInterval" : "PT0.25S",
"gossipFanout" : 2,
"probeInterval" : "PT1S",
"probeTimeout" : "PT2S",
"suspectProbes" : 3,
"failureTimeout" : "PT10S",
"syncInterval" : "PT10S"
}
},
"threads" : {
"managementThreads" : 1
},
"monitoring" : {
"enabled" : false,
"host" : "0.0.0.0",
"port" : 9600
},
"security" : {
"enabled" : false,
"certificateChainPath" : null,
"privateKeyPath" : null
},
"longPolling" : {
"enabled" : true
},
"initialized" : true,
"enable" : false
},
"backpressure" : {
"enabled" : true,
"algorithm" : "VEGAS",
"aimd" : {
"requestTimeout" : "PT1S",
"initialLimit" : 100,
"minLimit" : 1,
"maxLimit" : 1000,
"backoffRatio" : 0.9
},
"fixedLimit" : {
"limit" : 20
},
"vegas" : {
"alpha" : 3,
"beta" : 6,
"initialLimit" : 20
},
"gradient" : {
"minLimit" : 10,
"initialLimit" : 20,
"rttTolerance" : 2.0
},
"gradient2" : {
"minLimit" : 10,
"initialLimit" : 20,
"rttTolerance" : 2.0,
"longWindow" : 600
}
},
"stepTimeout" : "PT5M",
"executionMetricsExporterEnabled" : false
}
2020-10-23 06:46:11.025 [] [main] INFO io.zeebe.broker.system - Bootstrap Broker-0 [1/10]: actor scheduler
2020-10-23 06:46:11.028 [] [main] INFO io.zeebe.broker.system - Bootstrap Broker-0 [2/10]: membership and replication protocol
2020-10-23 06:46:12.014 [] [http-nio-0.0.0.0-9600-exec-1] INFO org.apache.catalina.core.ContainerBase.[Tomcat].[localhost].[/] - Initializing Spring DispatcherServlet 'dispatcherServlet'
2020-10-23 06:46:12.020 [] [http-nio-0.0.0.0-9600-exec-1] INFO org.springframework.web.servlet.DispatcherServlet - Initializing Servlet 'dispatcherServlet'
2020-10-23 06:46:12.051 [] [http-nio-0.0.0.0-9600-exec-1] INFO org.springframework.web.servlet.DispatcherServlet - Completed initialization in 30 ms
2020-10-23 06:46:14.746 [] [main] INFO io.zeebe.broker.system - Bootstrap Broker-0 [3/10]: command api transport
2020-10-23 06:46:15.002 [] [main] INFO io.zeebe.broker.system - Bootstrap Broker-0 [4/10]: command api handler
2020-10-23 06:46:15.053 [] [main] INFO io.zeebe.broker.system - Bootstrap Broker-0 [5/10]: subscription api
2020-10-23 06:46:15.097 [] [main] INFO io.zeebe.broker.system - Bootstrap Broker-0 [6/10]: cluster services
2020-10-23 06:46:16.736 [] [main] INFO io.zeebe.broker.system - Bootstrap Broker-0 [7/10]: topology manager
2020-10-23 06:46:16.737 [] [main] INFO io.zeebe.broker.system - Bootstrap Broker-0 [8/10]: monitoring services
2020-10-23 06:46:16.742 [] [main] INFO io.zeebe.broker.system - Bootstrap Broker-0 [9/10]: leader management request handler
2020-10-23 06:46:16.746 [] [main] INFO io.zeebe.broker.system - Bootstrap Broker-0 [10/10]: zeebe partitions
2020-10-23 06:46:16.750 [] [main] INFO io.zeebe.broker.system - Bootstrap Broker-0 partitions [1/3]: partition 3
2020-10-23 06:46:17.027 [] [main] INFO io.zeebe.broker.system - Bootstrap Broker-0 partitions [2/3]: partition 2
2020-10-23 06:46:17.046 [] [main] INFO io.zeebe.broker.system - Bootstrap Broker-0 partitions [3/3]: partition 1
2020-10-23 06:46:17.066 [] [main] INFO io.zeebe.broker.system - Bootstrap Broker-0 partitions succeeded. Started 3 steps in 316 ms.
2020-10-23 06:46:17.067 [] [main] INFO io.zeebe.broker.system - Bootstrap Broker-0 succeeded. Started 10 steps in 6044 ms.
@BigeYoung thanks for trying all of this out. The YAML file was my mistake; you can change that file so it works with the full chart, and I will update it so it doesn't cause problems.
I notice that you are still requesting 2 GB of memory, so even if the limits are higher, the JVM might be choking on memory (which would explain why you are still getting OOM with error 137). Can you try
resources:
  limits:
    cpu: 2
    memory: 6Gi
  requests:
    cpu: 1
    memory: 3Gi
? I will create a cluster today on GKE to double-check and see if I can reproduce your error. I just need to finish some stuff first.
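For readers following the memory discussion, here is a minimal sketch of how the JVM heap relates to the container limit. It assumes the JVM is container-aware and applies `-XX:MaxRAMPercentage` to the container memory limit rather than to the request; the value 25.0 comes from the `JAVA_TOOL_OPTIONS` of the broker pod shown later in this thread, and the function name is hypothetical.

```python
# Sketch: estimate the default JVM max heap inside a container.
# Assumption: the JVM is container-aware and applies MaxRAMPercentage
# to the container memory limit (not to the memory request).

GIB = 1024 ** 3


def default_max_heap_bytes(memory_limit_bytes: int, max_ram_percentage: float) -> int:
    """Return the approximate default max heap for a given container limit."""
    return int(memory_limit_bytes * max_ram_percentage / 100.0)


limit = 6 * GIB  # the 6Gi memory limit suggested above
heap = default_max_heap_bytes(limit, 25.0)  # 25.0 taken from JAVA_TOOL_OPTIONS
print(f"approx. max heap: {heap / GIB:.2f} GiB")  # prints "approx. max heap: 1.50 GiB"
```

Under this assumption the heap is only a fraction of the limit. Memory used outside the heap, such as native allocations, also counts toward the container limit.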
@BigeYoung can you share your cluster details? Kubernetes version and size (type of nodes and amount)? That is something that might help me to replicate the issue faster, to make sure that we have similar setups.
I have 4 nodes, named server1 through server4, all running CentOS 7 and Kubernetes v1.19.2. server1 is the master node. server1, server2, and server3 are Dell PowerEdge R240 machines, each with a 4-core Intel(R) Xeon(R) E-2224 CPU @ 3.40GHz, 32GB of memory, and a 2TB hard drive. server4 is an Inspur server with 16GB of memory, a 2TB hard disk, and 8 CPUs (Intel(R) Xeon(R) CPU E5-2609 v2 @ 2.50GHz).
➜ ~ kubectl describe nodes
Name: server1
Roles: master
Labels: beta.kubernetes.io/arch=amd64
beta.kubernetes.io/os=linux
kubernetes.io/arch=amd64
kubernetes.io/hostname=server1
kubernetes.io/os=linux
node-role.kubernetes.io/master=
Annotations: flannel.alpha.coreos.com/backend-data: {"VtepMAC":"de:34:67:a5:c3:e4"}
flannel.alpha.coreos.com/backend-type: vxlan
flannel.alpha.coreos.com/kube-subnet-manager: true
flannel.alpha.coreos.com/public-ip: 116.57.98.121
kubeadm.alpha.kubernetes.io/cri-socket: /var/run/dockershim.sock
node.alpha.kubernetes.io/ttl: 0
volumes.kubernetes.io/controller-managed-attach-detach: true
CreationTimestamp: Sun, 27 Sep 2020 11:32:48 +0800
Taints: node-role.kubernetes.io/master:NoSchedule
Unschedulable: false
Lease:
HolderIdentity: server1
AcquireTime: <unset>
RenewTime: Sat, 24 Oct 2020 15:44:19 +0800
Conditions:
Type Status LastHeartbeatTime LastTransitionTime Reason Message
---- ------ ----------------- ------------------ ------ -------
NetworkUnavailable False Sun, 18 Oct 2020 21:21:21 +0800 Sun, 18 Oct 2020 21:21:21 +0800 FlannelIsUp Flannel is running on this node
MemoryPressure False Sat, 24 Oct 2020 15:43:59 +0800 Sun, 27 Sep 2020 11:32:46 +0800 KubeletHasSufficientMemory kubelet has sufficient memory available
DiskPressure False Sat, 24 Oct 2020 15:43:59 +0800 Sun, 27 Sep 2020 11:32:46 +0800 KubeletHasNoDiskPressure kubelet has no disk pressure
PIDPressure False Sat, 24 Oct 2020 15:43:59 +0800 Sun, 27 Sep 2020 11:32:46 +0800 KubeletHasSufficientPID kubelet has sufficient PID available
Ready True Sat, 24 Oct 2020 15:43:59 +0800 Sun, 27 Sep 2020 14:40:54 +0800 KubeletReady kubelet is posting ready status
Addresses:
InternalIP: 192.168.137.121
Hostname: server1
Capacity:
cpu: 4
ephemeral-storage: 51175Mi
hugepages-1Gi: 0
hugepages-2Mi: 0
memory: 32623028Ki
pods: 110
Allocatable:
cpu: 4
ephemeral-storage: 48294789041
hugepages-1Gi: 0
hugepages-2Mi: 0
memory: 32520628Ki
pods: 110
System Info:
Machine ID: 00ceaf5e24814d1cacd6469737795c28
System UUID: 4C4C4544-005A-3910-8043-B9C04F313433
Boot ID: 498a7d84-cd76-46e9-bac1-e90ace4974d2
Kernel Version: 3.10.0-1127.19.1.el7.x86_64
OS Image: CentOS Linux 7 (Core)
Operating System: linux
Architecture: amd64
Container Runtime Version: docker://19.3.13
Kubelet Version: v1.19.2
Kube-Proxy Version: v1.19.2
PodCIDR: 10.244.0.0/24
PodCIDRs: 10.244.0.0/24
Non-terminated Pods: (6 in total)
Namespace Name CPU Requests CPU Limits Memory Requests Memory Limits AGE
--------- ---- ------------ ---------- --------------- ------------- ---
kube-system etcd-server1 0 (0%) 0 (0%) 0 (0%) 0 (0%) 27d
kube-system kube-apiserver-server1 250m (6%) 0 (0%) 0 (0%) 0 (0%) 27d
kube-system kube-controller-manager-server1 200m (5%) 0 (0%) 0 (0%) 0 (0%) 27d
kube-system kube-flannel-ds-hz74p 100m (2%) 100m (2%) 50Mi (0%) 50Mi (0%) 27d
kube-system kube-proxy-mb4bm 0 (0%) 0 (0%) 0 (0%) 0 (0%) 27d
kube-system kube-scheduler-server1 100m (2%) 0 (0%) 0 (0%) 0 (0%) 27d
Allocated resources:
(Total limits may be over 100 percent, i.e., overcommitted.)
Resource Requests Limits
-------- -------- ------
cpu 650m (16%) 100m (2%)
memory 50Mi (0%) 50Mi (0%)
ephemeral-storage 0 (0%) 0 (0%)
hugepages-1Gi 0 (0%) 0 (0%)
hugepages-2Mi 0 (0%) 0 (0%)
Events: <none>
Name: server2
Roles: <none>
Labels: beta.kubernetes.io/arch=amd64
beta.kubernetes.io/os=linux
kubernetes.io/arch=amd64
kubernetes.io/hostname=server2
kubernetes.io/os=linux
Annotations: flannel.alpha.coreos.com/backend-data: {"VtepMAC":"da:18:2a:3c:2c:62"}
flannel.alpha.coreos.com/backend-type: vxlan
flannel.alpha.coreos.com/kube-subnet-manager: true
flannel.alpha.coreos.com/public-ip: 116.57.98.122
kubeadm.alpha.kubernetes.io/cri-socket: /var/run/dockershim.sock
node.alpha.kubernetes.io/ttl: 0
volumes.kubernetes.io/controller-managed-attach-detach: true
CreationTimestamp: Sun, 27 Sep 2020 11:35:05 +0800
Taints: <none>
Unschedulable: false
Lease:
HolderIdentity: server2
AcquireTime: <unset>
RenewTime: Sat, 24 Oct 2020 15:44:26 +0800
Conditions:
Type Status LastHeartbeatTime LastTransitionTime Reason Message
---- ------ ----------------- ------------------ ------ -------
NetworkUnavailable False Sun, 27 Sep 2020 19:33:47 +0800 Sun, 27 Sep 2020 19:33:47 +0800 FlannelIsUp Flannel is running on this node
MemoryPressure False Sat, 24 Oct 2020 15:40:07 +0800 Sun, 27 Sep 2020 19:33:30 +0800 KubeletHasSufficientMemory kubelet has sufficient memory available
DiskPressure False Sat, 24 Oct 2020 15:40:07 +0800 Sun, 27 Sep 2020 19:33:30 +0800 KubeletHasNoDiskPressure kubelet has no disk pressure
PIDPressure False Sat, 24 Oct 2020 15:40:07 +0800 Sun, 27 Sep 2020 19:33:30 +0800 KubeletHasSufficientPID kubelet has sufficient PID available
Ready True Sat, 24 Oct 2020 15:40:07 +0800 Sun, 27 Sep 2020 19:33:30 +0800 KubeletReady kubelet is posting ready status
Addresses:
InternalIP: 192.168.137.122
Hostname: server2
Capacity:
cpu: 4
ephemeral-storage: 51175Mi
hugepages-1Gi: 0
hugepages-2Mi: 0
memory: 32623028Ki
pods: 110
Allocatable:
cpu: 4
ephemeral-storage: 48294789041
hugepages-1Gi: 0
hugepages-2Mi: 0
memory: 32520628Ki
pods: 110
System Info:
Machine ID: 67e1b7e8e01544c9a79128266c5a155d
System UUID: 4C4C4544-0043-3010-8042-B1C04F313433
Boot ID: 53219857-228a-45d8-b711-db8e0b967b10
Kernel Version: 3.10.0-1127.el7.x86_64
OS Image: CentOS Linux 7 (Core)
Operating System: linux
Architecture: amd64
Container Runtime Version: docker://19.3.13
Kubelet Version: v1.19.2
Kube-Proxy Version: v1.19.2
PodCIDR: 10.244.2.0/24
PodCIDRs: 10.244.2.0/24
Non-terminated Pods: (8 in total)
Namespace Name CPU Requests CPU Limits Memory Requests Memory Limits AGE
--------- ---- ------------ ---------- --------------- ------------- ---
default consul-consul-server-1 100m (2%) 100m (2%) 100Mi (0%) 100Mi (0%) 6d2h
default consul-consul-zcsjv 100m (2%) 100m (2%) 100Mi (0%) 100Mi (0%) 6d2h
default cpps-product-client-1-9cccddfc6-w5zjl 0 (0%) 0 (0%) 0 (0%) 0 (0%) 18d
default cpps-product-client-3-6c87b55479-jbkz8 0 (0%) 0 (0%) 0 (0%) 0 (0%) 17d
default mqtt-mosquitto-5f6dc7c898-7vq9j 0 (0%) 0 (0%) 0 (0%) 0 (0%) 27d
kube-system coredns-f9fd979d6-ktpcj 100m (2%) 0 (0%) 70Mi (0%) 170Mi (0%) 27d
kube-system kube-flannel-ds-4ff2d 100m (2%) 100m (2%) 50Mi (0%) 50Mi (0%) 27d
kube-system kube-proxy-l865g 0 (0%) 0 (0%) 0 (0%) 0 (0%) 27d
Allocated resources:
(Total limits may be over 100 percent, i.e., overcommitted.)
Resource Requests Limits
-------- -------- ------
cpu 400m (10%) 300m (7%)
memory 320Mi (1%) 420Mi (1%)
ephemeral-storage 0 (0%) 0 (0%)
hugepages-1Gi 0 (0%) 0 (0%)
hugepages-2Mi 0 (0%) 0 (0%)
Events: <none>
Name: server3
Roles: <none>
Labels: beta.kubernetes.io/arch=amd64
beta.kubernetes.io/os=linux
kubernetes.io/arch=amd64
kubernetes.io/hostname=server3
kubernetes.io/os=linux
Annotations: flannel.alpha.coreos.com/backend-data: {"VtepMAC":"1e:8c:80:2b:ed:bb"}
flannel.alpha.coreos.com/backend-type: vxlan
flannel.alpha.coreos.com/kube-subnet-manager: true
flannel.alpha.coreos.com/public-ip: 116.57.98.123
kubeadm.alpha.kubernetes.io/cri-socket: /var/run/dockershim.sock
node.alpha.kubernetes.io/ttl: 0
volumes.kubernetes.io/controller-managed-attach-detach: true
CreationTimestamp: Sun, 27 Sep 2020 11:35:10 +0800
Taints: <none>
Unschedulable: false
Lease:
HolderIdentity: server3
AcquireTime: <unset>
RenewTime: Sat, 24 Oct 2020 15:44:28 +0800
Conditions:
Type Status LastHeartbeatTime LastTransitionTime Reason Message
---- ------ ----------------- ------------------ ------ -------
NetworkUnavailable False Mon, 28 Sep 2020 14:04:39 +0800 Mon, 28 Sep 2020 14:04:39 +0800 FlannelIsUp Flannel is running on this node
MemoryPressure False Sat, 24 Oct 2020 15:41:20 +0800 Mon, 28 Sep 2020 14:04:26 +0800 KubeletHasSufficientMemory kubelet has sufficient memory available
DiskPressure False Sat, 24 Oct 2020 15:41:20 +0800 Mon, 28 Sep 2020 14:04:26 +0800 KubeletHasNoDiskPressure kubelet has no disk pressure
PIDPressure False Sat, 24 Oct 2020 15:41:20 +0800 Mon, 28 Sep 2020 14:04:26 +0800 KubeletHasSufficientPID kubelet has sufficient PID available
Ready True Sat, 24 Oct 2020 15:41:20 +0800 Mon, 28 Sep 2020 14:04:26 +0800 KubeletReady kubelet is posting ready status
Addresses:
InternalIP: 192.168.137.123
Hostname: server3
Capacity:
cpu: 4
ephemeral-storage: 51175Mi
hugepages-1Gi: 0
hugepages-2Mi: 0
memory: 32623028Ki
pods: 110
Allocatable:
cpu: 4
ephemeral-storage: 48294789041
hugepages-1Gi: 0
hugepages-2Mi: 0
memory: 32520628Ki
pods: 110
System Info:
Machine ID: 49a477c8d4f1499ab9141a053ea40bca
System UUID: 4C4C4544-005A-3710-8048-B9C04F313433
Boot ID: bdd799e9-ff66-4ba8-8b21-450de318541c
Kernel Version: 3.10.0-1127.19.1.el7.x86_64
OS Image: CentOS Linux 7 (Core)
Operating System: linux
Architecture: amd64
Container Runtime Version: docker://19.3.13
Kubelet Version: v1.19.2
Kube-Proxy Version: v1.19.2
PodCIDR: 10.244.3.0/24
PodCIDRs: 10.244.3.0/24
Non-terminated Pods: (10 in total)
Namespace Name CPU Requests CPU Limits Memory Requests Memory Limits AGE
--------- ---- ------------ ---------- --------------- ------------- ---
default consul-consul-connect-injector-webhook-deployment-5c589b88xp22t 50m (1%) 50m (1%) 50Mi (0%) 50Mi (0%) 6d2h
default consul-consul-server-0 100m (2%) 100m (2%) 100Mi (0%) 100Mi (0%) 6d2h
default consul-consul-sync-catalog-68b75f5cf-27zwk 50m (1%) 50m (1%) 50Mi (0%) 50Mi (0%) 6d2h
default consul-consul-w97pz 100m (2%) 100m (2%) 100Mi (0%) 100Mi (0%) 6d2h
default mariadb-0 0 (0%) 0 (0%) 0 (0%) 0 (0%) 5d17h
default minio-7cffb45794-gr4h5 0 (0%) 0 (0%) 4Gi (12%) 0 (0%) 26d
default minio-zeebe-bd556b68f-mmbn5 0 (0%) 0 (0%) 0 (0%) 0 (0%) 26d
kube-system kube-flannel-ds-6s9nl 100m (2%) 100m (2%) 50Mi (0%) 50Mi (0%) 27d
kube-system kube-proxy-zj7fh 0 (0%) 0 (0%) 0 (0%) 0 (0%) 27d
ua-client ua-client-launcher-7b67f56558-h4s72 0 (0%) 0 (0%) 0 (0%) 0 (0%) 19d
Allocated resources:
(Total limits may be over 100 percent, i.e., overcommitted.)
Resource Requests Limits
-------- -------- ------
cpu 400m (10%) 400m (10%)
memory 4446Mi (13%) 350Mi (1%)
ephemeral-storage 0 (0%) 0 (0%)
hugepages-1Gi 0 (0%) 0 (0%)
hugepages-2Mi 0 (0%) 0 (0%)
Events: <none>
Name: server4
Roles: <none>
Labels: beta.kubernetes.io/arch=amd64
beta.kubernetes.io/os=linux
kubernetes.io/arch=amd64
kubernetes.io/hostname=server4
kubernetes.io/os=linux
Annotations: flannel.alpha.coreos.com/backend-data: {"VtepMAC":"86:0d:d9:f4:f7:3d"}
flannel.alpha.coreos.com/backend-type: vxlan
flannel.alpha.coreos.com/kube-subnet-manager: true
flannel.alpha.coreos.com/public-ip: 116.57.98.124
kubeadm.alpha.kubernetes.io/cri-socket: /var/run/dockershim.sock
node.alpha.kubernetes.io/ttl: 0
volumes.kubernetes.io/controller-managed-attach-detach: true
CreationTimestamp: Sun, 27 Sep 2020 11:35:03 +0800
Taints: <none>
Unschedulable: false
Lease:
HolderIdentity: server4
AcquireTime: <unset>
RenewTime: Sat, 24 Oct 2020 15:44:29 +0800
Conditions:
Type Status LastHeartbeatTime LastTransitionTime Reason Message
---- ------ ----------------- ------------------ ------ -------
NetworkUnavailable False Sat, 24 Oct 2020 02:58:22 +0800 Sat, 24 Oct 2020 02:58:22 +0800 FlannelIsUp Flannel is running on this node
MemoryPressure False Sat, 24 Oct 2020 15:40:55 +0800 Tue, 20 Oct 2020 16:05:27 +0800 KubeletHasSufficientMemory kubelet has sufficient memory available
DiskPressure False Sat, 24 Oct 2020 15:40:55 +0800 Tue, 20 Oct 2020 16:05:27 +0800 KubeletHasNoDiskPressure kubelet has no disk pressure
PIDPressure False Sat, 24 Oct 2020 15:40:55 +0800 Tue, 20 Oct 2020 16:05:27 +0800 KubeletHasSufficientPID kubelet has sufficient PID available
Ready True Sat, 24 Oct 2020 15:40:55 +0800 Tue, 20 Oct 2020 16:05:27 +0800 KubeletReady kubelet is posting ready status
Addresses:
InternalIP: 192.168.137.124
Hostname: server4
Capacity:
cpu: 8
ephemeral-storage: 51175Mi
hugepages-1Gi: 0
hugepages-2Mi: 0
memory: 16235012Ki
pods: 110
Allocatable:
cpu: 8
ephemeral-storage: 48294789041
hugepages-1Gi: 0
hugepages-2Mi: 0
memory: 16132612Ki
pods: 110
System Info:
Machine ID: 472219fb3dcf4a31be3d3913de539489
System UUID: 472219fb3dcf4a31be3d3913de539489
Boot ID: 15ad69df-abd5-45e8-ae22-e54a6bd37661
Kernel Version: 3.10.0-1127.19.1.el7.x86_64
OS Image: CentOS Linux 7 (Core)
Operating System: linux
Architecture: amd64
Container Runtime Version: docker://19.3.13
Kubelet Version: v1.19.2
Kube-Proxy Version: v1.19.2
PodCIDR: 10.244.1.0/24
PodCIDRs: 10.244.1.0/24
Non-terminated Pods: (11 in total)
Namespace Name CPU Requests CPU Limits Memory Requests Memory Limits AGE
--------- ---- ------------ ---------- --------------- ------------- ---
default aml2owl-f5966bd64-7z22g 0 (0%) 0 (0%) 0 (0%) 0 (0%) 26d
default consul-consul-svhkq 100m (1%) 100m (1%) 100Mi (0%) 100Mi (0%) 6d2h
default dashboard-kubernetes-dashboard-b5944fc7c-2v8zd 100m (1%) 2 (25%) 200Mi (1%) 200Mi (1%) 27d
default nfs-nfs-client-provisioner-74bbbb5bf8-nd8cz 0 (0%) 0 (0%) 0 (0%) 0 (0%) 24d
default orderpage-backend-669b57b7fc-fcbbm 0 (0%) 0 (0%) 0 (0%) 0 (0%) 4d1h
default orderpage-frontend-655cdbd6c-x7cxf 0 (0%) 0 (0%) 0 (0%) 0 (0%) 5d1h
default scut100-9d6d68669-vp2kc 0 (0%) 0 (0%) 0 (0%) 0 (0%) 26d
ingress-nginx ingress-nginx-controller-d7f9d68cf-np9xk 100m (1%) 0 (0%) 90Mi (0%) 0 (0%) 5d1h
kube-system coredns-f9fd979d6-8qtjk 100m (1%) 0 (0%) 70Mi (0%) 170Mi (1%) 27d
kube-system kube-flannel-ds-tw9b5 100m (1%) 100m (1%) 50Mi (0%) 50Mi (0%) 27d
kube-system kube-proxy-9nkbd 0 (0%) 0 (0%) 0 (0%) 0 (0%) 27d
Allocated resources:
(Total limits may be over 100 percent, i.e., overcommitted.)
Resource Requests Limits
-------- -------- ------
cpu 500m (6%) 2200m (27%)
memory 510Mi (3%) 520Mi (3%)
ephemeral-storage 0 (0%) 0 (0%)
hugepages-1Gi 0 (0%) 0 (0%)
hugepages-2Mi 0 (0%) 0 (0%)
Events: <none>
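As a rough aid for reading the node output above, the following sketch converts the Kubernetes memory quantities into bytes and computes how much allocatable memory on server4 has not yet been requested. The helper names are hypothetical and only the binary suffixes that appear in this thread are handled.

```python
# Sketch: convert Kubernetes memory quantities to bytes and compute headroom.
# Only the binary suffixes that appear in this thread are handled.

SUFFIXES = {"Ki": 1024, "Mi": 1024 ** 2, "Gi": 1024 ** 3}


def to_bytes(quantity: str) -> int:
    """Parse a quantity such as '16132612Ki' or '3Gi' into bytes."""
    for suffix, factor in SUFFIXES.items():
        if quantity.endswith(suffix):
            return int(quantity[: -len(suffix)]) * factor
    return int(quantity)  # plain byte count without a suffix


def unrequested_bytes(allocatable: str, requested: str) -> int:
    """Allocatable memory minus the memory already requested by pods."""
    return to_bytes(allocatable) - to_bytes(requested)


# Values copied from the server4 section of the output above.
free = unrequested_bytes("16132612Ki", "510Mi")
print(f"server4 unrequested memory: {free / 1024 ** 3:.2f} GiB")
print("fits a 3Gi request:", free >= to_bytes("3Gi"))
```

Requests are a scheduling figure, not measured usage, so this kind of check says nothing about what processes on the node actually consume.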
Yes, and here's the result.
➜ ~ kubectl describe pods test-core-zeebe-0
Name: test-core-zeebe-0
Namespace: default
Priority: 0
Node: server4/192.168.137.124
Start Time: Sun, 25 Oct 2020 13:52:41 +0800
Labels: app.kubernetes.io/component=broker
app.kubernetes.io/instance=test-core
app.kubernetes.io/managed-by=Helm
app.kubernetes.io/name=zeebe-cluster
controller-revision-hash=test-core-zeebe-7c68646d8f
statefulset.kubernetes.io/pod-name=test-core-zeebe-0
Annotations: <none>
Status: Running
IP: 10.244.1.13
IPs:
IP: 10.244.1.13
Controlled By: StatefulSet/test-core-zeebe
Containers:
zeebe-cluster:
Container ID: docker://7aa201fa9d321354411a2b37e45f4b59265c029c7257e6cc93c28aad8036b276
Image: camunda/zeebe:0.24.2
Image ID: docker-pullable://camunda/zeebe@sha256:795ace31c498ad4bc37b7b0fab612307c34852f4187766e3f777a509821c9fb3
Ports: 9600/TCP, 26501/TCP, 26502/TCP
Host Ports: 0/TCP, 0/TCP, 0/TCP
State: Waiting
Reason: CrashLoopBackOff
Last State: Terminated
Reason: Error
Exit Code: 137
Started: Sun, 25 Oct 2020 13:54:14 +0800
Finished: Sun, 25 Oct 2020 13:55:02 +0800
Ready: False
Restart Count: 2
Limits:
cpu: 3
memory: 6Gi
Requests:
cpu: 2
memory: 3Gi
Readiness: http-get http://:9600/ready delay=0s timeout=1s period=10s #success=1 #failure=3
Environment:
ZEEBE_BROKER_CLUSTER_CLUSTERNAME: test-core-zeebe
ZEEBE_LOG_LEVEL:
ZEEBE_BROKER_CLUSTER_PARTITIONSCOUNT: 3
ZEEBE_BROKER_CLUSTER_CLUSTERSIZE: 3
ZEEBE_BROKER_CLUSTER_REPLICATIONFACTOR: 3
ZEEBE_BROKER_THREADS_CPUTHREADCOUNT: 2
ZEEBE_BROKER_THREADS_IOTHREADCOUNT: 2
ZEEBE_BROKER_GATEWAY_ENABLE: false
ZEEBE_BROKER_EXPORTERS_ELASTICSEARCH_CLASSNAME: io.zeebe.exporter.ElasticsearchExporter
ZEEBE_BROKER_EXPORTERS_ELASTICSEARCH_ARGS_URL: http://elasticsearch-master:9200
ZEEBE_BROKER_NETWORK_COMMANDAPI_PORT: 26501
ZEEBE_BROKER_NETWORK_INTERNALAPI_PORT: 26502
ZEEBE_BROKER_NETWORK_MONITORINGAPI_PORT: 9600
K8S_POD_NAME: test-core-zeebe-0 (v1:metadata.name)
JAVA_TOOL_OPTIONS: -XX:MaxRAMPercentage=25.0 -XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=/usr/local/zeebe/data -XX:ErrorFile=/usr/local/zeebe/data/zeebe_error%p.log -XX:+ExitOnOutOfMemoryError
Mounts:
/exporters from exporters (rw)
/usr/local/bin/startup.sh from config (rw,path="startup.sh")
/usr/local/zeebe/config/application.yaml from config (rw,path="application.yaml")
/usr/local/zeebe/data from data (rw)
/var/run/secrets/kubernetes.io/serviceaccount from default-token-mt74h (ro)
Conditions:
Type Status
Initialized True
Ready False
ContainersReady False
PodScheduled True
Volumes:
data:
Type: PersistentVolumeClaim (a reference to a PersistentVolumeClaim in the same namespace)
ClaimName: data-test-core-zeebe-0
ReadOnly: false
config:
Type: ConfigMap (a volume populated by a ConfigMap)
Name: test-core-zeebe-cluster
Optional: false
exporters:
Type: EmptyDir (a temporary directory that shares a pod's lifetime)
Medium:
SizeLimit: <unset>
default-token-mt74h:
Type: Secret (a volume populated by a Secret)
SecretName: default-token-mt74h
Optional: false
QoS Class: Burstable
Node-Selectors: <none>
Tolerations: node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Warning FailedScheduling 2m43s default-scheduler 0/4 nodes are available: 4 pod has unbound immediate PersistentVolumeClaims.
Normal Scheduled 2m43s default-scheduler Successfully assigned default/test-core-zeebe-0 to server4
Warning Unhealthy 2m26s kubelet Readiness probe failed: HTTP probe failed with statuscode: 503
Normal Pulled 70s (x3 over 2m41s) kubelet Container image "camunda/zeebe:0.24.2" already present on machine
Normal Created 70s (x3 over 2m41s) kubelet Created container zeebe-cluster
Normal Started 70s (x3 over 2m41s) kubelet Started container zeebe-cluster
Warning Unhealthy 66s (x3 over 2m36s) kubelet Readiness probe failed: Get "http://10.244.1.13:9600/ready": dial tcp 10.244.1.13:9600: connect: connection refused
Warning BackOff 10s (x3 over 82s) kubelet Back-off restarting failed container
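A note on the `Exit Code: 137` in the output above: by the common shell convention, an exit code above 128 means the process was terminated by signal number `code - 128`. The sketch below decodes that with Python's standard `signal` module on a Unix-like system; the function name is hypothetical.

```python
import signal


def describe_exit_code(code: int) -> str:
    """Explain a container exit code using the 128 + signal convention."""
    if code > 128:
        try:
            sig = signal.Signals(code - 128)
        except ValueError:
            # Not a known signal number on this platform.
            return f"exited with status {code}"
        return f"killed by signal {sig.value} ({sig.name})"
    return f"exited with status {code}"


print(describe_exit_code(137))  # on Linux: killed by signal 9 (SIGKILL)
```

SIGKILL is what the kernel OOM killer sends, but other things can send it too. Kubernetes usually reports an OOM kill with the reason `OOMKilled`, whereas the pod above shows `Reason: Error`, so the exit code alone does not prove a memory problem.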
@BigeYoung thanks a lot for all these details. A couple more questions from my side:
- Is this GKE or hosted on-prem?
- Are you using Istio or any other service mesh?
@BigeYoung that is what I was afraid of. Because it is an on-premise cluster, it is pretty difficult for us to reproduce. A number of things could be going wrong, such as networking or storage. I can see that you are using flannel there, so again, it is pretty difficult for me to replicate.
Can you run other workloads in that cluster? You mentioned Consul; why is that failing?
Oh, I'm sorry that I didn't make myself clear, so you misunderstood me. All the other workloads on the cluster are running very well, including Consul. I meant that I only put Consul on the cluster for testing and haven't used it to connect with other workloads yet.
I tried to start it directly using Docker. Strangely, it doesn't print any errors, but it stops running after the boot step.
[bige@server4 ~]$ sudo docker run camunda/zeebe:0.24.2
++ hostname -i
+ export ZEEBE_HOST=172.17.0.2
+ ZEEBE_HOST=172.17.0.2
+ '[' false = true ']'
+ export ZEEBE_BROKER_NETWORK_HOST=172.17.0.2
+ ZEEBE_BROKER_NETWORK_HOST=172.17.0.2
+ export ZEEBE_BROKER_GATEWAY_CLUSTER_HOST=172.17.0.2
+ ZEEBE_BROKER_GATEWAY_CLUSTER_HOST=172.17.0.2
+ exec /usr/local/zeebe/bin/broker
______ ______ ______ ____ ______ ____ _____ ____ _ __ ______ _____
|___ / | ____| | ____| | _ \ | ____| | _ \ | __ \ / __ \ | |/ / | ____| | __ \
/ / | |__ | |__ | |_) | | |__ | |_) | | |__) | | | | | | ' / | |__ | |__) |
/ / | __| | __| | _ < | __| | _ < | _ / | | | | | < | __| | _ /
/ /__ | |____ | |____ | |_) | | |____ | |_) | | | \ \ | |__| | | . \ | |____ | | \ \
/_____| |______| |______| |____/ |______| |____/ |_| \_\ \____/ |_|\_\ |______| |_| \_\
2020-10-27 01:28:13.566 [] [main] INFO io.zeebe.broker.StandaloneBroker - Starting StandaloneBroker v0.24.2 on 66e2d4590f6f with PID 6 (/usr/local/zeebe/lib/zeebe-distribution-0.24.2.jar started by root in /usr/local/zeebe)
2020-10-27 01:28:13.584 [] [main] INFO io.zeebe.broker.StandaloneBroker - No active profile set, falling back to default profiles: default
2020-10-27 01:28:19.038 [] [main] INFO org.springframework.boot.web.embedded.tomcat.TomcatWebServer - Tomcat initialized with port(s): 9600 (http)
2020-10-27 01:28:19.093 [] [main] INFO org.apache.coyote.http11.Http11NioProtocol - Initializing ProtocolHandler ["http-nio-172.17.0.2-9600"]
2020-10-27 01:28:19.095 [] [main] INFO org.apache.catalina.core.StandardService - Starting service [Tomcat]
2020-10-27 01:28:19.096 [] [main] INFO org.apache.catalina.core.StandardEngine - Starting Servlet engine: [Apache Tomcat/9.0.36]
2020-10-27 01:28:19.379 [] [main] INFO org.apache.catalina.core.ContainerBase.[Tomcat].[localhost].[/] - Initializing Spring embedded WebApplicationContext
2020-10-27 01:28:19.380 [] [main] INFO org.springframework.boot.web.servlet.context.ServletWebServerApplicationContext - Root WebApplicationContext: initialization completed in 5645 ms
2020-10-27 01:28:20.558 [] [main] INFO org.springframework.scheduling.concurrent.ThreadPoolTaskExecutor - Initializing ExecutorService 'applicationTaskExecutor'
2020-10-27 01:28:21.282 [] [main] INFO org.springframework.boot.actuate.endpoint.web.EndpointLinksResolver - Exposing 2 endpoint(s) beneath base path '/actuator'
2020-10-27 01:28:21.373 [] [main] INFO org.apache.coyote.http11.Http11NioProtocol - Starting ProtocolHandler ["http-nio-172.17.0.2-9600"]
2020-10-27 01:28:21.565 [] [main] INFO org.springframework.boot.web.embedded.tomcat.TomcatWebServer - Tomcat started on port(s): 9600 (http) with context path ''
2020-10-27 01:28:21.624 [] [main] INFO io.zeebe.broker.StandaloneBroker - Started StandaloneBroker in 9.67 seconds (JVM running for 13.319)
2020-10-27 01:28:21.813 [] [main] INFO io.zeebe.broker.system - Version: 0.24.2
2020-10-27 01:28:21.940 [] [main] INFO io.zeebe.broker.system - Starting broker 0 with configuration {
"network" : {
"host" : "172.17.0.2",
"portOffset" : 0,
"maxMessageSize" : "4MB",
"advertisedHost" : "172.17.0.2",
"commandApi" : {
"host" : "172.17.0.2",
"port" : 26501,
"advertisedHost" : "172.17.0.2",
"advertisedPort" : 26501,
"advertisedAddress" : "172.17.0.2:26501",
"address" : "172.17.0.2:26501"
},
"internalApi" : {
"host" : "172.17.0.2",
"port" : 26502,
"advertisedHost" : "172.17.0.2",
"advertisedPort" : 26502,
"advertisedAddress" : "172.17.0.2:26502",
"address" : "172.17.0.2:26502"
},
"monitoringApi" : {
"host" : "172.17.0.2",
"port" : 9600,
"advertisedHost" : "172.17.0.2",
"advertisedPort" : 9600,
"advertisedAddress" : "172.17.0.2:9600",
"address" : "172.17.0.2:9600"
},
"maxMessageSizeInBytes" : 4194304
},
"cluster" : {
"initialContactPoints" : [ ],
"partitionIds" : [ 1 ],
"nodeId" : 0,
"partitionsCount" : 1,
"replicationFactor" : 1,
"clusterSize" : 1,
"clusterName" : "zeebe-cluster",
"membership" : {
"broadcastUpdates" : false,
"broadcastDisputes" : true,
"notifySuspect" : false,
"gossipInterval" : "PT0.25S",
"gossipFanout" : 2,
"probeInterval" : "PT1S",
"probeTimeout" : "PT2S",
"suspectProbes" : 3,
"failureTimeout" : "PT10S",
"syncInterval" : "PT10S"
}
},
"threads" : {
"cpuThreadCount" : 2,
"ioThreadCount" : 2
},
"data" : {
"directories" : [ "/usr/local/zeebe/data" ],
"logSegmentSize" : "512MB",
"snapshotPeriod" : "PT15M",
"logIndexDensity" : 100,
"logSegmentSizeInBytes" : 536870912,
"atomixStorageLevel" : "DISK"
},
"exporters" : { },
"gateway" : {
"network" : {
"host" : "0.0.0.0",
"port" : 26500,
"minKeepAliveInterval" : "PT30S"
},
"cluster" : {
"contactPoint" : "172.17.0.2:26502",
"requestTimeout" : "PT15S",
"clusterName" : "zeebe-cluster",
"memberId" : "gateway",
"host" : "172.17.0.2",
"port" : 26502,
"membership" : {
"broadcastUpdates" : false,
"broadcastDisputes" : true,
"notifySuspect" : false,
"gossipInterval" : "PT0.25S",
"gossipFanout" : 2,
"probeInterval" : "PT1S",
"probeTimeout" : "PT2S",
"suspectProbes" : 3,
"failureTimeout" : "PT10S",
"syncInterval" : "PT10S"
}
},
"threads" : {
"managementThreads" : 1
},
"monitoring" : {
"enabled" : false,
"host" : "172.17.0.2",
"port" : 9600
},
"security" : {
"enabled" : false,
"certificateChainPath" : null,
"privateKeyPath" : null
},
"longPolling" : {
"enabled" : true
},
"initialized" : true,
"enable" : true
},
"backpressure" : {
"enabled" : true,
"algorithm" : "VEGAS",
"aimd" : {
"requestTimeout" : "PT1S",
"initialLimit" : 100,
"minLimit" : 1,
"maxLimit" : 1000,
"backoffRatio" : 0.9
},
"fixedLimit" : {
"limit" : 20
},
"vegas" : {
"alpha" : 3,
"beta" : 6,
"initialLimit" : 20
},
"gradient" : {
"minLimit" : 10,
"initialLimit" : 20,
"rttTolerance" : 2.0
},
"gradient2" : {
"minLimit" : 10,
"initialLimit" : 20,
"rttTolerance" : 2.0,
"longWindow" : 600
}
},
"stepTimeout" : "PT5M",
"executionMetricsExporterEnabled" : false
}
2020-10-27 01:28:21.992 [] [main] INFO io.zeebe.broker.system - Bootstrap Broker-0 [1/11]: actor scheduler
2020-10-27 01:28:22.028 [] [main] INFO io.zeebe.broker.system - Bootstrap Broker-0 [2/11]: membership and replication protocol
2020-10-27 01:28:26.489 [] [main] INFO io.zeebe.broker.system - Bootstrap Broker-0 [3/11]: command api transport
2020-10-27 01:28:27.022 [] [main] INFO io.zeebe.broker.system - Bootstrap Broker-0 [4/11]: command api handler
2020-10-27 01:28:27.136 [] [main] INFO io.zeebe.broker.system - Bootstrap Broker-0 [5/11]: subscription api
2020-10-27 01:28:27.221 [] [main] INFO io.zeebe.broker.system - Bootstrap Broker-0 [6/11]: embedded gateway
2020-10-27 01:28:27.233 [] [main] INFO io.zeebe.gateway - Version: 0.24.2
2020-10-27 01:28:27.239 [] [main] INFO io.zeebe.gateway - Starting gateway with configuration {
"network" : {
"host" : "0.0.0.0",
"port" : 26500,
"minKeepAliveInterval" : "PT30S"
},
"cluster" : {
"contactPoint" : "172.17.0.2:26502",
"requestTimeout" : "PT15S",
"clusterName" : "zeebe-cluster",
"memberId" : "gateway",
"host" : "172.17.0.2",
"port" : 26502,
"membership" : {
"broadcastUpdates" : false,
"broadcastDisputes" : true,
"notifySuspect" : false,
"gossipInterval" : "PT0.25S",
"gossipFanout" : 2,
"probeInterval" : "PT1S",
"probeTimeout" : "PT2S",
"suspectProbes" : 3,
"failureTimeout" : "PT10S",
"syncInterval" : "PT10S"
}
},
"threads" : {
"managementThreads" : 1
},
"monitoring" : {
"enabled" : false,
"host" : "172.17.0.2",
"port" : 9600
},
"security" : {
"enabled" : false,
"certificateChainPath" : null,
"privateKeyPath" : null
},
"longPolling" : {
"enabled" : true
},
"initialized" : true,
"enable" : true
}
2020-10-27 01:28:27.629 [] [main] INFO io.zeebe.broker.system - Bootstrap Broker-0 [7/11]: cluster services
2020-10-27 01:28:28.195 [] [raft-server-0-raft-partition-partition-1] WARN io.atomix.utils.event.ListenerRegistry - Listener io.atomix.raft.roles.FollowerRole$$Lambda$946/0x000000080079c840@764b03b6 not registered
2020-10-27 01:28:28.428 [] [main] INFO io.zeebe.broker.system - Bootstrap Broker-0 [8/11]: topology manager
2020-10-27 01:28:28.432 [] [main] INFO io.zeebe.broker.system - Bootstrap Broker-0 [9/11]: monitoring services
2020-10-27 01:28:28.442 [] [main] INFO io.zeebe.broker.system - Bootstrap Broker-0 [10/11]: leader management request handler
2020-10-27 01:28:28.448 [] [main] INFO io.zeebe.broker.system - Bootstrap Broker-0 [11/11]: zeebe partitions
2020-10-27 01:28:28.453 [] [main] INFO io.zeebe.broker.system - Bootstrap Broker-0 partitions [1/1]: partition 1
2020-10-27 01:28:30.836 [Broker-0-StreamProcessor-1] [Broker-0-zb-actors-0] INFO org.camunda.feel.FeelEngine - Engine created. [value-mapper: CompositeValueMapper(List(io.zeebe.el.impl.feel.MessagePackValueMapper@732f28bf)), function-provider: io.zeebe.el.impl.feel.FeelFunctionProvider@1fcba24f, configuration: Configuration(false)]
2020-10-27 01:28:31.152 [Broker-0-StreamProcessor-1] [Broker-0-zb-actors-0] INFO io.zeebe.logstreams - Recovered state of partition 1 from snapshot at position -1
2020-10-27 01:28:31.195 [Broker-0-StreamProcessor-1] [Broker-0-zb-actors-0] INFO org.camunda.feel.FeelEngine - Engine created. [value-mapper: CompositeValueMapper(List(io.zeebe.el.impl.feel.MessagePackValueMapper@3cda798c)), function-provider: io.zeebe.el.impl.feel.FeelFunctionProvider@28b889ef, configuration: Configuration(false)]
2020-10-27 01:28:31.228 [Broker-0-StreamProcessor-1] [Broker-0-zb-actors-0] INFO org.camunda.feel.FeelEngine - Engine created. [value-mapper: CompositeValueMapper(List(io.zeebe.el.impl.feel.MessagePackValueMapper@cdd70a)), function-provider: io.zeebe.el.impl.feel.FeelFunctionProvider@19d449ed, configuration: Configuration(false)]
2020-10-27 01:28:31.582 [] [main] INFO io.zeebe.broker.system - Bootstrap Broker-0 partitions succeeded. Started 1 steps in 3128 ms.
2020-10-27 01:28:31.583 [] [main] INFO io.zeebe.broker.system - Bootstrap Broker-0 succeeded. Started 11 steps in 9593 ms.
And then the docker container stopped.
I found the problem. My server was infected with a mining virus, which caused extremely high CPU usage. Sorry for wasting your time.
@BigeYoung oh wow.. so it was running out of memory.. that is crazy, I am sorry to hear that.. can you confirm that now it is working for you?
Yes, it works well. Sorry for wasting your time, and thank you very much for your patience!
@BigeYoung no worries, I am here to help.. feel free to reach out if you find more issues with zeebe or if you want to provide feedback about it.
Zeebe has been running on my Kubernetes cluster for a while, and it has been working very well. Recently, however, I found that a Pod kept reporting errors and restarting. I tried using helm to delete the chart, using kubectl to delete all pvc/pv, and even logging into the node and using docker to remove the image, then using helm to reinstall zeebe. I repeated these steps many times, but the problem still exists.
...and for the log
Version Info: