cetic / helm-nifi

Helm Chart for Apache Nifi
Apache License 2.0

[cetic/nifi] CrashLoopBackOff #39

Closed devopsdymyr closed 3 years ago

devopsdymyr commented 4 years ago

Describe the bug: I deployed NiFi using the Helm chart and it worked fine, but 20 days later I got the error below.

Version of Helm and Kubernetes:

ubuntu@ip-10-0-0-202:~$ helm version --short
Client: v2.14.3+g0e7f3b6
Server: v2.14.2+ga8b13cc

What happened:


error:

2020-01-02 04:50:52,662 INFO [main] o.a.n.c.c.node.NodeClusterCoordinator dev-nifi-2.dev-nifi-headless.dev.svc.cluster.local:8080 requested disconnection from cluster due to org.apache.nifi.controller.UninheritableFlowException: Failed to connect node to cluster because local flow is different than cluster flow.
2020-01-02 04:50:52,663 INFO [main] o.a.n.c.c.node.NodeClusterCoordinator Status of dev-nifi-2.dev-nifi-headless.dev.svc.cluster.local:8080 changed from NodeConnectionStatus[nodeId=dev-nifi-2.dev-nifi-headless.dev.svc.cluster.local:8080, state=CONNECTING, updateId=1754] to NodeConnectionStatus[nodeId=dev-nifi-2.dev-nifi-headless.dev.svc.cluster.local:8080, state=DISCONNECTED, Disconnect Code=Node's Flow did not Match Cluster Flow, Disconnect Reason=org.apache.nifi.controller.UninheritableFlowException: Failed to connect node to cluster because local flow is different than cluster flow., updateId=1754]
2020-01-02 04:50:52,672 ERROR [main] o.a.n.c.c.node.NodeClusterCoordinator Event Reported for dev-nifi-2.dev-nifi-headless.dev.svc.cluster.local:8080 -- Node disconnected from cluster due to org.apache.nifi.controller.UninheritableFlowException: Failed to connect node to cluster because local flow is different than cluster flow.
2020-01-02 04:50:52,672 INFO [main] o.a.n.c.l.e.CuratorLeaderElectionManager Cannot unregister Leader Election Role 'Primary Node' becuase that role is not registered
2020-01-02 04:50:52,673 WARN [main] org.apache.nifi.web.server.JettyServer Failed to start web server... shutting down.
java.lang.IllegalStateException: Already closed or has not been started
    at com.google.common.base.Preconditions.checkState(Preconditions.java:173)
    at org.apache.curator.framework.recipes.leader.LeaderSelector.close(LeaderSelector.java:270)
    at org.apache.nifi.controller.leader.election.CuratorLeaderElectionManager.unregister(CuratorLeaderElectionManager.java:152)
    at org.apache.nifi.controller.FlowController.setClustered(FlowController.java:2217)
    at org.apache.nifi.controller.StandardFlowService.handleConnectionFailure(StandardFlowService.java:578)
    at org.apache.nifi.controller.StandardFlowService.load(StandardFlowService.java:542)
    at org.apache.nifi.web.server.JettyServer.start(JettyServer.java:1009)
    at org.apache.nifi.NiFi.<init>(NiFi.java:158)
    at org.apache.nifi.NiFi.<init>(NiFi.java:72)
    at org.apache.nifi.NiFi.main(NiFi.java:297)
2020-01-02 04:50:52,673 INFO [Thread-1] org.apache.nifi.NiFi Initiating shutdown of Jetty web server...
2020-01-02 04:50:52,677 INFO [Process Cluster Protocol Request-2] o.a.n.c.c.node.NodeClusterCoordinator Status of dev-nifi-2.dev-nifi-headless.dev.svc.cluster.local:8080 changed from NodeConnectionStatus[nodeId=dev-nifi-2.dev-nifi-headless.dev.svc.cluster.local:8080, state=DISCONNECTED, Disconnect Code=Node's Flow did not Match Cluster Flow, Disconnect Reason=org.apache.nifi.controller.UninheritableFlowException: Failed to connect node to cluster because local flow is different than cluster flow., updateId=1754] to NodeConnectionStatus[nodeId=dev-nifi-2.dev-nifi-headless.dev.svc.cluster.local:8080, state=DISCONNECTED, Disconnect Code=Node's Flow did not Match Cluster Flow, Disconnect Reason=org.apache.nifi.controller.UninheritableFlowException: Failed to connect node to cluster because local flow is different than cluster flow., updateId=1754]
2020-01-02 04:50:52,677 INFO [Process Cluster Protocol Request-2] o.a.n.c.p.impl.SocketProtocolListener Finished processing request 0d664ace-24f4-4443-af02-235a34f0cc44 (type=NODE_STATUS_CHANGE, length=1639 bytes) from dev-nifi-1.dev-nifi-headless.dev.svc.cluster.local in 1 millis
2020-01-02 04:50:52,679 INFO [Thread-1] o.eclipse.jetty.server.AbstractConnector Stopped ServerConnector@3a94d716{HTTP/1.1,[http/1.1]}{dev-nifi-2.dev-nifi-headless.dev.svc.cluster.local:8080}
2020-01-02 04:50:52,679 INFO [Thread-1] org.eclipse.jetty.server.session node0 Stopped scavenging

What you expected to happen:

How to reproduce it (as minimally and precisely as possible):

Anything else we need to know:

banzo commented 4 years ago

Can you compare the flow.xml.gz in the various nodes?

devopsdymyr commented 4 years ago

The file is in the path. Let me know how I can compare them.

ados1991 commented 4 years ago

Hello, any update? I am facing the same issue.

scottwallred commented 4 years ago

Has anyone had any luck coming up with a solid solution for this issue yet?

banzo commented 4 years ago

devopsdymyr wrote: "file is in path, let me know how I can compare"

Get the flow.xml.gz files from the various pods, decompress them (they are gzip files, not tar archives), and compare them. They should be identical; if they are not, please tell us what the difference is.
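The comparison described above could be sketched roughly as follows in Python. This is only an illustration, not something the chart provides: it assumes each pod's flow.xml.gz has already been copied to the local machine (for example with kubectl cp), and the function names and local file names are hypothetical.

```python
import difflib
import gzip


def read_flow(path):
    """Decompress a gzipped flow file and return its text as a list of lines."""
    with gzip.open(path, "rt", encoding="utf-8") as fh:
        return fh.read().splitlines()


def diff_flows(path_a, path_b):
    """Return a unified diff between two gzipped flows; empty list if identical."""
    return list(
        difflib.unified_diff(
            read_flow(path_a),
            read_flow(path_b),
            fromfile=str(path_a),
            tofile=str(path_b),
            lineterm="",
        )
    )


def compare_all(paths):
    """Diff every flow against the first one; map each other path to its diff."""
    reference, *others = paths
    return {str(p): diff_flows(reference, p) for p in others}
```

For instance, with hypothetical local copies named flow-nifi-0.xml.gz, flow-nifi-1.xml.gz and flow-nifi-2.xml.gz, calling compare_all on that list returns an empty diff for every node whose flow matches the first one, and the differing lines for any node that does not.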

@ados1991 @scottwallred can you provide some pointers on how to reproduce the issue? Something that happens after 20 days is difficult to debug.

chancethecoder commented 3 years ago

I just downgraded image.tag from 1.12.1 to 1.11.0 and CrashLoopBackOff disappeared.

Error messages I had before:

Caused by: org.springframework.beans.factory.BeanCreationException: Error creating bean with name 'flowService': FactoryBean threw exception on object creation; nested exception is org.springframework.beans.factory.BeanCreationException: Error creating bean with name 'flowController' defined in class path resource [nifi-context.xml]: Cannot resolve reference to bean 'clusterCoordinator' while setting bean property 'clusterCoordinator'; nested exception is org.springframework.beans.factory.BeanCreationException: Error creating bean with name 'clusterCoordinator': FactoryBean threw exception on object creation; nested exception is java.lang.IllegalStateException: Error configuring TLS for state manager
...
Caused by: java.lang.IllegalStateException: Error configuring TLS for state manager
...
Caused by: org.apache.nifi.security.util.TlsException: The keystore properties are not valid