apache / solr-operator

Official Kubernetes operator for Apache Solr
https://solr.apache.org/operator
Apache License 2.0

Getting crashloopbackoff error in zookeeper cluster #577

Closed nandan-dm closed 1 year ago

nandan-dm commented 1 year ago

Hi, I have installed the Solr operator in the solr namespace, which installs both Solr and ZooKeeper in different namespaces, i.e. jx-qat1, jx-qat2, and jx-qat3. In jx-qat2 and jx-qat3 ZooKeeper is broken, and I am getting the error below.

ERROR [ListenerHandler-tonline-ocr-solrcloud-zookeeper-1.tonline-ocr-solrcloud-zookeeper-headless.jx-qat1.svc.cluster.local/10.96.145.199:3888:QuorumCnxManager$Listener$ListenerHandler@1094] - Exception while listening
java.net.BindException: Cannot assign requested address (Bind failed)
    at java.base/java.net.PlainSocketImpl.socketBind(Native Method)
    at java.base/java.net.AbstractPlainSocketImpl.bind(Unknown Source)
    at java.base/java.net.ServerSocket.bind(Unknown Source)
    at java.base/java.net.ServerSocket.bind(Unknown Source)
    at org.apache.zookeeper.server.quorum.QuorumCnxManager$Listener$ListenerHandler.createNewServerSocket(QuorumCnxManager.java:1136)
    at org.apache.zookeeper.server.quorum.QuorumCnxManager$Listener$ListenerHandler.acceptConnections(QuorumCnxManager.java:1065)
    at org.apache.zookeeper.server.quorum.QuorumCnxManager$Listener$ListenerHandler.run(QuorumCnxManager.java:1034)
    at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Unknown Source)
    at java.base/java.util.concurrent.FutureTask.run(Unknown Source)
    at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
    at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
    at java.base/java.lang.Thread.run(Unknown Source)

jx-qat1 pod/tonline-ocr-solrcloud-0 1/1 Running 0 99m
jx-qat1 pod/tonline-ocr-solrcloud-1 1/1 Running 0 98m
jx-qat1 pod/tonline-ocr-solrcloud-zookeeper-0 1/1 Running 0 21h
jx-qat1 pod/tonline-ocr-solrcloud-zookeeper-1 1/1 Running 0 18m
jx-qat2 pod/tonline-ocr-solrcloud-0 0/1 CrashLoopBackOff 9 (62s ago) 19m
jx-qat2 pod/tonline-ocr-solrcloud-1 0/1 CrashLoopBackOff 31 (3m12s ago) 95m
jx-qat2 pod/tonline-ocr-solrcloud-zookeeper-0 1/1 Running 0 19h
jx-qat2 pod/tonline-ocr-solrcloud-zookeeper-1 0/1 CrashLoopBackOff 97 (108s ago) 8h
jx-qat3 pod/tonline-ocr-solrcloud-0 0/1 CrashLoopBackOff 13 (46s ago) 33m
jx-qat3 pod/tonline-ocr-solrcloud-1 0/1 Running 32 (5m14s ago) 96m
jx-qat3 pod/tonline-ocr-solrcloud-zookeeper-0 1/1 Running 0 6d2h
jx-qat3 pod/tonline-ocr-solrcloud-zookeeper-1 0/1 ContainerCreating 0 8d
solr pod/solr-operator-79547c485f-zmmwl 1/1 Running 0 8h
solr pod/solr-operator-zookeeper-operator-7fb8c6f4c8-x7xx6 1/1 Running 0 4h57m

jx-qat1 service/tonline-ocr-solrcloud-common ClusterIP 172.20.76.166 80/TCP 17d
jx-qat1 service/tonline-ocr-solrcloud-headless ClusterIP None 8983/TCP 17d
jx-qat1 service/tonline-ocr-solrcloud-zookeeper-admin-server ClusterIP 172.20.66.84 8080/TCP 17d
jx-qat1 service/tonline-ocr-solrcloud-zookeeper-client ClusterIP 172.20.91.197 2181/TCP 17d
jx-qat1 service/tonline-ocr-solrcloud-zookeeper-headless ClusterIP None 2181/TCP,2888/TCP,3888/TCP,7000/TCP,8080/TCP 17d
jx-qat2 service/tonline-ocr-solrcloud-common ClusterIP 172.20.186.199 80/TCP 20d
jx-qat2 service/tonline-ocr-solrcloud-headless ClusterIP None 8983/TCP 20d
jx-qat2 service/tonline-ocr-solrcloud-zookeeper-admin-server ClusterIP 172.20.101.103 8080/TCP 20d
jx-qat2 service/tonline-ocr-solrcloud-zookeeper-client ClusterIP 172.20.172.26 2181/TCP 20d
jx-qat2 service/tonline-ocr-solrcloud-zookeeper-headless ClusterIP None 2181/TCP,2888/TCP,3888/TCP,7000/TCP,8080/TCP 20d
jx-qat3 service/tonline-ocr-solrcloud-common ClusterIP 172.20.228.32 80/TCP 20d
jx-qat3 service/tonline-ocr-solrcloud-headless ClusterIP None 8983/TCP 20d
jx-qat3 service/tonline-ocr-solrcloud-zookeeper-admin-server ClusterIP 172.20.201.75 8080/TCP 20d
jx-qat3 service/tonline-ocr-solrcloud-zookeeper-client ClusterIP 172.20.127.234 2181/TCP 20d
jx-qat3 service/tonline-ocr-solrcloud-zookeeper-headless ClusterIP None 2181/TCP,2888/TCP,3888/TCP,7000/TCP,8080/TCP 20d

solr deployment.apps/solr-operator 1/1 1 1 625d
solr deployment.apps/solr-operator-zookeeper-operator 1/1 1 1 8d

solr replicaset.apps/solr-operator-5df8fc5994 0 0 0 569d
solr replicaset.apps/solr-operator-64c97fc7f4 0 0 0 16d
solr replicaset.apps/solr-operator-7746c6d8b4 0 0 0 625d
solr replicaset.apps/solr-operator-79547c485f 1 1 1 246d
solr replicaset.apps/solr-operator-7f9bb8cc49 0 0 0 374d
solr replicaset.apps/solr-operator-f8978f488 0 0 0 598d
solr replicaset.apps/solr-operator-zookeeper-operator-5c89b6f8bf 0 0 0 8d
solr replicaset.apps/solr-operator-zookeeper-operator-7fb8c6f4c8 1 1 1 19h

jx-qat1 statefulset.apps/tonline-ocr-solrcloud 2/2 17d
jx-qat1 statefulset.apps/tonline-ocr-solrcloud-zookeeper 2/2 17d
jx-qat2 statefulset.apps/tonline-ocr-solrcloud 0/2 20d
jx-qat2 statefulset.apps/tonline-ocr-solrcloud-zookeeper 1/2 8d
jx-qat3 statefulset.apps/tonline-ocr-solrcloud 0/2 20d
jx-qat3 statefulset.apps/tonline-ocr-solrcloud-zookeeper 1/2 8d

nandan-dm commented 1 year ago

Here are more logs from the ZooKeeper StatefulSet:

Found 2 pods, using pod/tonline-ocr-solrcloud-zookeeper-0

HoustonPutman commented 1 year ago

This sounds like an issue with the Zookeeper Operator (which is what actually runs the provided ZooKeeper cluster).

I would recommend either: