apache / solr-operator

Official Kubernetes operator for Apache Solr
https://solr.apache.org/operator
Apache License 2.0

zookeeper restart #580

Closed klamkma closed 1 year ago

klamkma commented 1 year ago

Hello,

I'm using solr-operator v0.7.0 and zookeeper-operator 0.2.15.

If all three of my ZooKeeper nodes are restarted, the Solr nodes are not able to reconnect. They all fail with the same error:

2023-06-28 09:52:03.814 WARN  (main-SendThread(solr-xxx-solrcloud-zookeeper-0.solr-xxx-solrcloud-zookeeper-headless.xxx.svc.cluster.local:2181)) [   ] o.a.z.ClientCnxn Client session timed out, have not heard from server in 26680ms for sessionid 0x10005dbcccf0014
2023-06-28 09:52:12.211 INFO  (qtp596910004-43) [   ] o.a.s.s.HttpSolrCall [admin] webapp=null path=/admin/info/system params={} status=0 QTime=8
2023-06-28 09:52:12.211 INFO  (qtp596910004-1438) [   ] o.a.s.s.HttpSolrCall [admin] webapp=null path=/admin/info/system params={} status=0 QTime=8
2023-06-28 09:52:22.210 INFO  (qtp596910004-43) [   ] o.a.s.s.HttpSolrCall [admin] webapp=null path=/admin/info/system params={} status=0 QTime=7

Could you please advise what I can do about this issue?

Thanks.

HoustonPutman commented 1 year ago

What version of Solr are you running?

klamkma commented 1 year ago

> What version of Solr are you running?

6.6.2

HoustonPutman commented 1 year ago

That might be the issue. That version of Solr is likely using a version of the ZooKeeper client API with a bug in how it retries server addresses: it does not re-resolve hostnames after a connection attempt fails.

https://issues.apache.org/jira/browse/ZOOKEEPER-2184 (fixed in 3.4.13)

Solr 6.6.2 is using Zookeeper 3.4.10, so you will see this bug.
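
To make the failure mode concrete, here is a toy simulation in Python. It is only a sketch under stated assumptions, not ZooKeeper's actual client code: `FakeDns`, `CachingClient`, `ReResolvingClient`, the `zk-N` hostnames, and the IP addresses are all hypothetical, and it assumes the restarted ZooKeeper pods come back with new IPs while their hostnames stay the same.

```python
# Toy model of the failure mode; NOT ZooKeeper's real client implementation.
# All class names, hostnames, and IPs below are hypothetical.


class FakeDns:
    """Stand-in for cluster DNS: maps a hostname to the pod's current IP."""

    def __init__(self, records):
        self.records = dict(records)

    def resolve(self, host):
        return self.records[host]


class CachingClient:
    """Resolves each hostname once at startup and reuses those IPs forever."""

    def __init__(self, dns, hosts):
        self.addresses = [dns.resolve(h) for h in hosts]

    def connect(self, live_ips):
        # Succeeds only if one of the cached IPs is still serving.
        return any(ip in live_ips for ip in self.addresses)


class ReResolvingClient:
    """Looks each hostname up again on every connection attempt."""

    def __init__(self, dns, hosts):
        self.dns = dns
        self.hosts = list(hosts)

    def connect(self, live_ips):
        return any(self.dns.resolve(h) in live_ips for h in self.hosts)


def simulate_full_restart():
    hosts = ["zk-0", "zk-1", "zk-2"]
    dns = FakeDns({"zk-0": "10.0.0.1", "zk-1": "10.0.0.2", "zk-2": "10.0.0.3"})
    cached = CachingClient(dns, hosts)
    reresolving = ReResolvingClient(dns, hosts)

    # Assumption: all three pods restart with new IPs and DNS is updated.
    dns.records = {"zk-0": "10.0.1.1", "zk-1": "10.0.1.2", "zk-2": "10.0.1.3"}
    live_ips = set(dns.records.values())

    return cached.connect(live_ips), reresolving.connect(live_ips)


if __name__ == "__main__":
    cached_ok, reresolved_ok = simulate_full_restart()
    print("cached addresses reconnect:", cached_ok)
    print("re-resolved addresses reconnect:", reresolved_ok)
```

In this model, restarting a single node would still leave the caching client with a reachable address, which may be why the problem only shows up when all three nodes are restarted.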

Solr 6.6.2 has been EOL for quite a while now, so I suggest you try updating to a newer version of Solr.

klamkma commented 1 year ago

Thank you for the information!