I'm using solr-operator v0.7.0 and zookeeper-operator 0.2.15.
The Solr version is 6.6.2.
One of our Solr clusters has 17 nodes (17 shards of around 100GB each, with 1 replica).
From time to time some shards start failing, and the error in the logs is:
2023-06-28 10:15:47.505 ERROR (recoveryExecutor-3-thread-1-processing-n:solr-xxx-solrcloud-11.solr-xxx-solrcloud-headless.xxx:8983_solr x:xxx_shard13_replica2 s:shard13 c:xxx r:core_node77) [c:xxx s:shard13 r:core_node77 x:xxx_shard13_replica2] o.a.s.c.RecoveryStrategy Error while trying to recover. core=xxx_shard13_replica2:org.apache.solr.common.SolrException: No registered leader was found after waiting for 4000ms , collection: xxx slice: shard13
at org.apache.solr.common.cloud.ZkStateReader.getLeaderRetry(ZkStateReader.java:748)
at org.apache.solr.common.cloud.ZkStateReader.getLeaderRetry(ZkStateReader.java:734)
at org.apache.solr.cloud.RecoveryStrategy.doRecovery(RecoveryStrategy.java:368)
I've added solrOpts: "-DzkClientTimeout=1200000", but that cannot be the timeout being hit, since the error reports waiting only 4000ms.
I also tried solrZkOpts: '-Dzookeeper.connection.timeout.ms=60000', but that did not change anything.
Could you please advise what I can do about this issue?
Thanks!