apache / solr-operator

Official Kubernetes operator for Apache Solr
https://solr.apache.org/operator
Apache License 2.0

Actual running pod counts are different from the HPA-allocated #688

Closed: sabaribose closed this issue 3 months ago

sabaribose commented 4 months ago

I have configured an HPA for the SolrCloud pods. The HPA status in the Kubernetes cluster reports that 10 pods are allocated, but the Nodes page in the SolrCloud admin UI shows more than the HPA allocated: 58 pods are running.

Versions: SolrCloud 9.3, solr-operator 0.8.0

Here are some logs from the cluster:

solrcloud_annotations.txt solrcloud_operator_logs.txt solrcloud_output.txt
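The mismatch described above can be checked programmatically. The following is a minimal sketch, not part of the original report. It assumes the HPA object has been fetched as JSON (for example with `kubectl get hpa <name> -o json`) and that Solr's cluster state has been fetched from the Collections API (`/solr/admin/collections?action=CLUSTERSTATUS`). The function name `compare_hpa_to_live_nodes` and the sample node names are hypothetical; the sample counts only mirror the numbers quoted in this issue.

```python
from typing import Any, Dict


def compare_hpa_to_live_nodes(
    hpa: Dict[str, Any], cluster_status: Dict[str, Any]
) -> Dict[str, int]:
    """Compare the HPA's replica counts with the nodes Solr reports as live.

    `hpa` is the parsed JSON of a HorizontalPodAutoscaler object.
    `cluster_status` is the parsed JSON of a CLUSTERSTATUS response.
    Missing fields are treated as zero / empty rather than raising.
    """
    status = hpa.get("status", {})
    desired = int(status.get("desiredReplicas", 0))
    current = int(status.get("currentReplicas", 0))

    # CLUSTERSTATUS lists live nodes under cluster.live_nodes.
    live_nodes = cluster_status.get("cluster", {}).get("live_nodes", [])
    live = len(live_nodes)

    return {
        "hpa_desired": desired,
        "hpa_current": current,
        "solr_live_nodes": live,
        # Positive when Solr sees more nodes than the HPA asked for.
        "excess_nodes": live - desired,
    }


if __name__ == "__main__":
    # Made-up inputs that mirror the counts in this report (10 vs 58).
    sample_hpa = {"status": {"desiredReplicas": 10, "currentReplicas": 10}}
    sample_cluster = {
        "cluster": {"live_nodes": [f"solr-{i}:8983_solr" for i in range(58)]}
    }
    print(compare_hpa_to_live_nodes(sample_hpa, sample_cluster))
```

A positive `excess_nodes` value indicates that more Solr nodes are live than the HPA requested, which is the symptom reported here. It does not by itself say why the extra pods were not removed.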

HoustonPutman commented 4 months ago

I think there are two issues here. The Solr Operator is deleting the PVCs of pods before the pods themselves are deleted. That is obviously a problem, and it is likely causing the BalanceReplicas command to fail.

There is also a bug that keeps the BalanceReplicas command running continuously even though it should be paused so that scaling down can start. The PVC issue mentioned above causes the BalanceReplicas command to fail, and that failure is what exposes this bug.