splunk / splunk-operator

Splunk Operator for Kubernetes

Splunk Operator: downscaling doesn't work #1319

Closed yaroslav-nakonechnikov closed 2 weeks ago

yaroslav-nakonechnikov commented 7 months ago

Please select the type of request

Bug

Tell us more

Describe the request As described in https://github.com/splunk/splunk-operator/issues/1272, and after testing, I can confirm that upscaling works but downscaling does not.

I used this definition:

$ cat scaler.yaml
apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: keda-sa
  namespace: splunk-operator
spec:
  scaleTargetRef:
    apiVersion: enterprise.splunk.com/v4
    kind: IndexerCluster
    name: indexer
  minReplicaCount: 0
  cooldownPeriod: 60
  triggers:
  - type: cron
    metadata:
      timezone: Europe/Vienna
      start: 0 8 * * 1-5
      end: 0 18 * * 1-5
      desiredReplicas: "3"

It scaled up just fine. But when the time came to downscale, it didn't work: the last indexer just kept restarting, and in the cluster manager it looped between the states up and decommissioning.

Expected behavior Downscaling and upscaling should work as expected in Kubernetes. There should be no extra logic that may block it; otherwise, it should be explained in the documentation.

yaroslav-nakonechnikov commented 7 months ago

Also, KEDA does not work with licensemanager, clustermanager, and standalone. It says that it can't find the target: [image]

yaroslav-nakonechnikov commented 6 months ago

I got it working with a workaround: I added a KEDA ScaledObject to downscale the custom resource (CR) and, at the same time, added the same for the StatefulSet.
The first is needed to tell splunk-operator to stop re-creating the StatefulSet, and the second actually downscales.

yaroslav-nakonechnikov commented 6 months ago

Here is a correction. The final workaround: scale down the splunk-operator controller so that it doesn't affect anything else, and then operate only on the StatefulSets.

This works well for turning a dev environment on and off so that it runs only during operational hours.
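
A minimal sketch of what the StatefulSet-side ScaledObject from this workaround might look like, reusing the cron trigger from the definition at the top of the issue. The ScaledObject name `keda-sts` is hypothetical, and the StatefulSet name `splunk-indexer-indexer` is an assumption; check the actual name with `kubectl get statefulsets -n splunk-operator` before using it.

```yaml
apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: keda-sts                    # hypothetical name
  namespace: splunk-operator
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: StatefulSet
    name: splunk-indexer-indexer    # assumption: verify the real StatefulSet name
  minReplicaCount: 0
  cooldownPeriod: 60
  triggers:
  - type: cron
    metadata:
      timezone: Europe/Vienna
      start: 0 8 * * 1-5            # scale up at 08:00, Monday to Friday
      end: 0 18 * * 1-5             # scale back down at 18:00, Monday to Friday
      desiredReplicas: "3"
```

This only changes the replica count of the StatefulSet. It does nothing on the Splunk side, so it is probably suitable only for environments without real data, as in this dev setup.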

vivekr-splunk commented 6 months ago

Hi @yaroslav-nakonechnikov, you can apply annotations to the custom resources (CRs) to prevent interference with statefulsets as long as the annotations are present. Here are the annotations for the various CR types:

"clustermanager.enterprise.splunk.com/paused"
"indexercluster.enterprise.splunk.com/paused"
"licensemanager.enterprise.splunk.com/paused"
"monitoringconsole.enterprise.splunk.com/paused"
"searchheadcluster.enterprise.splunk.com/paused"
"standalone.enterprise.splunk.com/paused"
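
For illustration, the annotation for the IndexerCluster from this issue might be set as in the fragment below. The key is taken from the list above; the value `"true"` is an assumption, since the comment only names the keys, and the fragment shows metadata only, not a complete manifest.

```yaml
apiVersion: enterprise.splunk.com/v4
kind: IndexerCluster
metadata:
  name: indexer
  namespace: splunk-operator
  annotations:
    # Assumption: "true" is an illustrative value; only the key comes from the list above.
    indexercluster.enterprise.splunk.com/paused: "true"
# spec omitted: merge the annotation into the existing resource rather than applying this fragment as-is
```

The same annotation can be added to an existing resource with `kubectl annotate`, and removing it should let the operator resume managing the StatefulSet.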

I'd like to know the steps you follow when you start working directly with statefulsets. Specifically, when scaling down, we need to ensure that decommissioning is run on Splunk, which the operator handles.

Can you provide more detail on what issues you're encountering with the scale-down process? Your help is much appreciated. Thank you.

yaroslav-nakonechnikov commented 6 months ago

Good to know about the annotations.

At the moment we use downscaling only to save costs on dev, and we are not ready to test it with real data. That also means we don't see any issues with it. Everything starts fine with splunk-operator 2.4.0, and at the end of the day everything stops as expected.

vivekr-splunk commented 2 weeks ago

@yaroslav-nakonechnikov can we close this issue?