Closed: nathan-bowman closed this issue 2 months ago
I'm not entirely sure if this will mess things up, but I got the autoscaling to work by pointing the HPA at the IndexerCluster's downstream StatefulSet:
---
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: idx-cluster-autoscaler
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: StatefulSet
    name: splunk-idx-cluster-indexer
  minReplicas: 3
  maxReplicas: 15
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 25
Is this correct?
Going down the rabbit hole...
It looks like my HPA isn't gathering metrics for the target:
NAME                     REFERENCE                    TARGETS         MINPODS   MAXPODS   REPLICAS   AGE
idx-cluster-autoscaler   IndexerCluster/idx-cluster   <unknown>/25%   3         15        3          18h
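For anyone retracing this: the conditions and events explaining an <unknown> target are visible with a plain describe (namespace assumed from my setup):
kubectl describe hpa idx-cluster-autoscaler -n splunk-enterprise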
Additional digging into the metrics-server shows lots of scrape errors:
E0717 23:45:19.725688 1 scraper.go:149] "Failed to scrape node" err="Get \"https://192.168.28.96:10250/metrics/resource\": remote error: tls: internal error" node="ip-192-168-28-96.us-west-2.compute.internal"
I think this is related to a recent issue posted in the official metrics-server repo, and an associated PR.
I'm not totally sure, though... other HPAs in my EKS clusters seem to work fine...
Edit: To clarify, the HPA sees the target's metrics when I point it at the StatefulSet, but not when I point it at kind: IndexerCluster.
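For reference, the failing variant differs only in scaleTargetRef; a sketch using the same names as the working manifest above:
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: idx-cluster-autoscaler
spec:
  scaleTargetRef:
    apiVersion: enterprise.splunk.com/v4   # pointing at the CR instead of the StatefulSet
    kind: IndexerCluster
    name: idx-cluster
  minReplicas: 3
  maxReplicas: 15
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 25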
Adding more info...
# kubectl get --raw /apis/enterprise.splunk.com/v4/ | jq '.resources[] | select(.name=="indexerclusters/scale")'
{
  "name": "indexerclusters/scale",
  "singularName": "",
  "namespaced": true,
  "group": "autoscaling",
  "version": "v1",
  "kind": "Scale",
  "verbs": [
    "get",
    "patch",
    "update"
  ]
}
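For completeness, the scale subresource the HPA controller fetches can also be read directly (assuming the splunk-enterprise namespace and CR name from my manifests):
# kubectl get --raw /apis/enterprise.splunk.com/v4/namespaces/splunk-enterprise/indexerclusters/idx-cluster/scale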
Is it a problem that the scale subresource is v1 under the autoscaling group? I'm using autoscaling/v2 in my HPA.
I tried autoscaling/v1 and hit the same issue 👎:
apiVersion: autoscaling/v1
kind: HorizontalPodAutoscaler
metadata:
  annotations:
    autoscaling.alpha.kubernetes.io/conditions: >-
      [{"type":"AbleToScale","status":"False","lastTransitionTime":"2024-07-18T19:46:25Z","reason":"FailedGetScale","message":"the
      HPA controller was unable to get the target's current scale: Internal
      error occurred: the spec replicas field \".spec.replicas\" does not
      exist"}]
  creationTimestamp: '2024-07-18T19:46:10Z'
  labels:
    app.kubernetes.io/instance: backend-staging-splunk-enterprise
  name: idx-cluster-autoscaler
  namespace: splunk-enterprise
  resourceVersion: '171496564'
  uid: e655b8c2-a04a-48c6-b882-1b4bd29fa2f4
spec:
  maxReplicas: 15
  minReplicas: 3
  scaleTargetRef:
    apiVersion: enterprise.splunk.com/v4
    kind: IndexerCluster
    name: idx-cluster
  targetCPUUtilizationPercentage: 25
status:
  currentReplicas: 0
  desiredReplicas: 0
I worked with Splunk support on this, and they suggest that, despite the Kubernetes docs saying otherwise, you must hardcode .spec.replicas on the CR in order to get it working.
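A minimal sketch of what that looks like with my CR names; the clusterManagerRef value is a placeholder for whatever your deployment actually references:
apiVersion: enterprise.splunk.com/v4
kind: IndexerCluster
metadata:
  name: idx-cluster
  namespace: splunk-enterprise
spec:
  replicas: 3   # hardcoded so the scale subresource has a .spec.replicas to read
  clusterManagerRef:
    name: cm    # placeholder name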
Since I use ArgoCD, I had to set ignoreDifferences on the CR to stop it from showing up as out of sync.
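Roughly like this in the Application spec (a sketch; the Application name is illustrative):
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: backend-staging-splunk-enterprise   # illustrative
spec:
  # source/destination omitted
  ignoreDifferences:
    - group: enterprise.splunk.com
      kind: IndexerCluster
      jsonPointers:
        - /spec/replicas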
CSPL-2819
Please select the type of request
Bug
Tell us more
Describe the Problem
I'm following the details here for pod autoscaling. It seems that spec.replicas is a mandatory field, but the HorizontalPodAutoscaler docs recommend that you remove spec.replicas from the target manifest: "When an HPA is enabled, it is recommended that the value of spec.replicas of the Deployment and / or StatefulSet be removed from their manifest(s)."
Error I receive when I remove spec.replicas:
the HPA controller was unable to get the target's current scale: Internal error occurred: the spec replicas field ".spec.replicas" does not exist
Expected behavior
One should be able to remove spec.replicas from the Splunk CR indexerclusters.enterprise.splunk.com (and probably other CRs...) to allow the HorizontalPodAutoscaler to manage spec.replicas.
Splunk setup on K8S
AWS EKS v1.29, with Splunk Operator 2.5.2. Last thing to note: I'm using the autoscaling/v2 apiVersion.
Reproduction/Testing steps
idx-cluster.yaml:
HorizontalPodAutoscaler yaml:
K8s environment
k8s v1.29