Closed: johnzheng1975 closed this issue 5 months ago
Please advise how to make it work, thanks.
@mikkeloscar could you take a look? Is this a defect? Thanks.
kubectl scale deployment aiservice --replicas=1 -n zone-dev works, but it triggers one more ReplicaSet. I am not sure this is the correct way to do it.
The thing is: we are using Argo Rollouts, so the replicas of the Deployment will be 0. The real replica count is 1 (and will change with the HPA). So I think the HPA should not show the metrics as null; it should show them as long as they can be queried from Prometheus. Is this a defect of kube-metrics-adapter or a defect of the HPA? Thanks.
Or can you provide a workaround? Thanks.
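For context, this is roughly the pattern meant by "replicas of the Deployment will be 0": a minimal sketch based on the Argo Rollouts workloadRef docs, not our actual manifest (the selector label and strategy are assumptions):

apiVersion: argoproj.io/v1alpha1
kind: Rollout
metadata:
  name: aiservice
  namespace: zone-dev
spec:
  replicas: 1                # the real replica count lives here and is managed by the HPA
  selector:
    matchLabels:
      app: aiservice         # assumed label
  workloadRef:               # the Rollout reuses the Deployment's pod template
    apiVersion: apps/v1
    kind: Deployment
    name: aiservice          # this Deployment is kept at replicas: 0
  strategy:
    canary: {}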
Here is the Kubernetes code for this: https://github.com/kubernetes/kubernetes/blob/master/pkg/controller/podautoscaler/horizontal.go#L821
Note that CPU/memory metrics do not trigger this issue. The workaround is to add a combination of metrics, as in the manifest below; then the desired and current replicas will be 1 and currentMetrics will not be null.
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: aiservice
  namespace: zone-dev
  annotations:
    metric-config.external.dcgm-fi-dev-gpu-util.prometheus/prometheus-server: http://prometheus-server.infra.svc/
    metric-config.external.dcgm-fi-dev-gpu-util.prometheus/query: |
      avg(
        avg_over_time(
          DCGM_FI_DEV_GPU_UTIL{
            app="nvidia-dcgm-exporter",
            container="service",
            exported_namespace="zone-dev",
            pod=~"aiservice-.*",
            service="nvidia-dcgm-exporter"
          }[1m]
        )
      )
spec:
  scaleTargetRef:
    apiVersion: argoproj.io/v1alpha1
    kind: Rollout
    name: aiservice
  minReplicas: 1
  maxReplicas: 5
  metrics:
  - type: External
    external:
      metric:
        name: dcgm-fi-dev-gpu-util
        selector:
          matchLabels:
            type: prometheus
      target:
        type: AverageValue
        averageValue: "50"
  - resource:
      name: memory
      target:
        averageUtilization: 95
        type: Utilization
    type: Resource
I think kube-metrics-adapter needs to be improved for this issue. FYI.
Can you show the table view of the query?
DCGM_FI_DEV_GPU_UTIL{
  app="nvidia-dcgm-exporter",
  container="service",
  exported_namespace="zone-dev",
  pod=~"aiservice-.*",
  service="nvidia-dcgm-exporter"
}
I checked your pictures and for me it looks like the labels are not matching.
Unrelated to the issue: one other small thing that you likely want to change is the memory averageUtilization: 95, which would mean that it scales out only at +10%, i.e. around 105%, which is likely already OOM.
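For reference, the HPA computes desiredReplicas = ceil(currentReplicas * currentUtilization / targetUtilization) and skips scaling while the current/target ratio is within the default 10% tolerance, so with a target of 95 the first scale-out only happens at roughly 104-105% of the memory request. A lower target leaves headroom; the value 80 below is purely illustrative, not something recommended in this thread:

  - resource:
      name: memory
      target:
        averageUtilization: 80   # illustrative; scale-out then starts around ~88% of the request
        type: Utilization
    type: Resource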
Thanks for your answer. @szuecs Here is the result:
DCGM_FI_DEV_GPU_UTIL{DCGM_FI_DRIVER_VERSION="535.161.08", Hostname="ip-10-200-181-23.us-west-2.compute.internal", UUID="GPU-e1a61ba4-0fff-2b29-744f-110f9ca929cf", app="nvidia-dcgm-exporter", container="service", device="nvidia0", exported_namespace="zone-dev", gpu="0", instance="10.200.164.17:9400", job="kubernetes-service-endpoints", modelName="Tesla T4", namespace="infra", node="ip-10-200-181-23.us-west-2.compute.internal", pod="aiservice-84f444c7df-pw2jk", service="nvidia-dcgm-exporter"}
Value: 65
Thanks for your reminder.
Unrelated to the issue: one other small thing that you likely want to change is the memory averageUtilization: 95, which would mean that it scales out only at +10%, i.e. around 105%, which is likely already OOM.
Since averageUtilization: 95 is based on the memory request, it will not OOM if the memory limit is higher than the memory request, am I right? Thanks.
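For illustration (the numbers are made up, not from our manifests): with the container resources below, averageUtilization: 95 refers to 95% of the 2Gi request (about 1.9Gi), which is still well under the 4Gi limit, so it would not OOM on its own:

resources:
  requests:
    memory: 2Gi   # utilization targets are computed against this value
  limits:
    memory: 4Gi   # OOM only happens above the limit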
Ok, thanks, the data looks good. Now I wonder whether I understand the following correctly:
The thing is: we are using Argo Rollouts, so the replicas of the Deployment will be 0. The real replica count is 1 (and will change with the HPA).
So I think the HPA should not show the metrics as null; it should show them as long as they can be queried from Prometheus.
Is this a defect of kube-metrics-adapter or a defect of the HPA? Thanks.
Or can you provide a workaround? Thanks.
So if replicas are more than 0, everything works: the Prometheus query returns data and kube-metrics-adapter provides the data for the HPA, right? However the Argo Rollout will set the replicas to 0 and then it breaks, right? And your expectation is that we would provide the last non-zero data. Do I understand this correctly?
Thanks. @szuecs
So if replicas are more than 0, everything works: the Prometheus query returns data and kube-metrics-adapter provides the data for the HPA, right? Answer: Yes, if the Deployment replicas > 0, everything is fine.
However the Argo Rollout will set the replicas to 0 and then it breaks, right? Answer: Because of Argo Rollouts, we have to set the Deployment replicas to 0.
And your expectation is that we would provide the last non-zero data. Do I understand this correctly? Answer: I expect:
- No error message; the condition message should be: the HPA controller was able to get the target's current scale
- currentMetrics is not null
- It works like my other HPA in the same environment: istio-requests-total
- Or it works like the combination-metrics workaround: https://github.com/zalando-incubator/kube-metrics-adapter/issues/724#issuecomment-2154390999
Note that this is for the same deploy, whose replicas is 0,
@johnzheng1975 What is the output if you describe the hpa?
kubectl --namespace zone-dev describe hpa aiservice
The events at the bottom are the most interesting part of that output.
@mikkeloscar, please see above.
From what I understand, the istio query will return no data if you scale down to zero. CPU and memory are not Prometheus queries but Kubernetes-internal metrics-server lookups that could respond with data from a cache. I wonder a bit why it returns non-zero CPU/memory with 0 replicas, but that seems to be a side effect that makes the argocd setup work.
From my side it sounds like a bug in argocd, to be honest. I personally would not want this controller to cache data, and null/nil seems to be the right value for a query with no data.
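To illustrate where the two kinds of metrics come from, these are the raw API groups involved (standard kubectl calls; the external metric name below just mirrors the one used in this issue and is an assumption about how the adapter registers it):

kubectl get --raw /apis/metrics.k8s.io/v1beta1/namespaces/zone-dev/pods
kubectl get --raw "/apis/external.metrics.k8s.io/v1beta1/namespaces/zone-dev/dcgm-fi-dev-gpu-util?labelSelector=type%3Dprometheus"

The first is served by metrics-server (CPU/memory), the second by kube-metrics-adapter (backed by the Prometheus query).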
@johnzheng1975 I wanted to see the events; you shared the get hpa output, but I want to see the describe hpa output.
@mikkeloscar @szuecs I found the root cause now. This is not a defect of kube-metrics-adapter. It was caused by an incorrect configuration; sorry for the confusion I caused.
The wrong configuration is: scaleTargetRef points at the Deployment, whose replicas is 0. This causes the issue above: "scaling is disabled since the replica count of the target is zero".
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  annotations:
    metric-config.external.dcgm-fi-dev-gpu-util.prometheus/prometheus-server: http://prometheus-server.infra.svc
    metric-config.external.dcgm-fi-dev-gpu-util.prometheus/query: |
      avg(
        avg_over_time(
          DCGM_FI_DEV_GPU_UTIL{
            app="nvidia-dcgm-exporter",
            container="service",
            exported_namespace="zone-dev",
            pod=~"aiservice-.*",
            service="nvidia-dcgm-exporter"
          }[1m]
        )
      )
  creationTimestamp: "2024-06-06T12:37:33Z"
  name: aiservice
  namespace: zone-dev
  resourceVersion: "68926327"
  uid: f0e5f9cf-cc9e-4f60-b97f-0ad8a0727cfd
spec:
  maxReplicas: 5
  metrics:
  - external:
      metric:
        name: dcgm-fi-dev-gpu-util
        selector:
          matchLabels:
            type: prometheus
      target:
        averageValue: "50"
        type: AverageValue
    type: External
  minReplicas: 1
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: aiservice
The right configuration is to change scaleTargetRef from the Deployment to the Rollout, whose replicas is 1. It works perfectly.
Complete correct configuration:
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  annotations:
    metric-config.external.dcgm-fi-dev-gpu-util.prometheus/prometheus-server: http://prometheus-server.infra.svc
    metric-config.external.dcgm-fi-dev-gpu-util.prometheus/query: |
      avg(
        avg_over_time(
          DCGM_FI_DEV_GPU_UTIL{
            app="nvidia-dcgm-exporter",
            container="service",
            exported_namespace="zone-dev",
            pod=~"aiservice-.*",
            service="nvidia-dcgm-exporter"
          }[1m]
        )
      )
  creationTimestamp: "2024-06-06T12:37:33Z"
  name: aiservice
  namespace: zone-dev
  resourceVersion: "68926327"
  uid: f0e5f9cf-cc9e-4f60-b97f-0ad8a0727cfd
spec:
  maxReplicas: 5
  metrics:
  - external:
      metric:
        name: dcgm-fi-dev-gpu-util
        selector:
          matchLabels:
            type: prometheus
      target:
        averageValue: "50"
        type: AverageValue
    type: External
  minReplicas: 1
  scaleTargetRef:
    apiVersion: argoproj.io/v1alpha1
    kind: Rollout
    name: aiservice
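A quick way to verify the fix (commands only; the expected condition message and non-null currentMetrics are taken from this thread, not captured output):

kubectl --namespace zone-dev describe hpa aiservice   # events and conditions; expect "the HPA controller was able to get the target's current scale"
kubectl --namespace zone-dev get hpa aiservice -o yaml   # status.currentMetrics should no longer be null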
Let me close this ticket. Thanks for your excellent support. @szuecs @mikkeloscar
Expected Behavior
It works well, like my other HPA in the same environment.
Actual Behavior
currentMetrics is null and the HPA does not work.
Steps to Reproduce the Problem
cd .\docs; kubectl apply -f .
Here are the logs; it seems the adapter already got the metrics.
Here is Prometheus; you can see that the metric is shown.
Specifications