kubernetes / autoscaler

Autoscaling components for Kubernetes
Apache License 2.0

[VPA] Pod creation fails if VPA has only one controlledResources #3902

Closed. Ladicle closed this issue 3 years ago

Ladicle commented 3 years ago

Which component are you using?: Vertical Pod Autoscaler

What version of the component are you using?:

Component version: 0.9.2

What k8s version are you using (kubectl version)?:

kubectl version Output
$ kubectl version
Client Version: version.Info{Major:"1", Minor:"20", GitVersion:"v1.20.2", GitCommit:"faecb196815e248d3ecfb03c680a4507229c2a56", GitTreeState:"clean", BuildDate:"2021-01-14T05:14:17Z", GoVersion:"go1.15.6", Compiler:"gc", Platform:"darwin/amd64"}
Server Version: version.Info{Major:"1", Minor:"20", GitVersion:"v1.20.1", GitCommit:"c4d752765b3bbac2237bf87cf0b1c2e307844666", GitTreeState:"clean", BuildDate:"2020-12-18T12:00:47Z", GoVersion:"go1.15.5", Compiler:"gc", Platform:"linux/amd64"}

What environment is this in?:

both minikube & on-prem k8s cluster

What did you expect to happen?:

A Deployment that specifies resource limits can have only one of its resources (e.g. only cpu) controlled by VPA, and its Pods are still created normally.

What happened instead?:

The VPA resource has only cpu in controlledResources, but the vpa-admission-controller also sends a patch for the limit of the non-controlled resource, with a value of 0 (like this: {add /spec/containers/0/resources/limits/memory 0}). Because of that, the replicaset-controller gets a FailedCreate error.

vpa-admission-controller patch log:

vpa-admission-controller-6cd546c4f-zckmj admission-controller I0225 05:53:42.693154       1 server.go:110] Sending patches: [{add /metadata/annotations map[]} {add /spec/containers/0/resources/requests/cpu 587m} {add /spec/containers/0/resources/limits/cpu 5870m} {add /spec/containers/0/resources/limits/memory 0} {add /metadata/annotations/vpaUpdates Pod resources updated by hamster-vpa: container 0: cpu request, cpu limit, memory limit} {add /metadata/annotations/vpaObservedContainers hamster}

replicaset-controller warning event:

0s          Warning   FailedCreate        replicaset/hamster-5d669c8b66   Error creating: Pod "hamster-5d669c8b66-7hphs" is invalid: spec.containers[0].resources.requests: Invalid value: "256Mi": must be less than or equal to memory limit
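To make the failure concrete: applying the patches above to the Deployment manifest below leaves the container's resources section roughly as follows (hand-reconstructed from the patch log, not actual controller output). A 256Mi memory request can never pass validation against a memory limit of 0:

resources:
  requests:
    cpu: 587m      # patched by the admission controller
    memory: 256Mi  # unchanged, taken from the Deployment
  limits:
    cpu: 5870m     # patched by the admission controller
    memory: "0"    # patched even though memory is not in controlledResources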

How to reproduce it (as minimally and precisely as possible):

  1. Apply the following manifests.
  2. Scale out its replicas.
apiVersion: "autoscaling.k8s.io/v1"
kind: VerticalPodAutoscaler
metadata:
  name: hamster-vpa
spec:
  targetRef:
    apiVersion: "apps/v1"
    kind: Deployment
    name: hamster
  resourcePolicy:
    containerPolicies:
      - containerName: '*'
        controlledResources: ["cpu"]
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: hamster
spec:
  selector:
    matchLabels:
      app: hamster
  replicas: 2
  template:
    metadata:
      labels:
        app: hamster
    spec:
      securityContext:
        runAsNonRoot: true
        runAsUser: 65534 # nobody
      containers:
        - name: hamster
          image: k8s.gcr.io/ubuntu-slim:0.1
          resources:
            requests:
              cpu: 100m
              memory: 256Mi
            limits:
              cpu: 1
              memory: 1Gi
          command: ["/bin/sh"]
          args:
            - "-c"
            - "while true; do timeout 0.5s yes >/dev/null; sleep 0.5s; done"

In this example cpu is the only resource in controlledResources, but the same thing happens when only memory is listed.
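Until a release containing a fix is available, one possible way to sidestep the broken limit patch is to stop VPA from patching limits altogether via the controlledValues field of the container policy. This is an untested suggestion for affected versions, assuming the field is available in the API version in use:

  resourcePolicy:
    containerPolicies:
      - containerName: '*'
        controlledResources: ["cpu"]
        controlledValues: RequestsOnly   # patch requests only, leave limits untouched

Alternatively, listing both cpu and memory under controlledResources avoids the single-resource case that triggers this issue, at the cost of letting VPA manage memory as well.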

Vaproc commented 3 years ago

I do have the same issue with v0.9.2. I've seen your PR. When will this be released?

mshade commented 2 years ago

This issue persists on 0.9.2 - can we reopen and/or get an idea of when #3903 will be released in VPA?

arnavc1712 commented 2 years ago

Hi, I have been getting the same issue. Has any fix for this been merged into master yet? Does this error persist with other versions of the VPA?

jbartosik commented 2 years ago

@arnavc1712 @mshade The fix for this, #3903, is part of the 0.10.0 release (you can check the blame on these lines).