Closed thdonatello closed 1 year ago
Hey @thdonatello,
The VPA has specifics in its code for the built-in Kinds like Deployment and StatefulSet, but it works with all kinds of custom resources as well. The only contract is that the Kind referenced via targetRef has to provide the /scale subresource. The VPA doesn't use this resource for scaling, though; it just needs the labelSelectorPath to identify which Pods belong to this custom resource. Please note that the Kind in targetRef needs to be the topmost owner of the resource, otherwise you will get an error message (e.g. you cannot reference a ReplicaSet if it is owned by a Deployment). I'm not sure what the hierarchy of resources is with the Strimzi operators; you will know this much better than I do.
TL;DR: There's nothing to be done from the VPA side, this can be enabled purely on the side of the custom resource you're using.
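For illustration, a CRD that satisfies this contract might declare its scale subresource roughly as follows. This is only a minimal sketch: the Widget kind, the example.com group, and the field paths are hypothetical and are not taken from Strimzi or any real operator.

```yaml
apiVersion: apiextensions.k8s.io/v1
kind: CustomResourceDefinition
metadata:
  name: widgets.example.com        # hypothetical; must be <plural>.<group>
spec:
  group: example.com
  scope: Namespaced
  names:
    kind: Widget
    plural: widgets
    singular: widget
  versions:
  - name: v1
    served: true
    storage: true
    schema:
      openAPIV3Schema:
        type: object
        properties:
          spec:
            type: object
            properties:
              replicas:
                type: integer
          status:
            type: object
            properties:
              replicas:
                type: integer
              selector:
                type: string       # serialized selector, e.g. "app=widget"
    subresources:
      status: {}
      scale:
        specReplicasPath: .spec.replicas
        statusReplicasPath: .status.replicas
        labelSelectorPath: .status.selector   # the part the VPA relies on
```

The field that labelSelectorPath points to has to be a plain string holding the selector in serialized form, and the controller of the custom resource is responsible for populating it.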
As far as I can see, StrimziPodSet is a set of pods, e.g.:
#k get strimzipodsets.core.strimzi.io stage-kafka -o yaml
apiVersion: core.strimzi.io/v1beta2
kind: StrimziPodSet
metadata:
  annotations:
    strimzi.io/kafka-version: 3.4.0
    strimzi.io/storage: '{"type":"persistent-claim","size":"50Gi"}'
  creationTimestamp: "2023-04-24T13:32:00Z"
  generation: 4
  labels:
    app.kubernetes.io/instance: stage
    app.kubernetes.io/managed-by: strimzi-cluster-operator
    app.kubernetes.io/name: kafka
    app.kubernetes.io/part-of: strimzi-stage
    strimzi.io/cluster: stage
    strimzi.io/component-type: kafka
    strimzi.io/kind: Kafka
    strimzi.io/name: stage-kafka
  name: stage-kafka
  namespace: kafka
  ownerReferences:
  - apiVersion: kafka.strimzi.io/v1beta2
    blockOwnerDeletion: false
    controller: false
    kind: Kafka
    name: stage
    uid: 18bf04f7-9d0e-4ecb-9d8e-a211bc84a464
  resourceVersion: "1320832060"
  uid: ea00fb82-cb4b-46cf-a552-e6469eff9c71
spec:
  pods:
  - apiVersion: v1
    kind: Pod
    metadata:
      annotations:
        strimzi.io/broker-configuration-hash: 54f3f3bd
        strimzi.io/clients-ca-cert-generation: "2"
        strimzi.io/cluster-ca-cert-generation: "2"
        strimzi.io/inter-broker-protocol-version: 3.4.0
        strimzi.io/kafka-version: 3.4.0
        strimzi.io/log-message-format-version: 3.4.0
        strimzi.io/logging-appenders-hash: e893ac9f
        strimzi.io/revision: addf82a7
        strimzi.io/server-cert-hash: 189578ed483d0cc8094f0fb6ec369551974d77ef
      labels:
        app.kubernetes.io/instance: stage
        app.kubernetes.io/managed-by: strimzi-cluster-operator
        app.kubernetes.io/name: kafka
        app.kubernetes.io/part-of: strimzi-stage
        statefulset.kubernetes.io/pod-name: stage-kafka-0
        strimzi.io/cluster: stage
        strimzi.io/component-type: kafka
        strimzi.io/controller: strimzipodset
        strimzi.io/controller-name: stage-kafka
        strimzi.io/kind: Kafka
        strimzi.io/name: stage-kafka
        strimzi.io/pod-name: stage-kafka-0
      name: stage-kafka-0
      namespace: kafka
    spec:
      affinity:
        nodeAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            nodeSelectorTerms:
            - matchExpressions:
              - key: kafka
                operator: In
                values:
                - "true"
        podAntiAffinity:
          preferredDuringSchedulingIgnoredDuringExecution:
          - podAffinityTerm:
              labelSelector:
                matchLabels:
                  strimzi.io/name: stage-kafka
              topologyKey: topology.kubernetes.io/zone
            weight: 100
      containers:
      - args:
        - /opt/kafka/kafka_run.sh
        env:
        - name: KAFKA_METRICS_ENABLED
          value: "true"
        - name: STRIMZI_KAFKA_GC_LOG_ENABLED
          value: "false"
        - name: KAFKA_HEAP_OPTS
          value: -Xms128M
        image: quay.io/strimzi/kafka:0.34.0-kafka-3.4.0
        imagePullPolicy: IfNotPresent
        livenessProbe:
          exec:
            command:
            - /opt/kafka/kafka_liveness.sh
          initialDelaySeconds: 15
          timeoutSeconds: 5
        name: kafka
        ports:
        - containerPort: 9090
          name: tcp-ctrlplane
          protocol: TCP
        - containerPort: 9091
          name: tcp-replication
          protocol: TCP
        - containerPort: 9092
          name: tcp-clients
          protocol: TCP
        - containerPort: 9094
          name: tcp-external
          protocol: TCP
        - containerPort: 9404
          name: tcp-prometheus
          protocol: TCP
        readinessProbe:
          exec:
            command:
            - /opt/kafka/kafka_readiness.sh
          initialDelaySeconds: 15
          timeoutSeconds: 5
        volumeMounts:
        - mountPath: /var/lib/kafka/data
          name: data
        - mountPath: /tmp
          name: strimzi-tmp
        - mountPath: /opt/kafka/cluster-ca-certs
          name: cluster-ca
        - mountPath: /opt/kafka/broker-certs
          name: broker-certs
        - mountPath: /opt/kafka/client-ca-certs
          name: client-ca-cert
        - mountPath: /opt/kafka/custom-config/
          name: kafka-metrics-and-logging
        - mountPath: /var/opt/kafka
          name: ready-files
      hostname: stage-kafka-0
      restartPolicy: Always
      schedulerName: default-scheduler
      securityContext:
        fsGroup: 0
      serviceAccountName: stage-kafka
      subdomain: stage-kafka-brokers
      terminationGracePeriodSeconds: 30
      volumes:
      - name: data
        persistentVolumeClaim:
          claimName: data-stage-kafka-0
      - emptyDir:
          medium: Memory
          sizeLimit: 5Mi
        name: strimzi-tmp
      - name: cluster-ca
        secret:
          defaultMode: 292
          secretName: stage-cluster-ca-cert
      - name: broker-certs
        secret:
          defaultMode: 292
          secretName: stage-kafka-brokers
      - name: client-ca-cert
        secret:
          defaultMode: 292
          secretName: stage-clients-ca-cert
      - configMap:
          name: stage-kafka-0
        name: kafka-metrics-and-logging
      - emptyDir:
          medium: Memory
          sizeLimit: 1Ki
        name: ready-files
  - apiVersion: v1
    kind: Pod
    metadata:
      annotations:
        strimzi.io/broker-configuration-hash: 85a2c59e
        strimzi.io/clients-ca-cert-generation: "2"
        strimzi.io/cluster-ca-cert-generation: "2"
        strimzi.io/inter-broker-protocol-version: 3.4.0
        strimzi.io/kafka-version: 3.4.0
        strimzi.io/log-message-format-version: 3.4.0
        strimzi.io/logging-appenders-hash: e893ac9f
        strimzi.io/revision: 188c1917
        strimzi.io/server-cert-hash: 025f7f8e1597d0e14b58f09079461da217e1a22c
      labels:
        app.kubernetes.io/instance: stage
        app.kubernetes.io/managed-by: strimzi-cluster-operator
        app.kubernetes.io/name: kafka
        app.kubernetes.io/part-of: strimzi-stage
        statefulset.kubernetes.io/pod-name: stage-kafka-1
        strimzi.io/cluster: stage
        strimzi.io/component-type: kafka
        strimzi.io/controller: strimzipodset
        strimzi.io/controller-name: stage-kafka
        strimzi.io/kind: Kafka
        strimzi.io/name: stage-kafka
        strimzi.io/pod-name: stage-kafka-1
      name: stage-kafka-1
      namespace: kafka
    spec:
      affinity:
        nodeAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            nodeSelectorTerms:
            - matchExpressions:
              - key: kafka
                operator: In
                values:
                - "true"
        podAntiAffinity:
          preferredDuringSchedulingIgnoredDuringExecution:
          - podAffinityTerm:
              labelSelector:
                matchLabels:
                  strimzi.io/name: stage-kafka
              topologyKey: topology.kubernetes.io/zone
            weight: 100
      containers:
      - args:
        - /opt/kafka/kafka_run.sh
        env:
        - name: KAFKA_METRICS_ENABLED
          value: "true"
        - name: STRIMZI_KAFKA_GC_LOG_ENABLED
          value: "false"
        - name: KAFKA_HEAP_OPTS
          value: -Xms128M
        image: quay.io/strimzi/kafka:0.34.0-kafka-3.4.0
        imagePullPolicy: IfNotPresent
        livenessProbe:
          exec:
            command:
            - /opt/kafka/kafka_liveness.sh
          initialDelaySeconds: 15
          timeoutSeconds: 5
        name: kafka
        ports:
        - containerPort: 9090
          name: tcp-ctrlplane
          protocol: TCP
        - containerPort: 9091
          name: tcp-replication
          protocol: TCP
        - containerPort: 9092
          name: tcp-clients
          protocol: TCP
        - containerPort: 9094
          name: tcp-external
          protocol: TCP
        - containerPort: 9404
          name: tcp-prometheus
          protocol: TCP
        readinessProbe:
          exec:
            command:
            - /opt/kafka/kafka_readiness.sh
          initialDelaySeconds: 15
          timeoutSeconds: 5
        volumeMounts:
        - mountPath: /var/lib/kafka/data
          name: data
        - mountPath: /tmp
          name: strimzi-tmp
        - mountPath: /opt/kafka/cluster-ca-certs
          name: cluster-ca
        - mountPath: /opt/kafka/broker-certs
          name: broker-certs
        - mountPath: /opt/kafka/client-ca-certs
          name: client-ca-cert
        - mountPath: /opt/kafka/custom-config/
          name: kafka-metrics-and-logging
        - mountPath: /var/opt/kafka
          name: ready-files
      hostname: stage-kafka-1
      restartPolicy: Always
      schedulerName: default-scheduler
      securityContext:
        fsGroup: 0
      serviceAccountName: stage-kafka
      subdomain: stage-kafka-brokers
      terminationGracePeriodSeconds: 30
      volumes:
      - name: data
        persistentVolumeClaim:
          claimName: data-stage-kafka-1
      - emptyDir:
          medium: Memory
          sizeLimit: 5Mi
        name: strimzi-tmp
      - name: cluster-ca
        secret:
          defaultMode: 292
          secretName: stage-cluster-ca-cert
      - name: broker-certs
        secret:
          defaultMode: 292
          secretName: stage-kafka-brokers
      - name: client-ca-cert
        secret:
          defaultMode: 292
          secretName: stage-clients-ca-cert
      - configMap:
          name: stage-kafka-1
        name: kafka-metrics-and-logging
      - emptyDir:
          medium: Memory
          sizeLimit: 1Ki
        name: ready-files
  - apiVersion: v1
    kind: Pod
    metadata:
      annotations:
        strimzi.io/broker-configuration-hash: 8c61ba13
        strimzi.io/clients-ca-cert-generation: "2"
        strimzi.io/cluster-ca-cert-generation: "2"
        strimzi.io/inter-broker-protocol-version: 3.4.0
        strimzi.io/kafka-version: 3.4.0
        strimzi.io/log-message-format-version: 3.4.0
        strimzi.io/logging-appenders-hash: e893ac9f
        strimzi.io/revision: 2704ff05
        strimzi.io/server-cert-hash: 4928985a6737d4d87461540e376fdb5988e07575
      labels:
        app.kubernetes.io/instance: stage
        app.kubernetes.io/managed-by: strimzi-cluster-operator
        app.kubernetes.io/name: kafka
        app.kubernetes.io/part-of: strimzi-stage
        statefulset.kubernetes.io/pod-name: stage-kafka-2
        strimzi.io/cluster: stage
        strimzi.io/component-type: kafka
        strimzi.io/controller: strimzipodset
        strimzi.io/controller-name: stage-kafka
        strimzi.io/kind: Kafka
        strimzi.io/name: stage-kafka
        strimzi.io/pod-name: stage-kafka-2
      name: stage-kafka-2
      namespace: kafka
    spec:
      affinity:
        nodeAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            nodeSelectorTerms:
            - matchExpressions:
              - key: kafka
                operator: In
                values:
                - "true"
        podAntiAffinity:
          preferredDuringSchedulingIgnoredDuringExecution:
          - podAffinityTerm:
              labelSelector:
                matchLabels:
                  strimzi.io/name: stage-kafka
              topologyKey: topology.kubernetes.io/zone
            weight: 100
      containers:
      - args:
        - /opt/kafka/kafka_run.sh
        env:
        - name: KAFKA_METRICS_ENABLED
          value: "true"
        - name: STRIMZI_KAFKA_GC_LOG_ENABLED
          value: "false"
        - name: KAFKA_HEAP_OPTS
          value: -Xms128M
        image: quay.io/strimzi/kafka:0.34.0-kafka-3.4.0
        imagePullPolicy: IfNotPresent
        livenessProbe:
          exec:
            command:
            - /opt/kafka/kafka_liveness.sh
          initialDelaySeconds: 15
          timeoutSeconds: 5
        name: kafka
        ports:
        - containerPort: 9090
          name: tcp-ctrlplane
          protocol: TCP
        - containerPort: 9091
          name: tcp-replication
          protocol: TCP
        - containerPort: 9092
          name: tcp-clients
          protocol: TCP
        - containerPort: 9094
          name: tcp-external
          protocol: TCP
        - containerPort: 9404
          name: tcp-prometheus
          protocol: TCP
        readinessProbe:
          exec:
            command:
            - /opt/kafka/kafka_readiness.sh
          initialDelaySeconds: 15
          timeoutSeconds: 5
        volumeMounts:
        - mountPath: /var/lib/kafka/data
          name: data
        - mountPath: /tmp
          name: strimzi-tmp
        - mountPath: /opt/kafka/cluster-ca-certs
          name: cluster-ca
        - mountPath: /opt/kafka/broker-certs
          name: broker-certs
        - mountPath: /opt/kafka/client-ca-certs
          name: client-ca-cert
        - mountPath: /opt/kafka/custom-config/
          name: kafka-metrics-and-logging
        - mountPath: /var/opt/kafka
          name: ready-files
      hostname: stage-kafka-2
      restartPolicy: Always
      schedulerName: default-scheduler
      securityContext:
        fsGroup: 0
      serviceAccountName: stage-kafka
      subdomain: stage-kafka-brokers
      terminationGracePeriodSeconds: 30
      volumes:
      - name: data
        persistentVolumeClaim:
          claimName: data-stage-kafka-2
      - emptyDir:
          medium: Memory
          sizeLimit: 5Mi
        name: strimzi-tmp
      - name: cluster-ca
        secret:
          defaultMode: 292
          secretName: stage-cluster-ca-cert
      - name: broker-certs
        secret:
          defaultMode: 292
          secretName: stage-kafka-brokers
      - name: client-ca-cert
        secret:
          defaultMode: 292
          secretName: stage-clients-ca-cert
      - configMap:
          name: stage-kafka-2
        name: kafka-metrics-and-logging
      - emptyDir:
          medium: Memory
          sizeLimit: 1Ki
        name: ready-files
  selector:
    matchLabels:
      strimzi.io/cluster: stage
      strimzi.io/kind: Kafka
      strimzi.io/name: stage-kafka
status:
  currentPods: 3
  observedGeneration: 4
  pods: 3
  readyPods: 3
It doesn't have a labelSelectorPath; it just creates several kind: Pod objects.
How can I use the VPA Recommender with kind: Pod?
@voelzmo, the initial design document had a LabelSelector in the VPA spec, but at some point it was replaced by targetRef.
I can understand the appeal of targetRef, but please consider restoring LabelSelector as an alternative way to discover pods. Not every CRD has a labelSelector in the /scale subresource, and some of them can't have one because it is meaningless to them: they use "internal" labels, not controlled by users, to track pods.
Hey @redbaron and @thdonatello
I may not have been specific enough with my explanations above, sorry! What you're looking for is the /scale subresource, which you can access with kubectl like this:
k get rollout hamster --subresource=scale -oyaml
apiVersion: autoscaling/v1
kind: Scale
metadata:
  creationTimestamp: "2023-04-05T08:23:41Z"
  name: hamster
  namespace: default
  resourceVersion: "6880972"
  uid: c32c42bc-f297-4a33-b4bb-22aeb48ecbef
spec:
  replicas: 2
status:
  replicas: 2
  selector: app=hamster,rollouts-pod-template-hash=74d4fbbc7f
The field that the VPA cares about is status.selector in the returned JSON, and this is defined in the CRD by subresources.scale.labelSelectorPath, as mentioned in the documentation I linked above.
You are correct that the labelSelector is no longer part of the VPA spec; instead, the VPA now gets the selector from the scale subresource, which in turn means that you have to use a targetRef that implements the scale subresource and provides the selector. Does this make sense?
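As a sketch of what this looks like from the VPA side, a recommendation-only VPA pointing at the hamster rollout could be written roughly as follows. The apiVersion argoproj.io/v1alpha1 assumes that the rollout is an Argo Rollouts resource, and the name hamster-vpa is made up for illustration; adjust both to match your cluster.

```yaml
apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  name: hamster-vpa                  # hypothetical name
  namespace: default
spec:
  targetRef:
    apiVersion: argoproj.io/v1alpha1 # assumption: Argo Rollouts
    kind: Rollout
    name: hamster
  updatePolicy:
    updateMode: "Off"                # compute recommendations only, never evict pods
```

With updateMode set to "Off", the recommender still publishes its recommendations in the status of the VPA object, which is enough if you only want to compare them against the current requests.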
@voelzmo, the /scale subresource has to be implemented by the CRD, and not every CRD that manages pods has it. For instance, Cloudnative PG doesn't: https://github.com/cloudnative-pg/cloudnative-pg/blob/main/releases/cnpg-1.20.1.yaml#L4337C4-L4340
The reason they don't have it is that it makes no sense for them: operators for these CRDs manage pods without a user-provided label selector, unlike Deployment or DaemonSet.
VPA without labelSelector cannot work with these CRDs
VPA without labelSelector cannot work with these CRDs
That's precisely what I'm saying above:
The only contract is that the Kind referenced via targetRef has to provide the /scale subresource. (...) TL;DR: There's nothing to be done from the VPA side, this can be enabled purely on the side of the custom resource you're using.
My understanding is that you're here to request an alternative way to specify the target for a VPA, which is different from what the original author was asking for (as far as I understood, the Strimzi components do implement the /scale subresource). Can you please open a feature request so we can split this discussion out and close this issue once it is resolved?
the Strimzi components do implement the /scale subresource
It is not enough just to have /scale; it needs to be /scale with labelSelectorPath defined. Both are optional for CRDs, and a CRD that omits either one is incompatible with the VPA, which is an unnecessary limitation.
Also, I don't use Strimzi, but it looks to me like it doesn't support /scale in any form at all: https://github.com/strimzi/strimzi-kafka-operator/blob/0d11bcd8a7b9bdea22dfa47d310649bbb8d5f528/api/src/main/java/io/strimzi/api/kafka/Crds.java#L166C14-L181
Plus, the opening post has an example of targetRef used with StrimziPodSet, and I assume it didn't work for the issue author.
Thank you for your responses.
I don't have /scale subresources in my k8s. I only use the recommender.
I did some research related to autoscaling Kafka with Strimzi and found this from the maintainers: https://github.com/orgs/strimzi/discussions/6635
It seems they don't think you can/should autoscale the Kafka resource with HPA or VPA, but you can apply it to a few of the other custom resources, which also implement the /scale subresource correctly. I'm closing this issue because I think the Strimzi-related questions should be answered; please re-open or let me know if that's not the case. Thanks!
/close /remove-kind feature /kind support
@voelzmo: Closing this issue.
@voelzmo Yes, auto-scaling Kafka is complicated. But we don't use the VPA for scaling. In our environments, teams can create Kafka clusters and manage their own resources. Unfortunately, they request over 9k CPU/RAM, and it costs too much.
We use the VPA (recommender) to compare requests against recommendations and send warnings to a team if its requests are incorrect. But we can't use targetRef StrimziPodSet in our case, and we don't have another kind.
Which component are you using?: recommender
Is your feature request designed to solve a problem? If so describe the problem this feature should solve.: I use the recommender to see recommended resources for my services. It's very helpful, but I can't get recommendations for Kafka pods because they are deployed as kind: StrimziPodSet (with the Strimzi Kafka Operator). I think it would be very cool to add kind: StrimziPodSet, like: