grzesuav opened this issue 2 months ago
Original manifest from the cluster is below; our mutation webhook is adding

- key: non-existing-key
  operator: Exists

to prevent scheduling.
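For reference, a roughly equivalent change could be made by hand with a JSON patch; this is only a sketch of the expression the webhook injects (patching the AKS-managed object directly would likely be reconciled back, which is why the mutation happens at admission time):

# Sketch only: append the scheduling-blocking expression to the first nodeSelectorTerm
kubectl -n kube-system patch daemonset retina-agent --type json -p '[
  {"op": "add",
   "path": "/spec/template/spec/affinity/nodeAffinity/requiredDuringSchedulingIgnoredDuringExecution/nodeSelectorTerms/0/matchExpressions/-",
   "value": {"key": "non-existing-key", "operator": "Exists"}}
]'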
❯ k get daemonsets.apps -n kube-system retina-agent -o yaml
apiVersion: apps/v1
kind: DaemonSet
metadata:
  annotations:
    deprecated.daemonset.template.generation: "1"
    meta.helm.sh/release-name: aks-managed-kappie
    meta.helm.sh/release-namespace: kube-system
  creationTimestamp: "2024-08-24T14:13:08Z"
  generation: 1
  labels:
    app.kubernetes.io/managed-by: Helm
    helm.toolkit.fluxcd.io/name: kappie-adapter-helmrelease
    helm.toolkit.fluxcd.io/namespace: 64f7349df0994400019581c9
    k8s-app: retina
    kubernetes.azure.com/managedby: aks
  name: retina-agent
  namespace: kube-system
  resourceVersion: "839996452"
  uid: 1b9ce971-598a-4d23-b43f-9f2db17b8036
spec:
  revisionHistoryLimit: 10
  selector:
    matchLabels:
      k8s-app: retina
  template:
    metadata:
      annotations:
        prometheus.io/port: "10093"
        prometheus.io/scrape: "true"
      creationTimestamp: null
      labels:
        k8s-app: retina
        kubernetes.azure.com/managedby: aks
    spec:
      affinity:
        nodeAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            nodeSelectorTerms:
            - matchExpressions:
              - key: kubernetes.azure.com/cluster
                operator: Exists
              - key: kubernetes.azure.com/os-sku
                operator: NotIn
                values:
                - CBLMariner
              - key: kubernetes.azure.com/ebpf-dataplane
                operator: NotIn
                values:
                - cilium
              - key: type
                operator: NotIn
                values:
                - virtual-kubelet
              - key: kubernetes.io/os
                operator: In
                values:
                - linux
              - key: non-existing-key
                operator: Exists
      containers:
      - args:
        - --health-probe-bind-address=:18081
        - --metrics-bind-address=:18080
        - --config
        - /kappie/config/config.yaml
        command:
        - /kappie/controller
        env:
        - name: POD_NAME
          valueFrom:
            fieldRef:
              apiVersion: v1
              fieldPath: metadata.name
        - name: NODE_NAME
          valueFrom:
            fieldRef:
              apiVersion: v1
              fieldPath: spec.nodeName
        - name: NODE_IP
          valueFrom:
            fieldRef:
              apiVersion: v1
              fieldPath: status.hostIP
        image: mcr.microsoft.com/containernetworking/kappie-agent:v0.1.4
        imagePullPolicy: IfNotPresent
        name: retina
        ports:
        - containerPort: 10093
          protocol: TCP
        readinessProbe:
          failureThreshold: 3
          httpGet:
            path: /metrics
            port: 10093
            scheme: HTTP
          initialDelaySeconds: 15
          periodSeconds: 10
          successThreshold: 1
          timeoutSeconds: 1
        resources:
          limits:
            cpu: 500m
            memory: 300Mi
          requests:
            cpu: 100m
            memory: 200Mi
        securityContext:
          capabilities:
            add:
            - SYS_ADMIN
            - SYS_RESOURCE
            - NET_ADMIN
            - NET_RAW
            - IPC_LOCK
            drop:
            - ALL
          privileged: false
          runAsUser: 0
        terminationMessagePath: /dev/termination-log
        terminationMessagePolicy: File
        volumeMounts:
        - mountPath: /sys/kernel/debug
          name: debug
        - mountPath: /sys/kernel/tracing
          name: trace
        - mountPath: /sys/fs/bpf
          name: bpf
        - mountPath: /sys/fs/cgroup
          name: cgroup
        - mountPath: /tmp
          name: tmp
        - mountPath: /kappie/config
          name: config
      dnsPolicy: ClusterFirst
      hostNetwork: true
      initContainers:
      - image: mcr.microsoft.com/containernetworking/kappie-init:v0.1.4
        imagePullPolicy: IfNotPresent
        name: init-retina
        resources: {}
        securityContext:
          capabilities:
            drop:
            - ALL
          privileged: true
        terminationMessagePath: /dev/termination-log
        terminationMessagePolicy: FallbackToLogsOnError
        volumeMounts:
        - mountPath: /sys/fs/bpf
          mountPropagation: Bidirectional
          name: bpf
      restartPolicy: Always
      schedulerName: default-scheduler
      securityContext: {}
      serviceAccount: retina-agent
      serviceAccountName: retina-agent
      terminationGracePeriodSeconds: 30
      tolerations:
      - key: CriticalAddonsOnly
        operator: Exists
      - effect: NoExecute
        operator: Exists
      - effect: NoSchedule
        operator: Exists
      volumes:
      - hostPath:
          path: /sys/kernel/debug
          type: ""
        name: debug
      - hostPath:
          path: /sys/kernel/tracing
          type: ""
        name: trace
      - hostPath:
          path: /sys/fs/bpf
          type: ""
        name: bpf
      - hostPath:
          path: /sys/fs/cgroup
          type: ""
        name: cgroup
      - emptyDir: {}
        name: tmp
      - configMap:
          defaultMode: 420
          name: retina-config
        name: config
  updateStrategy:
    rollingUpdate:
      maxSurge: 0
      maxUnavailable: 1
    type: RollingUpdate
status:
  currentNumberScheduled: 0
  desiredNumberScheduled: 0
  numberMisscheduled: 0
  numberReady: 0
  observedGeneration: 1
Hi @grzesuav, does this cluster have AMA metrics enabled? https://learn.microsoft.com/en-us/azure/aks/network-observability-managed-cli?tabs=newer-k8s-versions#create-cluster
retina-agent will be installed on clusters with ama-metrics enabled and k8s versions >= 1.29.
If you have the aks-preview CLI, you can enable/disable ama-metrics; disabling looks like az aks update --disable-azure-monitor-metrics --name <cluster-name> --resource-group <resource-group>.
Other documentation on monitoring can be found here too: https://learn.microsoft.com/en-us/azure/azure-monitor/containers/kubernetes-monitoring-enable?tabs=cli
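If it helps, a quick way to confirm the state after toggling (a sketch; the --query path reflects the current shape of the az aks show output and may differ on older CLI versions):

# Check whether Azure Monitor managed metrics is still enabled on the cluster
az aks show --name <cluster-name> --resource-group <resource-group> --query azureMonitorProfile
# Once metrics is disabled, the managed retina-agent DaemonSet should eventually be removed
kubectl -n kube-system get daemonset retina-agent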
Hi, I want to keep control plane metrics; is there an option to remove only retina?
Also, since it is degrading node networking, why is it enabled automatically?
Since retina is now bundled with ama-metrics for k8s 1.29, the only way to disable retina-agent at the moment is to disable monitoring. The Retina team is currently investigating any perf issues regarding this.
We have tried to repro this internally; in multiple tests we were able to reproduce a ~20% drop for INTRA-node traffic (between pods running on the same node) and an almost negligible difference for INTER-node traffic (between pods running on different nodes).
In OSS Retina we are working on a performance pipeline so that the tests can be public: https://github.com/microsoft/retina/issues/655
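For anyone who wants to reproduce the intra- vs inter-node comparison before that pipeline lands, a rough sketch with iperf3 (the image name and node names are placeholders, not something specified in this issue):

# Server pod pinned to node A
kubectl run iperf-server --image=<iperf3-image> --overrides='{"spec":{"nodeName":"<node-a>"}}' --command -- iperf3 -s
SERVER_IP=$(kubectl get pod iperf-server -o jsonpath='{.status.podIP}')
# Intra-node: client pinned to the same node as the server
kubectl run iperf-intra --rm -it --image=<iperf3-image> --overrides='{"spec":{"nodeName":"<node-a>"}}' --command -- iperf3 -c "$SERVER_IP"
# Inter-node: client pinned to a different node
kubectl run iperf-inter --rm -it --image=<iperf3-image> --overrides='{"spec":{"nodeName":"<node-b>"}}' --command -- iperf3 -c "$SERVER_IP"
# Run once with retina-agent scheduled and once with it blocked, then compare the reported throughput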
Describe the bug
After an upgrade of the control plane we noticed degraded network throughput on nodes in some clusters. Some network-intensive pods were suffering from a lack of bandwidth. After some experimenting it was pinpointed to retina-agent being suddenly installed on our clusters.

To Reproduce
Expected behavior
Screenshots
Lower pod network throughput (screenshot not included here; see Additional context below).
Environment:
Additional context
Pod network throughput: at the moment of the drop, the retina agent was added to the nodes; after its removal, network throughput was restored to the original speed.
Currently we cannot remove just the retina agent in an official AKS way (or I am not aware of how to do it). I am unsure why it was installed in the first place after upgrading clusters to 1.29.
As a workaround we use an admission webhook to add a non-existing node selector that prevents the pods from being scheduled, but I want to: