Hi all, apologies if this is a configuration error, or if it is in fact a Kubernetes issue rather than a K3OS one, but I have checked what I can.
Version (k3OS / kernel)
k3os version v0.21.5-k3s2r1
5.4.0-88-generic #99 SMP Tue Oct 5 16:53:38 UTC 2021
Architecture
x86_64
Describe the bug
I decided to spin up a new cluster, updating almost all versions of everything. My previous setup included K3OS running FluxCD and MetalLB, amongst other things. The cluster itself is working fine, insofar as it deploys and removes resources as expected, responds to the API, and FluxCD makes amendments as expected. Great, as per usual!
MetalLB version: 0.11.0
The issue is with the Linux Capability CAP_NET_RAW. The YAML includes the following on the container spec:
spec:
  containers:
  - args:
    - --port=7472
    - --config=config
    - --log-level=debug
    env:
    - name: METALLB_NODE_NAME
      valueFrom:
        fieldRef:
          fieldPath: spec.nodeName
    - name: METALLB_HOST
      valueFrom:
        fieldRef:
          fieldPath: status.hostIP
    - name: METALLB_ML_BIND_ADDR
      valueFrom:
        fieldRef:
          fieldPath: status.podIP
    # needed when another software is also using memberlist / port 7946
    # when changing this default you also need to update the container ports definition
    # and the PodSecurityPolicy hostPorts definition
    #- name: METALLB_ML_BIND_PORT
    #  value: "7946"
    - name: METALLB_ML_LABELS
      value: "app=metallb,component=speaker"
    - name: METALLB_ML_SECRET_KEY
      valueFrom:
        secretKeyRef:
          name: memberlist
          key: secretkey
    image: docker.io/bitnami/metallb-speaker:0.11.0
    name: speaker
    ports:
    - containerPort: 7472
      name: monitoring
    - containerPort: 7946
      name: memberlist-tcp
    - containerPort: 7946
      name: memberlist-udp
      protocol: UDP
    securityContext:
      allowPrivilegeEscalation: false
      capabilities:
        add:
        - NET_RAW
        drop:
        - ALL
      readOnlyRootFilesystem: true
  hostNetwork: true
The problem is that the container doesn't seem to be granted the CAP_NET_RAW capability, despite the YAML above.
This produces the following error in the logs of the MetalLB speaker container:
{"caller":"level.go:63","error":"creating ARP responder for \"eth0\": operation not permitted","interface":"eth0","level":"error","msg":"failed to create ARP responder","op":"createARPResponder","ts":"2021-12-21T12:04:04.608114098Z"}
And this of course results in MetalLB not functioning.
To double-check that the capability has not been granted, I exec'd into the container and ran the following:
cd /proc/1
cat status
...
CapPrm: 0000000000000000
CapEff: 0000000000000000
...
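For anyone wanting to decode those masks, here is a rough sketch of how the check could be scripted. CAP_NET_RAW is capability number 13 in linux/capability.h, so it corresponds to bit 13 of each hex mask. The function names has_cap and report_caps are hypothetical helpers made up for illustration, not part of any tool.

```shell
#!/bin/sh
# CAP_NET_RAW is capability number 13 (see linux/capability.h).
CAP_NET_RAW_BIT=13

# has_cap MASK BIT (hypothetical helper): prints "yes" if bit BIT is set
# in the hexadecimal capability MASK, otherwise "no".
has_cap() {
  if [ $(( (0x$1 >> $2) & 1 )) -eq 1 ]; then
    echo yes
  else
    echo no
  fi
}

# report_caps FILE (hypothetical helper): reads a /proc/<pid>/status style
# file and reports the real UID plus whether CAP_NET_RAW is present in the
# permitted, effective and bounding sets.
report_caps() {
  awk '/^(Uid|CapPrm|CapEff|CapBnd):/ { print $1, $2 }' "$1" |
  while read -r key value; do
    case "$key" in
      Uid:) echo "uid=$value" ;;
      *)    echo "${key%:} net_raw=$(has_cap "$value" "$CAP_NET_RAW_BIT")" ;;
    esac
  done
}

# The CapEff value from the output above has no bits set at all.
has_cap 0000000000000000 "$CAP_NET_RAW_BIT"   # prints "no"
```

Inside the container this could be run as report_caps /proc/1/status, assuming a shell and awk are available in the image. Comparing CapBnd against CapPrm and CapEff, together with the UID the process runs as, may help narrow down whether the capability was never passed to the container at all or was passed but not raised for the process.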
To Reproduce
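Install K3OS at the version specified above, then deploy MetalLB 0.11.0 with the default configuration from the provided YAML files:
kubectl apply -f https://raw.githubusercontent.com/metallb/metallb/v0.11.0/manifests/namespace.yaml
kubectl apply -f https://raw.githubusercontent.com/metallb/metallb/v0.11.0/manifests/metallb.yaml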
Expected behavior
Adding CAP_NET_RAW adds the privilege to the container.
Actual behavior
No CAP_NET_RAW permissions are granted, and no errors were observed in k3s-service.log or when describing the Pod.
Additional context
K3OS is running on a ProxMox VM, same as my previous cluster, which ran older versions (Kube v1.18) and had no issue.
It's quite hard for me to discern whether this is an issue and, if so, where it lies: the newer Kubernetes version, K3OS itself, or MetalLB.
I've more or less identified it as a permissions issue, as the capability it requires is not being applied. As for why not, I guess either something else is required for this version of Kubernetes, or it is something with K3OS.
The Kubernetes documentation does say that PodSecurityPolicy is deprecated in this release but won't be removed until Kubernetes v1.25. I would have expected some error about the capabilities if something else were required, and I've not observed any.
Perhaps I therefore need a different method of allowing containers to be granted these permissions, but from what I have read, PodSecurityPolicy should still be functional (https://kubernetes.io/blog/2021/04/06/podsecuritypolicy-deprecation-past-present-and-future/).
This leads me to at least consider that this is something K3OS related.