Open QxBytes opened 5 months ago
securityContext:
capabilities:
drop:
- ALL
add:
- NET_ADMIN # only necessary for delegated IPAM/Cilium
- NET_RAW # only necessary for delegated IPAM/Cilium
and make sure that we have a test that's rebooting the Node and verifying CNS functionality afterwards
https://kubernetes.io/docs/tasks/configure-pod-container/security-context/#configure-volume-permission-and-ownership-change-policy-for-pods sounds like it would let the k8s chmod the directory when the CNS Pod mounts it as a Volume?
This issue is stale because it has been open for 2 weeks with no activity. Remove stale label or comment or this will be closed in 7 days
This issue is stale because it has been open for 2 weeks with no activity. Remove stale label or comment or this will be closed in 7 days
This issue is stale because it has been open for 2 weeks with no activity. Remove stale label or comment or this will be closed in 7 days
Node Rebooting Test added in: #2901
Backport PR can be created for release/v1.5
, but the validate package within release/v1.4
needs to be updated to consume the work. Will leave this open until both have been merged
This issue is stale because it has been open for 2 weeks with no activity. Remove stale label or comment or this will be closed in 7 days
@QxBytes looks like this was fully backported, is it good to resolve?
release/v1.4
as the testing relies on the validation package being updated. Leaving this open until that work is complete and all E2E within this release train can validate Linux properly.This issue is stale because it has been open for 2 weeks with no activity. Remove stale label or comment or this will be closed in 7 days
Issue closed due to inactivity.
What happened: On the first boot, no CNI binary is on the node, and so k8s creates the /var/run/azure-vnet directory with 0755 permissions automatically because it is a mount part of the azure-cns daemonset. Then the CNI is deployed. The /var/run directory is not preserved between reboots. Then, when the VM reboots, the CNI binary may run before k8s creates the /var/run/azure-vnet directory. When the CNI binary runs first, it creates the directory with 0644 permissions. This causes permission denied errors for the cns. Even if k8s creates/mounts the /var/run/azure-vnet directory later, it will see it already exists and won't recreate the directory with the 0755 permissions.
What you expected to happen:
The CNI binary should create the directory with 0755 permissions.
How to reproduce it:
Reboot the VM with the cns capabilities security context dropping all capabilities (so it doesn't bypass permission checks). There is a chance that the azure-cns pod will get stuck in crash loop backoff.
Orchestrator and Version (e.g. Kubernetes, Docker):
Operating System (Linux/Windows):
Kernel (e.g.
uanme -a
for Linux or$(Get-ItemProperty -Path "C:\windows\system32\hal.dll").VersionInfo.FileVersion
for Windows):Anything else we need to know?: [Miscellaneous information that will assist in solving the issue.]