Open felfa01 opened 1 week ago
@chasewilson FYI, I have noticed this with the ACNS feature.
@felfa01 Thanks for reporting. We are looking into this.
Thanks @felfa01. We were able to reproduce the issue on our end and are working on a fix for it.
The issue is that there are two networkpolicies applied to the same endpoint, and one of those policies does not contain any DNS rules. When we apply these 2 policies (in this case k6-enable-connection with DNS rules and bad-netpol without DNS rules), cilium agent creates the DNS redirection for k6-enable-connection and tries to reuse the same redirection for the bad-netpol policy. During policy recalculation, since the ACNS feature currently only supports DNS based policies, it starts failing the cilium agent because of the DNS policy being nil for bad-netpol.
We came across this problem naturally running in one of our clusters. I can confirm this bug exists and the short-term fix is to remove the NetworkPolicy that has a DNS egress on it.
Confirming that the short term fix is to remove the NetworkPolicy that has a DNS egress specified.
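An added example, not part of the thread and not AKS or Cilium code: a script that lists Kubernetes NetworkPolicy objects with an egress rule covering UDP port 53, the kind of policy the short-term fix above refers to. The function names and the sample policies are constructed, and matching `endPort` ranges goes beyond the single-port case reported here.

```python
import json

# Constructed sample in the shape of `kubectl get networkpolicy -A -o json`.
SAMPLE = json.loads("""
{"items": [
 {"metadata": {"namespace": "demo", "name": "allow-dns"},
  "spec": {"egress": [{"ports": [{"port": 53, "protocol": "UDP"}]}]}},
 {"metadata": {"namespace": "demo", "name": "web-only"},
  "spec": {"egress": [{"ports": [{"port": 443, "protocol": "TCP"}]}]}},
 {"metadata": {"namespace": "ops", "name": "udp-range"},
  "spec": {"egress": [{"ports": [{"port": 50, "endPort": 60, "protocol": "UDP"}]}]}},
 {"metadata": {"namespace": "ops", "name": "tcp-53-default"},
  "spec": {"egress": [{"ports": [{"port": 53}]}]}},
 {"metadata": {"namespace": "ops", "name": "ingress-only"},
  "spec": {"ingress": [{}]}}
]}
""")

def covers_udp_53(entry):
    """True if one NetworkPolicyPort entry allows UDP port 53."""
    if entry.get("protocol", "TCP") != "UDP":  # protocol defaults to TCP
        return False
    port = entry.get("port")
    if not isinstance(port, int):  # omitted or named ports are not resolved here
        return False
    return port <= 53 <= entry.get("endPort", port)

def dns_egress_policies(doc):
    """Return sorted 'namespace/name' of policies with an egress rule on UDP/53."""
    hits = set()
    for item in doc.get("items", []):
        for rule in item.get("spec", {}).get("egress") or []:
            if any(covers_udp_53(p) for p in rule.get("ports") or []):
                meta = item["metadata"]
                hits.add(f'{meta.get("namespace", "default")}/{meta["name"]}')
    return sorted(hits)

if __name__ == "__main__":
    # Real use: save `kubectl get networkpolicy -A -o json` to a file and load it.
    for name in dns_egress_policies(SAMPLE):
        print(name)
```
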
Describe the bug
When running an AKS cluster with Advanced Container Networking Services (ACNS) and deploying a NetworkPolicy configured to allow DNS egress, cilium agent pods are going into a crashing state.
To Reproduce
NetworkPolicy configured to allow egress to port 53 with protocol UDP.
kubectl get pods -n kube-system and see that cilium pods are in a crashing state:
Environment (please complete the following information):
Additional context
Error log: