Closed wilsonianb closed 3 years ago
@williamsandrew thanks for reporting this! I ran into this as well. I figure it would be useful for anyone who knows enough to try to resolve this to be able to reproduce it, so here is how.
Ubuntu 20.04, k3s v1.17.4+k3s1 (3eee8ac3) with `--docker`, and Docker version 19.03.8.
The busybox-without-access pod had access, and then after waiting for two minutes, the busybox-with-access pod failed to get access during the initial ~40 seconds before eventually getting access.
I see two issues which may be related. (I also tried without the `--docker` flag; the bug reproduced in the same way.)
We are using an implementation called kube-router (https://github.com/cloudnativelabs/kube-router/blob/856c7d762a73df557e0a1d35721f48fe8ba7925d/pkg/controllers/netpol/network_policy_controller.go#L58) which includes a syncPeriod as part of its operation. Please feel free to open an issue there if it makes sense.
Thanks. It looks like there's already a related issue: https://github.com/cloudnativelabs/kube-router/issues/873
I just realized (even though it's mentioned in the kube-router issue I linked above) that this does not appear to be an issue with every network policy rule. For example, there does not appear to be a delay when an ingress rule restricts a new pod from connecting to another pod. (However, there may be a delay in the rule permitting a new pod to connect to another existing one.)
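One way to quantify the delay described above is to start a pod covered by a deny-egress policy and poll until egress is actually blocked. This is a rough sketch, not from the thread; the `app=foo` label and the `web` service name are assumptions borrowed from the reproduction recipe, and it requires a running cluster:

```shell
# Start a pod that the deny-egress policy should cover (names assumed).
kubectl run probe --image=busybox --labels=app=foo --restart=Never \
  --command -- sleep 3600
kubectl wait --for=condition=Ready pod/probe

# Poll until the egress request stops succeeding, then report elapsed time.
start=$(date +%s)
until ! kubectl exec probe -- wget -qO- --timeout 1 http://web:80/ >/dev/null 2>&1; do
  sleep 1
done
echo "policy enforced after $(( $(date +%s) - start ))s"
```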
Reproduced the issue in k3s version v1.17.4+k3s1:

```
kubectl get networkpolicy
NAME              POD-SELECTOR   AGE
foo-deny-egress   app=foo        45s
```

```
/ # wget -qO- --timeout 1 http://web:80/
<!DOCTYPE html>
<html>
<head>
<title>Welcome to nginx!</title>
<style>
```
Validated on k3s v1.20.0-rc5+k3s1: there is no delay in policy enforcement. Egress requests from the labeled pod do not succeed after pod startup when the deny-egress policy is applied.
It looks like there is an `--iptables-sync-period` option on kube-router: https://www.kube-router.io/docs/user-guide/
If I'm understanding the description correctly, it sounds like changing this value could lower the amount of time it takes kube-router to enforce NetworkPolicy changes. Is it possible to make this option configurable in k3s?
Not at the moment. With recent updates to the network policy controller, the out-of-the-box settings should be sufficient.
Thanks Brandon, I didn't catch the end of Shylaja's message above. I verified that there is no NetworkPolicy delay on k3d v4.3.0 with k3s v1.20.4-k3s1 👍
It seems like this is still happening on recent versions.
Longhorn mentions this in their documentation:

> Note: Depending on your CNI for cluster network, there might be some delay when Kubernetes applying network policies to the pod. This delay may fail Longhorn recurring job for taking Snapshot or Backup of the Volume since it cannot access longhorn-manager in the beginning. This is a known issue found in K3s with Traefik and is beyond Longhorn control.
I hit this with a policy that allows ingress only from `kube-system` and pods in the same namespace:

```yaml
kind: NetworkPolicy
apiVersion: networking.k8s.io/v1
metadata:
  namespace: mynamespace
  name: deny-from-other-namespaces
spec:
  podSelector:
    matchLabels:
  ingress:
  - from:
    - podSelector: {}
    - namespaceSelector:
        matchExpressions:
        - key: kubernetes.io/metadata.name
          operator: In
          values: ["kube-system"]
```
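To check the policy's intended behavior (once it has synced), one can probe it from both sides. This is a sketch with assumed names: the manifest filename and `some-service` are placeholders, and `other` stands for any namespace besides `mynamespace` and `kube-system`:

```shell
# Apply the policy (filename assumed).
kubectl apply -f deny-from-other-namespaces.yaml

# From inside mynamespace: should be allowed.
kubectl -n mynamespace run t1 --rm -it --image=busybox --restart=Never \
  -- wget -qO- --timeout 1 http://some-service.mynamespace/

# From another namespace: should be denied (times out).
kubectl -n other run t2 --rm -it --image=busybox --restart=Never \
  -- wget -qO- --timeout 1 http://some-service.mynamespace/
```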
Then, in `mynamespace`, create a job that connects to another pod in the same namespace. The connection will be refused during the initial window. As a workaround, an initContainer that sleeps before the main container starts gives the policy time to sync:

```yaml
initContainers:
- name: sleep
  image: alpine:latest
  command: ["sleep", "5s"]
```
This is a limitation of network policies. Sync of policy rules is not instantaneous and does not block pod startup. Kubernetes is built around the concept of eventual consistency. Any component that has a dependency on the state of another component should be able to retry as necessary until that component becomes available, or reachable in this case.
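The retry pattern the comment above recommends can be sketched roughly as follows. This is a minimal illustration, not k3s or kube-router code; the `connect` callable and the timings are assumptions:

```python
import time

def retry_until_available(connect, attempts=5, base_delay=0.1):
    """Call connect() until it stops raising, backing off exponentially.

    Mirrors the eventual-consistency advice: a dependent component keeps
    retrying until its dependency becomes reachable.
    """
    for attempt in range(attempts):
        try:
            return connect()
        except ConnectionError:
            if attempt == attempts - 1:
                raise  # give up after the final attempt
            time.sleep(base_delay * (2 ** attempt))

# Toy stand-in for a service that only becomes reachable after a delay,
# like a pod whose allow rule has not synced yet:
calls = {"n": 0}
def flaky_connect():
    calls["n"] += 1
    if calls["n"] < 3:
        raise ConnectionError("not reachable yet")
    return "ok"

print(retry_until_available(flaky_connect))  # prints "ok" after two failed tries
```

In a pod this would typically be a readiness/startup retry loop in the entrypoint rather than application code, but the shape is the same.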
Version:
k3s version v0.10.0 (f9888ca3)

Describe the bug
There appears to be a period of time (<1 minute) after a pod is started during which applicable network policies are not enforced.

To Reproduce
Steps to reproduce the behavior: https://github.com/ahmetb/kubernetes-network-policy-recipes/blob/master/11-deny-egress-traffic-from-an-application.md

Expected behavior
All egress requests (via `wget`) from the `app=foo` labeled pod should fail with either `bad address` or `download timed out` (depending on the network policy in use).

Actual behavior
Egress requests (via `wget`) from the `app=foo` labeled pod initially succeed after pod startup before eventually failing as expected.

Additional context