Was your nginx pod set up with the DSR annotation by any chance?
Closing as stale.
@murali-reddy We recently had a message in Slack that brought this issue back up. I'm able to reproduce it myself by applying a network policy to a pod and then bouncing traffic to it through another node.
Here is the setup, context, and tcpdump, courtesy of nexus in Slack:
Environment context:
Client Host: 192.168.0.100
Node Without Pod: 192.168.122.250
Node With Pod: 192.168.122.167
Pod IP of Service: 10.122.1.21
Service Port: 8080
Traffic is sent from Client Host to Node Without Pod, where the traffic gets SNAT'd through IPVS (default behavior of IPVS) and sent to Node With Pod. On Node With Pod it gets denied by the network policy, since the source address is now the address of Node Without Pod instead of Client Host.
Deployment Setup:
---
apiVersion: v1
kind: Service
metadata:
  name: hello-hello-app
spec:
  type: NodePort
  ports:
    - port: 8080
      nodePort: 32000
      targetPort: http
      protocol: TCP
      name: http
  selector:
    app.kubernetes.io/name: hello-app
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: hello-hello-app
  labels:
    app.kubernetes.io/name: hello-app
spec:
  replicas: 1
  selector:
    matchLabels:
      app.kubernetes.io/name: hello-app
  template:
    metadata:
      labels:
        app.kubernetes.io/name: hello-app
    spec:
      nodeSelector:
        kubernetes.io/hostname: k8s-2
      containers:
        - name: hello-app
          image: "nextsux/hello-app:1"
          imagePullPolicy: IfNotPresent
          ports:
            - name: http
              containerPort: 8080
              protocol: TCP
---
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: "troublemaker"
spec:
  podSelector:
    matchLabels:
      app.kubernetes.io/name: hello-app
  policyTypes:
    - Ingress
  ingress:
    - from:
        - ipBlock:
            cidr: 192.168.0.100/32
tcpdump from Node Without Pod
23:38:57.185836 IP 192.168.0.100.49392 > 192.168.122.250.32000: Flags [S], seq 1938619889, win 64240, options [mss 1460,sackOK,TS val 2486647264 ecr 0,nop,wscale 7], length 0
23:38:57.185906 IP 192.168.122.250.55937 > 10.122.1.21.8080: Flags [S], seq 1938619889, win 64240, options [mss 1460,sackOK,TS val 2486647264 ecr 0,nop,wscale 7], length 0
tcpdump from Node With Pod
23:38:57.221024 IP 192.168.122.250.55937 > 10.122.1.21.8080: Flags [S], seq 1938619889, win 64240, options [mss 1460,sackOK,TS val 2486647264 ecr 0,nop,wscale 7], length 0
I would imagine that this is a common problem for all k8s network frameworks. Do you happen to have any knowledge of how Calico or others address this?
@aauren Whether it's kube-proxy or kube-router acting as the service proxy, when an external client accesses the service the traffic is SNAT'd to ensure symmetric routing (i.e. the return traffic goes through the same node).
Please see https://kubernetes.io/docs/tutorials/services/source-ip/#source-ip-for-services-with-type-nodeport. It's an inherent problem.
One can use services with externalTrafficPolicy=local set to retain the source IP so that network policies can be enforced. Direct Server Return is another option where the client IP is retained and network policies can be enforced.
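For illustration, here is a hedged sketch of the hello-hello-app Service from the reproduction above with externalTrafficPolicy set to Local; the commented annotation reflects my understanding of kube-router's DSR opt-in and should be verified against the kube-router documentation before use:
---
# Sketch only: the same Service as in the reproduction, with the external
# traffic policy changed so that traffic is only delivered to nodes that
# run a local endpoint and the client source IP is preserved for network
# policy evaluation.
apiVersion: v1
kind: Service
metadata:
  name: hello-hello-app
  # annotations:
  #   kube-router.io/service.dsr: tunnel   # assumed DSR annotation; check the kube-router docs
spec:
  type: NodePort
  externalTrafficPolicy: Local   # preserve the client source IP; no SNAT hop through other nodes
  ports:
    - port: 8080
      nodePort: 32000
      targetPort: http
      protocol: TCP
      name: http
  selector:
    app.kubernetes.io/name: hello-app
Note that with Local, a node that has no local endpoint for the service will not forward the traffic at all, which matters for setups that advertise the service VIP from every node.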
@murali-reddy That makes sense. Given that the k8s documentation describes this as a pitfall of proxied service traffic, it seems to me that this is just an accepted problem upstream.
Two things that it would be worth getting your opinion on:
Do you think that there is any place where it would be appropriate to mention this in our documentation with a link to the upstream reference?
Agree. That should be documented.
kube-router already keeps an ipset with all of the node IPs in it. At a logical level it would be pretty easy for kube-router to allow traffic from this ipset via a kube-router annotation on a network policy or service.
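For context (see the response below), the manual equivalent of that idea would be to add the node network to the policy's ipBlock list; the 192.168.122.0/24 CIDR here is inferred from the node addresses in the reproduction above, and this is only a sketch of the trade-off, not a recommendation:
---
# Sketch only: manually allowing the node network so that SNAT'd traffic
# from other nodes is not dropped. This is roughly what an automatic
# "allow node IPs" annotation would generate, and it shows the downside
# raised below: every node, not just the real client, gains access.
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: troublemaker
spec:
  podSelector:
    matchLabels:
      app.kubernetes.io/name: hello-app
  policyTypes:
    - Ingress
  ingress:
    - from:
        - ipBlock:
            cidr: 192.168.0.100/32   # the real client
        - ipBlock:
            cidr: 192.168.122.0/24   # assumed node network, inferred from the node IPs above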
I am afraid that would give nodes (e.g. a compromised node) unrestricted access to the pod, which is not desirable. In general, the problem of preserving the source IP is not specific to Kubernetes. AFAIK there is no one-size-fits-all solution.
In the case of Kubernetes, setting externalTrafficPolicy=local for all the frontend services (those that receive north-south traffic) seems to be the common practice. See, for example, https://github.com/kubernetes/enhancements/issues/27
I've just tried with Calico and I can confirm Calico has the same issue, @aauren.
Hi, we use kube-router to advertise the service and pod CIDRs with BGP. Now we want to limit access to the pod via a network policy.
Example deployment:
Example policy:
Default deny:
The service VIP is announced via anycast from all nodes, but it only works when a client from 172.17.88.0/24 randomly hits the k8s-worker-3 node, which has the nginx pod on it. All other nodes forward the incoming traffic from the nginx service IP towards the nginx pod IP using their own node IP as the source, so it never matches the network policy rule because the source IP is completely different.
Maybe someone can give me a hint to resolve this issue?