Closed: zzbmmbzz closed this issue 1 month ago
So, you've run into one of the rough edges of NetworkPolicy and Services (not just a Cilium thing, BTW). The behavior you're experiencing is probably correct, if totally unexpected.
You are connecting from pod A to pod B via a NodePort service. This means you do not connect to pod B, but to node 1's IP address:
graph LR
A[pod A]
B[pod B]
1[node 1]
A --src A, dst 1 --> 1 -- src 1, dst B --> B
So, node 1 is doing the service translation in this case, and because it defers the routing decision until after service translation, it can treat this as a Pod-to-Pod flow and preserve the source identity. This is the same as if you were to connect to a ClusterIP service.
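For concreteness, the kind of Service involved might look like this (a sketch only; the name and ports are taken from the issue description further down, the selector label is an assumption):

```yaml
apiVersion: v1
kind: Service
metadata:
  name: netpol2-nginx            # name taken from the issue report
spec:
  type: NodePort
  selector:
    app: netpol2-nginx           # assumed pod label
  ports:
    - port: 443                  # ClusterIP port
      targetPort: 443
      nodePort: 32044            # the port you hit on each node's IP
```

Connecting to `<node IP>:32044` lands on whichever node you addressed, and that node then performs the service translation.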
However, when you connect to the NodePort service on another node, the flow is different:
graph LR
A[pod A]
B[pod B]
1[node 1]
2[node 2]
A --src A, dst 2 --> 1 -- nat! src 1, dst 2 --> 2 -- src 2, dst B -->B
Because you are connecting to node 2, the traffic needs to exit node 1, which means it is NATted. That means the source IP is that of node 1.
The fix
There are two potential fixes:
- Allow access from the host and remote-node entities in your policy (a sketch follows below).
- Connect to a ClusterIP, not a NodePort.
Can you try these and see if they fix your problem?
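As a sketch of the first option (the policy name and backend label are assumptions; adapt them to your setup):

```yaml
apiVersion: cilium.io/v2
kind: CiliumNetworkPolicy
metadata:
  name: allow-nodeport-from-nodes   # hypothetical name
spec:
  endpointSelector:
    matchLabels:
      app: netpol2-nginx            # assumed label on the backend pods
  ingress:
    - fromEntities:
        - host                      # covers traffic NATted by the local node
        - remote-node               # covers traffic NATted by other nodes
      toPorts:
        - ports:
            - port: "443"
              protocol: TCP
```

Note that allowing host and remote-node is broader than allowing only pod A, since any traffic NATted by a node will then match.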
I connect from outside the cluster, not from inside the cluster, so "Connect to a ClusterIP" is not possible.
Regarding "Allow access from the host and remote-node entities in your policy": the cluster entity already includes host and remote-node. From the Cilium docs: "Cluster is the logical group of all network endpoints inside of the local cluster. This includes all Cilium-managed endpoints of the local cluster, unmanaged endpoints in the local cluster, as well as the host, remote-node, and init identities."
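In other words, a policy allowing the cluster entity would already cover both (a minimal sketch; the name and label are assumptions):

```yaml
apiVersion: cilium.io/v2
kind: CiliumNetworkPolicy
metadata:
  name: allow-from-cluster          # hypothetical name
spec:
  endpointSelector:
    matchLabels:
      app: netpol2-nginx            # assumed label
  ingress:
    - fromEntities:
        - cluster                   # includes host, remote-node, and all local endpoints
```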
Oh, my apologies, I didn't realize the connection was external to the cluster (even though you said it in the title). That does make it a bit more interesting.
I suspect we need to re-look-up the destination identity for policy, rather than trusting the identity from the GENEVE headers. I'll ask for a bit more info.
Aha, after chatting with the magnificent @networkop, he observed that this was fixed in v1.15 in #29155.
I note you are on a quite old cilium version; please consider upgrading. The fix was backported to v1.14.
Many thanks, @squeed. My issue was fixed after upgrading to v1.15.
Is there an existing issue for this?
What happened?
I applied a Cilium network policy and hit the problem below when trying to call a service inside the cluster from outside via NodePort.
Details: I have deployed the service netpol2-nginx on node 10.30.80.140 and exposed it via NodePort 443:32044. The Cilium network policy allows ingress from 10.164.33.200/32 to netpol2-nginx on port 443.
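The actual manifest is not included here, but a policy along those lines would look roughly like this (a hypothetical reconstruction; the name and labels are assumptions):

```yaml
apiVersion: cilium.io/v2
kind: CiliumNetworkPolicy
metadata:
  name: netpol2-nginx-allow         # hypothetical name
spec:
  endpointSelector:
    matchLabels:
      app: netpol2-nginx            # assumed label
  ingress:
    - fromCIDR:
        - 10.164.33.200/32          # the external client from the report
      toPorts:
        - ports:
            - port: "443"
              protocol: TCP
```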
What happens is: in case 2 the source identity does not look correct (identity 2, world); the expected identity is 16777221.
Is there any solution for this issue?
Thanks
How can we reproduce the issue?
SSH to 10.164.33.200 and try to connect.
Cilium helm values
Cilium Version
Client: 1.14.2 a6748946 2023-09-09T20:59:33+00:00 go version go1.20.8 linux/amd64
Daemon: 1.14.2 a6748946 2023-09-09T20:59:33+00:00 go version go1.20.8 linux/amd64
Kernel Version
Linux zl-dev-k8s-worker-10-30-80-157 5.14.0-162.6.1.el9_1.x86_64 #1 SMP PREEMPT_DYNAMIC Fri Nov 18 02:06:38 UTC 2022 x86_64 x86_64 x86_64 GNU/Linux
Kubernetes Version
Server Version: version.Info{Major:"1", Minor:"26", GitVersion:"v1.26.10+rke2r2", GitCommit:"b8609d4dd75c5d6fba4a5eaa63a5507cb39a6e99", GitTreeState:"clean", BuildDate:"2023-11-02T16:18:02Z", GoVersion:"go1.20.10 X:boringcrypto", Compiler:"gc", Platform:"linux/amd64"}
Regression
No response
Sysdump
No response
Relevant log output
No response
Anything else?
No response