Open wayne-cheng opened 1 week ago
@wayne-cheng, thanks for the thorough analysis!
We actually considered doing something like this early on in Calico, but we found that it only works if your network is permissive to unknown source IPs. If your network implements reverse path filtering (RPF), the pod-to-host traffic will be dropped. In any case, you get asymmetric routing where the return traffic goes over the tunnel. This can cause other problems, such as different MTUs for ingress and egress traffic. It was a deliberate trade-off: accept a potential performance hit in cases where skipping SNAT would have worked, rather than break multiple use cases where it would not.
We could definitely look into making this a configurable setting, but we shouldn't unconditionally disable SNAT for cluster hosts/nodes.
@coutinhop OK, I have now defined a `DisableHostSubnetNATExclusion` field in felixconfig. PTAL my submitted PR #8961.
// When set to true and ip pool setting `natOutgoing` is true, packets sent from Calico networked containers in this pool
// to cluster host subnet will not be excluded from being masqueraded. [Default: false]
DisableHostSubnetNATExclusion bool `json:"disableHostSubnetNATExclusion,omitempty"`
This way, when `felix` generates the SNAT iptables rules on a Linux node, it will use this field to decide whether the host subnet is excluded.
However, for Windows, I found that the NAT rules are applied when the CNI plugin is called to add the network, rather than being implemented by felix itself. If we make this a configurable setting, it will only be applied at Pod creation time, as stated in the Calico documentation.
The code (@song-jiang) that was previously removed placed this `windows_disable_host_subnet_nat_exclusion` logic in the CNI configuration file, which required individual settings on each host, so it may not be an ideal solution.
I think this config field can be placed in the global `FelixConfiguration` named `default`, and the Calico Windows CNI implementation can fetch it via `CalicoClient`. However, this approach would ignore any setting made in host configuration files or environment variables.
If you think this is feasible, I will proceed with implementing it for Windows immediately.
When I enable the `NatOutgoing` setting, I notice that the pod's traffic is also SNATed when it accesses local cluster hosts. I think this is unnecessary, as it causes some performance degradation. Below is the result of `tcpdump` capturing `ping` packets from the pod (`177.65.1.1`) to a cluster host (`192.168.1.83`), which confirms this behavior:
Pod (`177.65.1.1`) -> Host (`192.168.1.83`)
The pod (`177.65.1.1`):
The host (`192.168.1.84`) where the pod is deployed:
So, I manually modified the iptables rules generated by Calico to include matching for the cluster host addresses.
After making this change, the traffic from the Pod to the cluster hosts is no longer SNATed:
In the Calico source code, this is a simple change. I hope you can review my upcoming PR.
Recently, I have also been testing Calico for Windows and encountered the same issue. However, it is more severe there, as it prevents cluster hosts from connecting to Windows containers. Below is the Wireshark capture of ping packets on the Windows server (`192.168.1.74`):
Linux host (`192.168.1.83`) -> Windows container (`177.65.1.175`):
Windows container (`177.65.1.175`) -> Linux host (`192.168.1.83`):
I modified the `C:\Program Files\containerd\cni\conf` file to add the local cluster hosts (`192.168.1.0/24`) to the `ExceptionList`, and it resolved the problem (but it only removes the NAT).
The Calico documentation mentions the `natOutgoing` setting for Windows, but I found that it does not match the current behavior. Additionally, I noticed that `windows_disable_host_subnet_nat_exclusion` has been removed from the code, and I am unsure why this change was made.
I think we can add back the logic to exclude cluster hosts. If you agree, I'd like to do some testing and then submit another PR.
Your Environment