Open echarles opened 6 years ago
Hi @echarles. Thanks for trying out the plugin.
Yes, the plugin is supposed to return a network path for a cluster node name or pod IP address. The output you shared has some good results, but it also has bad results:
/default-rack/ip-10-0-0-204
/default-rack/ip-10-0-2-115
/default-rack/ip-10-0-2-230
/default-rack/ip-10-0-2-246
/default-rack/ip-10-0-3-43
/default-rack/ip-10-0-3-44
/default-rack/ip-10-0-3-74
The above paths are returned for cluster nodes, and they are correct.
However, the following entries, returned for pod IP addresses, are bad responses. They are basically the default value indicating lookup failure.
/default-rack/default-nodegroup
/default-rack/default-nodegroup
/default-rack/default-nodegroup
/default-rack/default-nodegroup
First of all, the plugin can handle only pod IP addresses as input, not pod names, because kube-dns does not support pod-name-to-IP translation. So only the first input, "10.0.3.74", would be a valid input. But even for that we get the default path, so something is wrong.
What network provider are you using? The plugin is known to work with the kubenet provider, FYI.
More importantly, is your network provider correctly setting podCIDR on the cluster nodes? You can check this with the following command; the podCIDR value is set inside each node's spec:
$ kubectl get nodes -o json | grep -i cidr
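For context, here is a minimal sketch of what this kind of podCIDR-based lookup does: match a pod IP against each node's podCIDR and return that node's network path, falling back to the default path on failure. This is hypothetical Python, not the plugin's actual API; the function and variable names and the CIDR values are illustrative.

```python
import ipaddress

# Default path returned when no node's podCIDR contains the pod IP
# (the "lookup failure" value seen in the output above).
DEFAULT_PATH = "/default-rack/default-nodegroup"

def resolve(pod_ip, node_cidrs, rack="/default-rack"):
    """node_cidrs maps node name -> podCIDR string taken from each node spec."""
    ip = ipaddress.ip_address(pod_ip)
    for node, cidr in node_cidrs.items():
        if ip in ipaddress.ip_network(cidr):
            return f"{rack}/{node}"
    return DEFAULT_PATH

# Example with made-up podCIDR assignments:
cidrs = {"ip-10-0-3-44": "10.0.3.0/24", "ip-10-0-2-115": "10.0.2.0/24"}
print(resolve("10.0.3.74", cidrs))   # -> /default-rack/ip-10-0-3-44
print(resolve("172.16.0.1", cidrs))  # -> /default-rack/default-nodegroup
```

If the node specs carry wrong or missing podCIDR values, every pod IP falls through to the default path, which matches the failure mode above.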
@kimoonkim Thank you for your support, clear explanations and insights.
So the parameter of the resolve method must be an IP address, not a pod name. (By the way, it would be good to add a small Javadoc line to the PodCIDRToNodeMapping class. Before your explanation, I had to go to the superclass and read "Resolves a list of DNS-names/IP-addresses and returns back a list of switch information (network paths)", hence my tests.)
By the way, you say kube-dns does not support this lookup (name to IP). Is there a way to get this working?
I have redeployed a new cluster with Calico enabled (like the previous one) with kubectl apply -f https://docs.projectcalico.org/v2.6/getting-started/kubernetes/installation/hosted/kubeadm/1.6/calico.yaml.
The nodes:
NAME STATUS ROLES AGE VERSION
ip-10-0-0-108.us-west-2.compute.internal Ready <none> 2h v1.8.4
ip-10-0-0-128.us-west-2.compute.internal Ready <none> 3h v1.8.4
ip-10-0-0-199.us-west-2.compute.internal Ready <none> 3h v1.8.4
ip-10-0-0-210.us-west-2.compute.internal Ready master 3h v1.8.4
ip-10-0-0-36.us-west-2.compute.internal Ready <none> 2h v1.8.4
ip-10-0-0-84.us-west-2.compute.internal Ready <none> 2h v1.8.4
ip-10-0-0-93.us-west-2.compute.internal Ready <none> 2h v1.8.4
Now, with
val networkPathDirs = plugin.resolve(List(
"ip-10-0-0-108.us-west-2.compute.internal",
"ip-10-0-0-128.us-west-2.compute.internal",
"ip-10-0-0-199.us-west-2.compute.internal",
"ip-10-0-0-210.us-west-2.compute.internal",
"ip-10-0-0-36.us-west-2.compute.internal",
"ip-10-0-0-84.us-west-2.compute.internal",
"ip-10-0-0-93.us-west-2.compute.internal",
"unkown"
))
networkPathDirs.foreach(println)
I receive:
/default-rack/ip-10-0-0-108
/default-rack/ip-10-0-0-128
/default-rack/ip-10-0-0-199
/default-rack/ip-10-0-0-210
/default-rack/ip-10-0-0-36
/default-rack/ip-10-0-0-84
/default-rack/ip-10-0-0-93
/default-rack/default-nodegroup
This validates the test (network paths are correctly resolved), and, good news, hdfs --loglevel DEBUG dfs -cat /hosts shows a connection to the local datanode.
The test is quite manual. Have you thought of a way to automate this (something like an HdfsLocalityTest class)?
For info, kubectl get nodes -o json | grep -i cidr returns the following, which sounds good to me:
"podCIDR": "192.168.7.0/24",
"podCIDR": "192.168.2.0/24",
"podCIDR": "192.168.1.0/24",
"podCIDR": "192.168.0.0/24",
"podCIDR": "192.168.10.0/24",
"podCIDR": "192.168.8.0/24",
"podCIDR": "192.168.9.0/24",
Hmm, actually, my setup resolves hostname -> network path. It does not resolve pod IP address -> network path. Am I missing something with this Calico setup?
Ah, you are using Calico. I believe Calico does not need this PodCIDRToNodeMapping plugin. And I remember seeing Calico set podCIDRs to wrong values. That's probably why the plugin does not resolve pod IPs in your test.
I suggest checking Calico's nat-outgoing option. When I tried Calico on EC2 using kops, kops set nat-outgoing automatically. Your HDFS namenode may already work without this plugin. From the README.md of the plugin dir:
Calico is a popular non-overlay network provider. It turns out Calico can be also configured to do NAT between pod subnet and node subnet thanks to the nat-outgoing option. The option can be easily turned on and is enabled by default.
@kimoonkim I've given Calico another try, explicitly setting nat-outgoing with calicoctl, and I get the same result (only hostnames are resolved).
cat << EOF | calicoctl apply -f -
apiVersion: v1
kind: ipPool
metadata:
  cidr: 192.168.0.0/16
spec:
  ipip:
    enabled: true
  nat-outgoing: true
EOF
Same with Flannel (only hostnames are resolved), and I am even losing locality when I run hdfs dfs -cat with --loglevel DEBUG.
If you use Calico with nat-outgoing, or Flannel, then you do not need this PodCIDRToNodeMapping plugin, and the plugin would not do the right thing for those network providers. The namenode can see the physical IP addresses of the underlying K8s cluster nodes without the plugin, because those network providers rewrite pod packets, replacing pod IPs with cluster node IPs. Using the plugin only confuses the namenode.
Can you please remove the plugin from the namenode and see if the data locality works?
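For what it's worth, a topology plugin like this is normally wired into the namenode through Hadoop's net.topology.node.switch.mapping.impl property in core-site.xml, so removing (or reverting) that entry should disable it. The fragment below is a sketch: the fully-qualified class name is illustrative, so use whatever value your core-site.xml currently carries.

```xml
<!-- core-site.xml: a topology plugin is typically enabled like this.
     Removing this property reverts the namenode to Hadoop's default
     mapping. The class name below is illustrative, not the plugin's
     actual fully-qualified name. -->
<property>
  <name>net.topology.node.switch.mapping.impl</name>
  <value>PodCIDRToNodeMapping</value>
</property>
```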
I have tested the PodCIDRToNodeMapping plugin on a K8s cluster deployed with the provided charts. From my findings, I wonder if it behaves as expected: I have to give the node name as input to the PodCIDRToNodeMapping.resolve method to get back the paths (see details hereafter).
So is it correct that the mapping is: nodename -> path?
From the zeppelin pod, I run:
and get back: