kimoonkim opened this issue 6 years ago
@kimoonkim I ran into this issue. How can I fix it? All of my datanodes now spam the logs with "2018-07-05 09:57:52,800 WARN org.apache.hadoop.hdfs.server.datanode.DataNode: Problem connecting to server: hadoop-hdfs-nn-1.hadoop-hdfs-nn.my-namespace.svc.cluster.local:9000"
@kimoonkim
I tried deleting the namenode pod that was currently the active NN, but the other NN could not be elected leader. The standby namenode attempted to become leader several times, and every attempt failed with the error below. Why does it report local host is: (unknown)?
2018-07-05 10:06:55,700 INFO org.apache.hadoop.ipc.Server: IPC Server handler 0 on 9000, call org.apache.hadoop.hdfs.protocol.ClientProtocol.setSafeMode from 172.102.3.63:54950 Call#0 Retry#0: org.apache.hadoop.ipc.StandbyException: Operation category READ is not supported in state standby
2018-07-05 10:07:22,646 INFO org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer: Triggering log roll on remote NameNode hadoop-hdfs-nn-1.hadoop-hdfs-nn.my-namespace.svc.cluster.local:9000
2018-07-05 10:07:22,646 WARN org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer: Unable to trigger a roll of the active NN
java.net.UnknownHostException: Invalid host name: local host is: (unknown); destination host is: "hadoop-hdfs-nn-1.hadoop-hdfs-nn.my-namespace.svc.cluster.local":9000; java.net.UnknownHostException; For more details see: http://wiki.apache.org/hadoop/UnknownHost
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
at org.apache.hadoop.net.NetUtils.wrapWithMessage(NetUtils.java:792)
at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:744)
at org.apache.hadoop.ipc.Client$Connection.<init>(Client.java:409)
at org.apache.hadoop.ipc.Client.getConnection(Client.java:1518)
at org.apache.hadoop.ipc.Client.call(Client.java:1451)
at org.apache.hadoop.ipc.Client.call(Client.java:1412)
at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:229)
at com.sun.proxy.$Proxy16.rollEditLog(Unknown Source)
at org.apache.hadoop.hdfs.protocolPB.NamenodeProtocolTranslatorPB.rollEditLog(NamenodeProtocolTranslatorPB.java:148)
at org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer.triggerActiveLogRoll(EditLogTailer.java:273)
at org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer.access$600(EditLogTailer.java:61)
at org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer$EditLogTailerThread.doWork(EditLogTailer.java:315)
at org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer$EditLogTailerThread.access$200(EditLogTailer.java:284)
at org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer$EditLogTailerThread$1.run(EditLogTailer.java:301)
at org.apache.hadoop.security.SecurityUtil.doAsLoginUserOrFatal(SecurityUtil.java:415)
at org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer$EditLogTailerThread.run(EditLogTailer.java:297)
Caused by: java.net.UnknownHostException
... 14 more
Output of /etc/hosts:
# cat /etc/hosts
# Kubernetes-managed hosts file.
127.0.0.1 localhost
::1 localhost ip6-localhost ip6-loopback
fe00::0 ip6-localnet
fe00::0 ip6-mcastprefix
fe00::1 ip6-allnodes
fe00::2 ip6-allrouters
172.101.3.207 hadoop-hdfs-nn-0.hadoop-hdfs-nn.my-namespace.svc.cluster.local hadoop-hdfs-nn-0
Output of $HOSTNAME:
# echo $HOSTNAME
hadoop-hdfs-nn-0
Could we use a ClusterIP service for the NameNode so that we have a non-changing IP address?
You could utilise this StatefulSet feature released in Kubernetes 1.9:
StatefulSet controller will create a label for each Pod in a StatefulSet. The label is named statefulset.kubernetes.io/pod-name and it is equal to the name of the Pod. This allows users to create a Service per Pod to expose a connection to individual Pods.
I can confirm the above suggestion works as expected. If you put a Service-per-pod in front of both the namenodes and the journalnodes, each with a selector targeting its pod, then they get fixed IP addresses and the DNS caching no longer causes issues (make sure to change the DNS names in the *-site.xml files to reflect the new naming scheme; see the hdfs-site.xml sketch after the example below).
e.g.
apiVersion: v1
kind: Service
metadata:
  name: hdfs-namenode-0
  labels:
    app: hdfs-namenode
    chart: hdfs-namenode-k8s-0.1.0
    release: hdfs
spec:
  ports:
    - port: 8020
      name: fs
    - port: 50070
      name: http
  selector:
    app: hdfs-namenode
    release: hdfs
    statefulset.kubernetes.io/pod-name: hdfs-namenode-0
(I wonder if this breaks data locality though)
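For reference, a minimal sketch of the matching hdfs-site.xml changes. The nameservice name (hdfs-k8s), namenode IDs (nn0/nn1), and namespace (default) are assumptions here, not what the chart necessarily uses; the point is that the RPC and HTTP addresses now resolve via the per-pod Services rather than the headless StatefulSet service:

<!-- hypothetical hdfs-site.xml fragment; nameservice, namenode IDs and namespace are assumed -->
<property>
  <name>dfs.nameservices</name>
  <value>hdfs-k8s</value>
</property>
<property>
  <name>dfs.ha.namenodes.hdfs-k8s</name>
  <value>nn0,nn1</value>
</property>
<property>
  <name>dfs.namenode.rpc-address.hdfs-k8s.nn0</name>
  <value>hdfs-namenode-0.default.svc.cluster.local:8020</value>
</property>
<property>
  <name>dfs.namenode.rpc-address.hdfs-k8s.nn1</name>
  <value>hdfs-namenode-1.default.svc.cluster.local:8020</value>
</property>
<property>
  <name>dfs.namenode.http-address.hdfs-k8s.nn0</name>
  <value>hdfs-namenode-0.default.svc.cluster.local:50070</value>
</property>
<property>
  <name>dfs.namenode.http-address.hdfs-k8s.nn1</name>
  <value>hdfs-namenode-1.default.svc.cluster.local:50070</value>
</property>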
Similar to #42. Even after a datanode successfully registers with the namenodes, some namenodes may restart. The datanodes would then be left with stale DNS entries in their local JVM cache.
We may have to tune the JVM DNS cache so entries expire soon enough. Or we could have the liveness probe also assert on the entry in the datanode JMX that maps the namenode hostname to its IP, i.e. if the mapping is stale, just let the datanode pod crash. For the second option, we would have to randomize the crash timepoints so we don't lose all datanodes simultaneously.
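A rough sketch of the first option, assuming the datanode image sources hadoop-env.sh and runs on an Oracle/OpenJDK JVM where the sun.net.inetaddr.ttl system properties apply:

# hypothetical hadoop-env.sh fragment (assumed to be read by the datanode container)
# Cap positive DNS lookups at 30 seconds and don't cache negative lookups,
# so a re-created namenode pod's new IP is picked up quickly.
export HADOOP_DATANODE_OPTS="$HADOOP_DATANODE_OPTS -Dsun.net.inetaddr.ttl=30 -Dsun.net.inetaddr.negative.ttl=0"

The same effect can also be achieved by setting the networkaddress.cache.ttl security property in the JVM's java.security file instead of a system property.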