apache-spark-on-k8s / kubernetes-HDFS

Repository holding configuration files for running an HDFS cluster in Kubernetes
Apache License 2.0

Shutting down DataNode at java.net.UnknownHostException #62

Open maver1ck opened 6 years ago

maver1ck commented 6 years ago

I'm getting the following exception when running this chart on AWS. The important part is that it happens on only one node (of the 3 available). Destroying and recreating the cluster didn't help.

18/10/21 15:50:57 FATAL datanode.DataNode: Exception in secureMain
java.net.UnknownHostException: ip-10-111-10-6.eu-west-1.compute.internal.kubelet.kube-system.svc.cluster.local: ip-10-111-10-6.eu-west-1.compute.internal.kubelet.kube-system.svc.cluster.local: Name or service not known
        at java.net.InetAddress.getLocalHost(InetAddress.java:1505)
        at org.apache.hadoop.security.SecurityUtil.getLocalHostName(SecurityUtil.java:190)
        at org.apache.hadoop.security.SecurityUtil.login(SecurityUtil.java:210)
        at org.apache.hadoop.hdfs.server.datanode.DataNode.instantiateDataNode(DataNode.java:2255)
        at org.apache.hadoop.hdfs.server.datanode.DataNode.createDataNode(DataNode.java:2304)
        at org.apache.hadoop.hdfs.server.datanode.DataNode.secureMain(DataNode.java:2481)
        at org.apache.hadoop.hdfs.server.datanode.DataNode.main(DataNode.java:2505)
Caused by: java.net.UnknownHostException: ip-10-111-10-6.eu-west-1.compute.internal.kubelet.kube-system.svc.cluster.local: Name or service not known
        at java.net.Inet4AddressImpl.lookupAllHostAddr(Native Method)
        at java.net.InetAddress$2.lookupAllHostAddr(InetAddress.java:928)
        at java.net.InetAddress.getAddressesFromNameService(InetAddress.java:1323)
        at java.net.InetAddress.getLocalHost(InetAddress.java:1500)
        ... 6 more
18/10/21 15:50:57 INFO util.ExitUtil: Exiting with status 1
18/10/21 15:50:57 INFO datanode.DataNode: SHUTDOWN_MSG:
/************************************************************
SHUTDOWN_MSG: Shutting down DataNode at java.net.UnknownHostException: ip-10-111-10-6.eu-west-1.compute.internal.kubelet.kube-system.svc.cluster.local: ip-10-111-10-6.eu-west-1.compute.internal.kubelet.kube-system.svc.cluster.local: Name or service not known
************************************************************/

Any ideas?

Bowiemb commented 5 years ago

We're having a similar issue.

tmcancode commented 5 years ago

I'm having a similar issue :( @kimoonkim, could you help?

kimoonkim commented 5 years ago

This sounds similar to #48, which discusses a DNS issue and possible solutions. PTAL.
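
To check whether a given pod hits the same lookup failure outside of the DataNode, the JDK call at the top of the stack trace (InetAddress.getLocalHost) can be run in isolation. This is a minimal sketch, assuming a JDK is available in the container; the class name LocalHostCheck and the describe() helper are hypothetical and not part of this repository or of Hadoop.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;

// Hypothetical diagnostic class, not part of Hadoop or this chart.
class LocalHostCheck {

    // Runs the same JDK lookup that fails in the stack trace above and
    // reports either the resolved name/address or the failure message.
    static String describe() {
        try {
            InetAddress local = InetAddress.getLocalHost();
            return "OK: " + local.getHostName() + " -> " + local.getHostAddress();
        } catch (UnknownHostException e) {
            // Same exception type the DataNode logs before shutting down.
            return "FAILED: " + e.getMessage();
        }
    }

    public static void main(String[] args) {
        System.out.println(describe());
    }
}
```

Running it on the affected node and on a healthy node should show whether the local hostname resolves on one but not the other, which would point to the DNS setup rather than to HDFS itself.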