nchammas / flintrock

A command-line tool for launching Apache Spark clusters.
Apache License 2.0

HDFS Data nodes from slaves can't connect to the name node #206

Closed pilgrimkst closed 7 years ago

pilgrimkst commented 7 years ago

Hi, I have problems working with HDFS. In order to access HDFS, I had to start the data nodes manually (cd /home/ec2-user/hadoop && sbin/hadoop-daemon.sh start datanode). But even after I start the datanodes on all instances (both master and slaves), only one datanode is registered, the one running on the master (i.e. locally with the namenode). There is connectivity between the nodes, but I get the following error message in both the datanode and namenode logs:

2017-07-26 16:40:40,493 INFO  [IPC Server handler 5 on 9000] ipc.Server (Server.java:run(2070)) - IPC Server handler 5 on 9000, call org.apache.hadoop.hdfs.server.protocol.DatanodeProtocol.registerDatanode from 172.33.9.26:59572 Call#209 Retry#0
org.apache.hadoop.hdfs.server.protocol.DisallowedDatanodeException: Datanode denied communication with namenode because hostname cannot be resolved (ip=172.33.9.26, hostname=172.33.9.26): DatanodeRegistration(0.0.0.0:50010, datanodeUuid=39d7a850-7e68-4d9f-a5be-becb81af752f, infoPort=50075, infoSecurePort=0, ipcPort=50020, storageInfo=lv=-56;cid=CID-94f28171-a765-4610-8fd9-3b6a8be02383;nsid=280054336;c=0)
    at org.apache.hadoop.hdfs.server.blockmanagement.DatanodeManager.registerDatanode(DatanodeManager.java:873)
    at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.registerDatanode(FSNamesystem.java:4529)
    at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.registerDatanode(NameNodeRpcServer.java:1286)
    at org.apache.hadoop.hdfs.protocolPB.DatanodeProtocolServerSideTranslatorPB.registerDatanode(DatanodeProtocolServerSideTranslatorPB.java:96)
    at org.apache.hadoop.hdfs.protocol.proto.DatanodeProtocolProtos$DatanodeProtocolService$2.callBlockingMethod(DatanodeProtocolProtos.java:28752)
    at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:616)
    at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:982)
    at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2049)
    at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2045)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:422)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1698)
    at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2043)

Here 172.33.9.26 is the internal IP of one of my slaves (errors are logged from all slaves; I just added one for reference).
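
As a side note on the error itself: the DisallowedDatanodeException above says the namenode could not resolve the datanode's IP to a hostname. A rough way to check that from the master, using the slave IP from the log above (a generic diagnostic sketch, not a Flintrock-specific fix):

# Run on the master (namenode). Check whether the slave's private IP
# reverse-resolves to an EC2 internal hostname:
getent hosts 172.33.9.26

# Check basic reachability of the datanode ports shown in the log (50010/50020):
nc -zv 172.33.9.26 50010
nc -zv 172.33.9.26 50020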

nchammas commented 7 years ago

In order to access HDFS, I had to start the data nodes manually (cd /home/ec2-user/hadoop && sbin/hadoop-daemon.sh start datanode)

Hmm, why do you need to do this? Flintrock should start up HDFS automatically for you as long as you specify --install-hdfs (or the equivalent in config.yaml).
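
For example, something like this should bring HDFS up as part of the launch (a minimal sketch; the cluster name and key details are placeholders, and the remaining EC2 settings are assumed to come from config.yaml):

# Launch a cluster with HDFS installed and started automatically:
flintrock launch test-cluster \
    --install-hdfs \
    --num-slaves 2 \
    --ec2-key-name my-key \
    --ec2-identity-file ~/.ssh/my-key.pem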

Also, does your VPC have an Internet Gateway attached? Flintrock does not currently support private VPCs (#14).
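
If it helps, one quick way to check for an attached Internet Gateway with the AWS CLI (a sketch; replace vpc-xxxxxxxx with the VPC your cluster launches into):

# List any Internet Gateways attached to the VPC:
aws ec2 describe-internet-gateways \
    --filters Name=attachment.vpc-id,Values=vpc-xxxxxxxx

If that returns no gateways, the VPC is effectively private, which Flintrock does not currently support (#14).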

pilgrimkst commented 7 years ago

Yes, I am installing HDFS with

launch:
  install-hdfs: True

I will check the VPC settings and write back.

pilgrimkst commented 7 years ago

This is my Flintrock config: https://gist.github.com/pilgrimkst/204b000e195e543d54a159cebed63168 I also want to mention that all the Spark workers are initialized.

pilgrimkst commented 7 years ago

OK, I think I found the issue: I removed one security group and it started to work, so I am closing this issue.

nchammas commented 7 years ago

Ah, so one of the additional security groups you had configured on launch was interfering with Flintrock?

pilgrimkst commented 7 years ago

Yeah, we had two Flintrock clusters, and I saw a security group named flintrock on the instances, so I thought it would be a good idea to add it. But I guess that one was preventing Flintrock from creating its own flintrock group.
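
For anyone hitting the same thing, a rough way to see which security groups are actually attached to the cluster's instances and which flintrock groups exist (an AWS CLI sketch; the cluster name in the tag filter is a placeholder, so adjust it to your setup):

# Security groups attached to the cluster's instances ("my-cluster" is a placeholder):
aws ec2 describe-instances \
    --filters Name=tag:Name,Values="my-cluster-*" \
    --query 'Reservations[].Instances[].[PrivateIpAddress, SecurityGroups[].GroupName]'

# Security groups whose name starts with "flintrock":
aws ec2 describe-security-groups \
    --filters Name=group-name,Values="flintrock*" \
    --query 'SecurityGroups[].[GroupId, GroupName]'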