big-data-europe / docker-hadoop

Apache Hadoop docker image

Cannot write HDFS from Java #98

Open nilesim opened 3 years ago

nilesim commented 3 years ago

I used this repository to set up a Hadoop cluster in Docker, following this tutorial. I changed the docker-compose file to this because the versions in the tutorial are outdated.

These are the changes I made: https://github.com/big-data-europe/docker-hadoop/compare/master...nilesim:master

Now it is up and running and I can monitor it at http://localhost:9870/; all 3 datanodes are up and running.

But when I try to write to it from Java with the code below, it throws an error: the directory and file are created, but nothing is written to the file:

import java.io.BufferedWriter;
import java.io.IOException;
import java.io.OutputStreamWriter;
import java.nio.charset.StandardCharsets;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public static void writeFileToHDFS() throws IOException {
    Configuration configuration = new Configuration();
    // The namenode RPC port 9000 is mapped to localhost in the docker-compose file.
    configuration.set("fs.defaultFS", "hdfs://localhost:9000");
    FileSystem fileSystem = FileSystem.get(configuration);
    // Create a path
    String fileName = "read_write_hdfs_example.txt";
    Path hdfsWritePath = new Path("/user/javadeveloperzone/javareadwriteexample/" + fileName);
    FSDataOutputStream fsDataOutputStream = fileSystem.create(hdfsWritePath, true);
    BufferedWriter bufferedWriter = new BufferedWriter(new OutputStreamWriter(fsDataOutputStream, StandardCharsets.UTF_8));
    bufferedWriter.write("Java API to write data in HDFS");
    bufferedWriter.newLine();
    bufferedWriter.close();
    fileSystem.close();
}
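
For what it's worth, a quick way to check from the same client whether any bytes actually made it into the file is to ask HDFS for its length afterwards. A minimal sketch, reusing the same configuration and path as above:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

// Sketch: report the length HDFS sees for the file written above.
Configuration configuration = new Configuration();
configuration.set("fs.defaultFS", "hdfs://localhost:9000");
try (FileSystem fileSystem = FileSystem.get(configuration)) {
    Path path = new Path("/user/javadeveloperzone/javareadwriteexample/read_write_hdfs_example.txt");
    FileStatus status = fileSystem.getFileStatus(path);
    System.out.println(status.getPath() + " has length " + status.getLen()); // prints 0 here, matching the UI
}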

Repository: https://github.com/nsquare-jdzone/hadoop-examples/tree/master/ReadWriteHDFSExample (from this tutorial: https://javadeveloperzone.com/hadoop/java-read-write-files-hdfs-example/)

and I verified at http://localhost:9870/explorer.html#/ that the directory and file are created when the code runs, but the file size is 0.

The Hadoop Docker error logs are:

namenode           | 2021-02-11 06:46:56,941 INFO namenode.FSEditLog: Number of transactions: 50 Total time for transactions(ms): 53 Number of transactions batched in Syncs:
106 Number of syncs: 44 SyncTimes(ms): 184
namenode           | 2021-02-11 06:46:59,420 INFO hdfs.StateChange: BLOCK* allocate blk_1073741849_1025, replicas=172.18.0.6:9866, 172.18.0.5:9866, 172.18.0.4:9866 for /user/javadeveloperzone/javareadwriteexample/read_write_hdfs_selin.txt
namenode           | 2021-02-11 06:47:01,223 WARN blockmanagement.BlockPlacementPolicy: Failed to place enough replicas, still in need of 1 to reach 3 (unavailableStorages=[], storagePolicy=BlockStoragePolicy{HOT:7, storageTypes=[DISK], creationFallbacks=[], replicationFallbacks=[ARCHIVE]}, newBlock=true) For more information, please enable DEBUG log level on org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicy and org.apache.hadoop.net.NetworkTopology
namenode           | 2021-02-11 06:47:01,223 WARN protocol.BlockStoragePolicy: Failed to place enough replicas: expected size is 1 but only 0 storage types can be selected (replication=3, selected=[], unavailable=[DISK], removed=[DISK], policy=BlockStoragePolicy{HOT:7, storageTypes=[DISK], creationFallbacks=[], replicationFallbacks=[ARCHIVE]})
namenode           | 2021-02-11 06:47:01,223 WARN blockmanagement.BlockPlacementPolicy: Failed to place enough replicas, still in need of 1 to reach 3 (unavailableStorages=[DISK], storagePolicy=BlockStoragePolicy{HOT:7, storageTypes=[DISK], creationFallbacks=[], replicationFallbacks=[ARCHIVE]}, newBlock=true) All required storage types are unavailable:  unavailableStorages=[DISK], storagePolicy=BlockStoragePolicy{HOT:7, storageTypes=[DISK], creationFallbacks=[], replicationFallbacks=[ARCHIVE]}
namenode           | 2021-02-11 06:47:01,224 INFO hdfs.StateChange: BLOCK* allocate blk_1073741850_1026, replicas=172.18.0.5:9866, 172.18.0.4:9866 for /user/javadeveloperzone/javareadwriteexample/read_write_hdfs_selin.txt
namenode           | 2021-02-11 06:47:24,009 WARN blockmanagement.BlockPlacementPolicy: Failed to place enough replicas, still in need of 2 to reach 3 (unavailableStorages=[], storagePolicy=BlockStoragePolicy{HOT:7, storageTypes=[DISK], creationFallbacks=[], replicationFallbacks=[ARCHIVE]}, newBlock=true) For more information, please enable DEBUG log level on org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicy and org.apache.hadoop.net.NetworkTopology
namenode           | 2021-02-11 06:47:24,011 WARN protocol.BlockStoragePolicy: Failed to place enough replicas: expected size is 2 but only 0 storage types can be selected (replication=3, selected=[], unavailable=[DISK], removed=[DISK, DISK], policy=BlockStoragePolicy{HOT:7, storageTypes=[DISK], creationFallbacks=[], replicationFallbacks=[ARCHIVE]})
namenode           | 2021-02-11 06:47:24,013 WARN blockmanagement.BlockPlacementPolicy: Failed to place enough replicas, still in need of 2 to reach 3 (unavailableStorages=[DISK], storagePolicy=BlockStoragePolicy{HOT:7, storageTypes=[DISK], creationFallbacks=[], replicationFallbacks=[ARCHIVE]}, newBlock=true) All required storage types are unavailable:  unavailableStorages=[DISK], storagePolicy=BlockStoragePolicy{HOT:7, storageTypes=[DISK], creationFallbacks=[], replicationFallbacks=[ARCHIVE]}
namenode           | 2021-02-11 06:47:24,014 INFO hdfs.StateChange: BLOCK* allocate blk_1073741851_1027, replicas=172.18.0.4:9866 for /user/javadeveloperzone/javareadwriteexample/read_write_hdfs_selin.txt
namenode           | 2021-02-11 06:47:46,800 WARN blockmanagement.BlockPlacementPolicy: Failed to place enough replicas, still in need of 3 to reach 3 (unavailableStorages=[], storagePolicy=BlockStoragePolicy{HOT:7, storageTypes=[DISK], creationFallbacks=[], replicationFallbacks=[ARCHIVE]}, newBlock=true) For more information, please enable DEBUG log level on org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicy and org.apache.hadoop.net.NetworkTopology
namenode           | 2021-02-11 06:47:46,800 WARN protocol.BlockStoragePolicy: Failed to place enough replicas: expected size is 3 but only 0 storage types can be selected (replication=3, selected=[], unavailable=[DISK], removed=[DISK, DISK, DISK], policy=BlockStoragePolicy{HOT:7, storageTypes=[DISK], creationFallbacks=[], replicationFallbacks=[ARCHIVE]})
namenode           | 2021-02-11 06:47:46,800 WARN blockmanagement.BlockPlacementPolicy: Failed to place enough replicas, still in need of 3 to reach 3 (unavailableStorages=[DISK], storagePolicy=BlockStoragePolicy{HOT:7, storageTypes=[DISK], creationFallbacks=[], replicationFallbacks=[ARCHIVE]}, newBlock=true) All required storage types are unavailable:  unavailableStorages=[DISK], storagePolicy=BlockStoragePolicy{HOT:7, storageTypes=[DISK], creationFallbacks=[], replicationFallbacks=[ARCHIVE]}
namenode           | 2021-02-11 06:47:46,814 INFO ipc.Server: IPC Server handler 2 on default port 9000, call Call#8 Retry#0 org.apache.hadoop.hdfs.protocol.ClientProtocol.addBlock from 172.18.0.1:44376
namenode           | java.io.IOException: File /user/javadeveloperzone/javareadwriteexample/read_write_hdfs_selin.txt could only be written to 0 of the 1 minReplication nodes. There are 3 datanode(s) running and 3 node(s) are excluded in this operation.
namenode           |    at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.chooseTarget4NewBlock(BlockManager.java:2219)
namenode           |    at org.apache.hadoop.hdfs.server.namenode.FSDirWriteFileOp.chooseTargetForNewBlock(FSDirWriteFileOp.java:294)
namenode           |    at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:2789)
namenode           |    at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.addBlock(NameNodeRpcServer.java:892)
namenode           |    at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.addBlock(ClientNamenodeProtocolServerSideTranslatorPB.java:574)
namenode           |    at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
namenode           |    at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:528)
namenode           |    at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1070)
namenode           |    at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:999)
namenode           |    at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:927)
namenode           |    at java.security.AccessController.doPrivileged(Native Method)
namenode           |    at javax.security.auth.Subject.doAs(Subject.java:422)
namenode           |    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1730)
namenode           |    at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2915)

and the Java errors are:

Exception in thread "main" org.apache.hadoop.ipc.RemoteException(java.io.IOException): File /user/javadeveloperzone/javareadwriteexample/read_write_hdfs_selin.txt could only be written to 0 of the 1 minReplication nodes. There are 3 datanode(s) running and 3 node(s) are excluded in this operation.
    at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.chooseTarget4NewBlock(BlockManager.java:2219)
    at org.apache.hadoop.hdfs.server.namenode.FSDirWriteFileOp.chooseTargetForNewBlock(FSDirWriteFileOp.java:294)
    at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:2789)
    at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.addBlock(NameNodeRpcServer.java:892)
    at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.addBlock(ClientNamenodeProtocolServerSideTranslatorPB.java:574)
    at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
    at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:528)
    at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1070)
    at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:999)
    at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:927)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:422)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1730)
    at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2915)

    at org.apache.hadoop.ipc.Client.getRpcResponse(Client.java:1497)
    at org.apache.hadoop.ipc.Client.call(Client.java:1443)
    at org.apache.hadoop.ipc.Client.call(Client.java:1353)
    at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:228)
    at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:116)
    at com.sun.proxy.$Proxy11.addBlock(Unknown Source)
    at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.addBlock(ClientNamenodeProtocolTranslatorPB.java:510)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:422)
    at org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invokeMethod(RetryInvocationHandler.java:165)
    at org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invoke(RetryInvocationHandler.java:157)
    at org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invokeOnce(RetryInvocationHandler.java:95)
    at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:359)
    at com.sun.proxy.$Proxy12.addBlock(Unknown Source)
    at org.apache.hadoop.hdfs.DFSOutputStream.addBlock(DFSOutputStream.java:1078)
    at org.apache.hadoop.hdfs.DataStreamer.locateFollowingBlock(DataStreamer.java:1865)
    at org.apache.hadoop.hdfs.DataStreamer.nextBlockOutputStream(DataStreamer.java:1668)
    at org.apache.hadoop.hdfs.DataStreamer.run(DataStreamer.java:716)
BDubas commented 3 years ago

I assume this docker-compose does not work on Mac. I have the same issue. It looks like the HDFS client attempts to connect to the datanode via its internal Docker IP address, and that fails because "Docker Desktop for Mac can’t route traffic to containers". If you exec into the namenode and try any action that requires interaction with a datanode, it should succeed, since that IP is reachable from inside the Docker network.
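
If it helps to confirm that, the addresses the client is actually handed can be printed with the block-location API. A rough sketch (the file path is only an illustration and needs to point at a file that already contains data, e.g. something written from inside the namenode container):

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.BlockLocation;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

// Sketch: list the datanode addresses the namenode hands out for a file's blocks.
// With this compose setup they come back as 172.18.0.x container IPs, which the
// host cannot reach -- matching the behaviour described above.
Configuration conf = new Configuration();
conf.set("fs.defaultFS", "hdfs://localhost:9000");
try (FileSystem fs = FileSystem.get(conf)) {
    Path p = new Path("/some/existing/file.txt"); // hypothetical path
    FileStatus status = fs.getFileStatus(p);
    for (BlockLocation location : fs.getFileBlockLocations(status, 0, status.getLen())) {
        System.out.println(String.join(", ", location.getNames())); // e.g. 172.18.0.6:9866
    }
}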

nilesim commented 3 years ago

No, it is not related to Mac, but you are right that it is about accessibility from outside Docker. I am on a Windows work laptop, and I use docker-compose all the time in other projects to set up dev environments.

The problem seems to be that this particular docker-compose file does not expose the datanode ports (or something along those lines). I tried opening Hadoop default ports like 50070:50070, but somehow it did not work, and I am kindly asking if anyone knows how to configure it so the cluster is reachable from my local machine. For development I need a Hadoop cluster that is up, running, and open for HDFS writes.

nilesim commented 3 years ago

I already shared my GitHub repo at the beginning, but I am also adding the docker-compose file I used here, in case anyone can spot what is wrong with the ports:

version: "3"

services:
  namenode:
    image: bde2020/hadoop-namenode:2.0.0-hadoop3.2.1-java8
    container_name: namenode
    restart: always
    ports:
      - 9870:9870
      - 9000:9000
    volumes:
      - hadoop_namenode:/hadoop/dfs/name
    environment:
      - CLUSTER_NAME=test
    env_file:
      - ./hadoop.env

  datanode1:
    image: bde2020/hadoop-datanode:2.0.0-hadoop3.2.1-java8
    container_name: datanode1
    restart: always
    volumes:
      - hadoop_datanode1:/hadoop/dfs/data
    environment:
      SERVICE_PRECONDITION: "namenode:9870"
    env_file:
      - ./hadoop.env

  datanode2:
    image: bde2020/hadoop-datanode:2.0.0-hadoop3.2.1-java8
    container_name: datanode2
    restart: always
    volumes:
      - hadoop_datanode2:/hadoop/dfs/data
    environment:
      SERVICE_PRECONDITION: "namenode:9870"
    env_file:
      - ./hadoop.env

  datanode3:
    image: bde2020/hadoop-datanode:2.0.0-hadoop3.2.1-java8
    container_name: datanode3
    restart: always
    volumes:
      - hadoop_datanode3:/hadoop/dfs/data
    environment:
      SERVICE_PRECONDITION: "namenode:9870"
    env_file:
      - ./hadoop.env

  resourcemanager:
    image: bde2020/hadoop-resourcemanager:2.0.0-hadoop3.2.1-java8
    container_name: resourcemanager
    restart: always
    environment:
      SERVICE_PRECONDITION: "namenode:9000 namenode:9870 datanode1:9864 datanode2:9864"
    env_file:
      - ./hadoop.env

  nodemanager1:
    image: bde2020/hadoop-nodemanager:2.0.0-hadoop3.2.1-java8
    container_name: nodemanager
    restart: always
    environment:
      SERVICE_PRECONDITION: "namenode:9000 namenode:9870 datanode1:9864 datanode2:9864 resourcemanager:8088"
    env_file:
      - ./hadoop.env

  historyserver:
    image: bde2020/hadoop-historyserver:2.0.0-hadoop3.2.1-java8
    container_name: historyserver
    restart: always
    environment:
      SERVICE_PRECONDITION: "namenode:9000 namenode:9870 datanode1:9864 datanode2:9864 resourcemanager:8088"
    volumes:
      - hadoop_historyserver:/hadoop/yarn/timeline
    env_file:
      - ./hadoop.env

volumes:
  hadoop_namenode:
  hadoop_datanode1:
  hadoop_datanode2:
  hadoop_datanode3:
  hadoop_historyserver:
nilesim commented 3 years ago

I added this port to the namenode ports: - 50070:50070 and the following to datanode1:

    ports: 
      - 50010:50010

That gave me a docker: error response from daemon: Ports are not available: listen tcp 0.0.0.0/50070: bind: An attempt was made to access a socket in a way forbidden by its access permissions. error, so I ran the following commands and the ports are open now:

net stop winnat
docker-compose up
net start winnat

However, the error continues when I try to write to HDFS through Java:

namenode           | 2021-02-17 13:15:03,719 WARN blockmanagement.BlockPlacementPolicy: Failed to place enough replicas, still in need of 3 to reach 3 (unavailableStorages=[DISK], storagePolicy=BlockStoragePolicy{HOT:7, storageTypes=[DISK], creationFallbacks=[], replicationFallbacks=[ARCHIVE]}, newBlock=true) All required storage types are unavailable:  unavailableStorages=[DISK], storagePolicy=BlockStoragePolicy{HOT:7, storageTypes=[DISK], creationFallbacks=[], replicationFallbacks=[ARCHIVE]}
namenode           | 2021-02-17 13:15:03,721 INFO ipc.Server: IPC Server handler 8 on default port 9000, call Call#8 Retry#0 org.apache.hadoop.hdfs.protocol.ClientProtocol.addBlock from 172.18.0.1:36774
namenode           | java.io.IOException: File /user/javadeveloperzone/javareadwriteexample/read_write_hdfs_selin.txt could only be written to 0 of the 1 minReplication nodes. There are 3 datanode(s) running and 3 node(s) are excluded in this operation.
namenode           |    at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.chooseTarget4NewBlock(BlockManager.java:2219)
namenode           |    at org.apache.hadoop.hdfs.server.namenode.FSDirWriteFileOp.chooseTargetForNewBlock(FSDirWriteFileOp.java:294)
namenode           |    at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:2789)
namenode           |    at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.addBlock(NameNodeRpcServer.java:892)
namenode           |    at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.addBlock(ClientNamenodeProtocolServerSideTranslatorPB.java:574)
namenode           |    at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
namenode           |    at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:528)
namenode           |    at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1070)
namenode           |    at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:999)
namenode           |    at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:927)
namenode           |    at java.security.AccessController.doPrivileged(Native Method)
namenode           |    at javax.security.auth.Subject.doAs(Subject.java:422)
namenode           |    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1730)
namenode           |    at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2915)
johanneshiry commented 3 years ago

I second @nilesim on this. Same issue here when I try to set up a local Hadoop cluster.

xrafhue commented 3 years ago

I have the same issue. I found this question on Stack Overflow, but its solution doesn't work for me:

https://stackoverflow.com/questions/45276427/unable-connect-to-docker-container-outside-docker-host

In docker-compose I added these ports. Note: in my case I cannot use port 9000 because it is already in use on my workstation, so I use port 9010 instead:

[...]
  namenode:
[...]
    ports:
      - 9870:9870
      - 9010:9000
      - 50070:50070
[...]  
  datanode:
    hostname: datanode.company.com
  [...] 
    ports:
      - "9864:9864"
      - "50010:50010"
      - "50020:50020"

In hadoop.env I added two variables, dfs_client_use_datanode_hostname and dfs_datanode_use_datanode_hostname:

CORE_CONF_fs_defaultFS=hdfs://namenode:9000
CORE_CONF_hadoop_http_staticuser_user=root
CORE_CONF_hadoop_proxyuser_hue_hosts=*
CORE_CONF_hadoop_proxyuser_hue_groups=*
CORE_CONF_io_compression_codecs=org.apache.hadoop.io.compress.SnappyCodec
CORE_CONF_dfs_client_use_datanode_hostname=true
CORE_CONF_dfs_datanode_use_datanode_hostname=true

HDFS_CONF_dfs_webhdfs_enabled=true
HDFS_CONF_dfs_permissions_enabled=false
HDFS_CONF_dfs_namenode_datanode_registration_ip___hostname___check=false
HDFS_CONF_dfs_client_use_datanode_hostname=true
HDFS_CONF_dfs_datanode_use_datanode_hostname=true
[...]

In my C:\Windows\System32\drivers\etc\hosts

127.0.0.1 namenode
127.0.0.1 datanode
127.0.0.1 datanode.company.com

In the namenode I run this command to leave safe mode:

hadoop dfsadmin -safemode leave

On my client I have this configuration:

core-site.xml
<configuration>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://localhost:9010</value>
  </property>
  <property>
    <name>dfs.client.use.datanode.hostname</name>
    <value>true</value>
  </property>
  <property>
    <name>dfs.datanode.use.datanode.hostname</name>
    <value>true</value>
  </property>
</configuration>

hdfs-site.xml
<configuration>
  <property>
    <name>dfs.client.use.datanode.hostname</name>
    <value>true</value>
  </property>
  <property>
    <name>dfs.datanode.use.datanode.hostname</name>
    <value>true</value>
  </property>
</configuration>

I get the error with this command:

hdfs dfs -copyFromLocal test.txt hdfs://localhost:9010/

And the error is:

2021-03-14 10:45:47,412 INFO hdfs.DataStreamer: Exception in createBlockOutputStream blk_1073741883_1059
java.net.ConnectException: Connection refused: no further information
        at java.base/sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
        at java.base/sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:779)
        at org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:206)
        at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:531)
        at org.apache.hadoop.hdfs.DataStreamer.createSocketForPipeline(DataStreamer.java:253)
        at org.apache.hadoop.hdfs.DataStreamer.createBlockOutputStream(DataStreamer.java:1725)
        at org.apache.hadoop.hdfs.DataStreamer.nextBlockOutputStream(DataStreamer.java:1679)
        at org.apache.hadoop.hdfs.DataStreamer.run(DataStreamer.java:716)
2021-03-14 10:45:47,422 WARN hdfs.DataStreamer: Abandoning BP-1490874927-172.20.0.9-1615649274799:blk_1073741883_1059
2021-03-14 10:45:47,475 WARN hdfs.DataStreamer: Excluding datanode DatanodeInfoWithStorage[172.19.0.8:9866,DS-034b55e3-ae2c-45b9-9eb0-25adaed2251d,DISK]
2021-03-14 10:45:47,529 WARN hdfs.DataStreamer: DataStreamer Exception
org.apache.hadoop.ipc.RemoteException(java.io.IOException): File /test.txt._COPYING_ could only be written to 0 of the 1 minReplication nodes. There are 1 datanode(s) running and 1 node(s) are excluded in this operation.
        at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.chooseTarget4NewBlock(BlockManager.java:2219)
        at org.apache.hadoop.hdfs.server.namenode.FSDirWriteFileOp.chooseTargetForNewBlock(FSDirWriteFileOp.java:294)
        at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:2789)
        at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.addBlock(NameNodeRpcServer.java:892)
        at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.addBlock(ClientNamenodeProtocolServerSideTranslatorPB.java:574)
        at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
        at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:528)
        at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1070)
        at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:999)
        at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:927)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:422)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1730)
        at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2915)

        at org.apache.hadoop.ipc.Client.getRpcResponse(Client.java:1497)
        at org.apache.hadoop.ipc.Client.call(Client.java:1443)
        at org.apache.hadoop.ipc.Client.call(Client.java:1353)
        at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:228)
        at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:116)
        at com.sun.proxy.$Proxy14.addBlock(Unknown Source)
        at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.addBlock(ClientNamenodeProtocolTranslatorPB.java:510)
        at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
        at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.base/java.lang.reflect.Method.invoke(Method.java:566)
        at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:422)
        at org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invokeMethod(RetryInvocationHandler.java:165)
        at org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invoke(RetryInvocationHandler.java:157)
        at org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invokeOnce(RetryInvocationHandler.java:95)
        at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:359)
        at com.sun.proxy.$Proxy15.addBlock(Unknown Source)
        at org.apache.hadoop.hdfs.DFSOutputStream.addBlock(DFSOutputStream.java:1078)
        at org.apache.hadoop.hdfs.DataStreamer.locateFollowingBlock(DataStreamer.java:1865)
        at org.apache.hadoop.hdfs.DataStreamer.nextBlockOutputStream(DataStreamer.java:1668)
        at org.apache.hadoop.hdfs.DataStreamer.run(DataStreamer.java:716)
copyFromLocal: File /test.txt._COPYING_ could only be written to 0 of the 1 minReplication nodes. There are 1 datanode(s) running and 1 node(s) are excluded in this operation.

And in my Java app I get this error:

org.apache.hadoop.ipc.RemoteException(java.io.IOException): File myfile.parquet/_temporary/0/_temporary/attempt_20210314103844_0037_m_000000_0/part-00000-0736368e-e16b-4d87-b347-0d2e8a63b646-c000.snappy.parquet could only be written to 0 of the 1 minReplication nodes. There are 1 datanode(s) running and 1 node(s) are excluded in this operation.
at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.chooseTarget4NewBlock(BlockManager.java:2219)
at org.apache.hadoop.hdfs.server.namenode.FSDirWriteFileOp.chooseTargetForNewBlock(FSDirWriteFileOp.java:294)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:2789)
at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.addBlock(NameNodeRpcServer.java:892)
at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.addBlock(ClientNamenodeProtocolServerSideTranslatorPB.java:574)
at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:528)
at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1070)
at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:999)
at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:927)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1730)
at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2915)
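
For anyone who prefers to do this from Java instead of dropping core-site.xml / hdfs-site.xml on the classpath, the same client-side settings can be applied programmatically. A small sketch, assuming the 9010:9000 port mapping and the hosts entries shown above:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;

// Sketch: programmatic equivalent of the client core-site.xml / hdfs-site.xml above.
Configuration conf = new Configuration();
conf.set("fs.defaultFS", "hdfs://localhost:9010");          // 9010 is the remapped namenode RPC port
conf.setBoolean("dfs.client.use.datanode.hostname", true);  // address datanodes by hostname, not container IP
conf.setBoolean("dfs.datanode.use.datanode.hostname", true);
FileSystem fs = FileSystem.get(conf);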
johanneshiry commented 3 years ago

Question to @xrafhue / @nilesim: are you also using macOS? After several hours I think I finally understand why this happens: Docker on macOS cannot route traffic to Docker IPs (see the Docker docs). As a consequence, the whole setup will only work on Linux, not on macOS, because the namenode only sees and hands out the Docker-internal IP (inside the created Docker network), which is then passed to your application.

If my understanding of this issue is correct, the only solutions I see at the moment:

EDIT: One could also think about manually adjusting hosts files and doing local DNS name resolution for the DNs running inside Docker. However, I don't really like this approach ...

xrafhue commented 3 years ago

@johanneshiry: Docker on Windows in my case.

I think your analysis is right. I am sure that if I run my app as an image in Docker on the same network it will work; I will try installing Ubuntu and testing it on Linux.

nilesim commented 3 years ago

@johanneshiry: Docker on Windows here as well. I also ran net stop winnat before docker-compose up to lift the port exclusions, but it still didn't work.

m8928 commented 3 years ago

HDFS clients communicate directly with datanodes when writing files. If you want to work from outside the containers, you need to expose port 9866 and add the hostname of each datanode container to the hosts file of the PC you are working from. The IP for the container hostname can be set to the IP of the actual Docker node.

This is the way I use it in Docker Swarm:

version: '3.3'

services:
  namenode:
    image: namenode-img:latest
    hostname: "namenode-{{.Node.ID}}"
    networks:
      - hadoop
    ports:
      - 9870:9870
      - 9000:9000
    volumes:
      - /data/container_data/namenode:/hadoop/dfs/name
      - /tmp:/tmp
    environment:
      - CLUSTER_NAME=HADOOP
    env_file:
      - ./hadoop.env
    deploy:
      mode: replicated
      replicas: 1
      placement:
        constraints:
          - node.hostname == namenode
      restart_policy:
        condition: on-failure

  datanode:
    image: datanode-img:latest
    hostname: "datanode-{{.Node.ID}}"
    networks:
      - hadoop
    volumes:
      - /data/container_data/datanode:/hadoop/dfs/data
    environment:
      SERVICE_PRECONDITION: "namenode:9870"
    env_file:
      - ./hadoop.env
    ports:
     - target: 9864
       published: 9864
       mode: host
     - target: 9866
       published: 9866
       mode: host
    deploy:
      mode: global
      restart_policy:
        condition: on-failure
...
#!/bin/bash

# Collect the swarm leader's node ID, its IP, and the IDs of all nodes.
LEADER_NODE=`docker node ls | grep Leader | awk '{ print $1 }' | egrep -v ^ID$`
LEADER_SERV_IP=`docker node inspect $LEADER_NODE | grep "Addr" | awk '{ print $2}' | grep -v ":" | sed 's/\"//g'`
NODE_LIST=`docker node ls | awk '{ print $1 }' | egrep -v ^ID$`

# Print hosts-file entries mapping each node's IP to the namenode-/datanode-<node-id>
# hostnames generated by the compose file above.
echo -e "$LEADER_SERV_IP\tnamenode-$LEADER_NODE"

for NODE_NAME in $NODE_LIST; do
    SERV_IP=`docker node inspect $NODE_NAME | grep "Addr" | awk '{ print $2}' | grep -v ":" | sed 's/\"//g'`
    echo -e "$SERV_IP\tdatanode-$NODE_NAME"
done
vaibhavshn commented 3 years ago

Hi, same issue here.

I cannot upload a file from the WebHDFS interface, nor with pyhdfs.

The interface just says it cannot upload the file.

The likely reason, judging from the requests made and the responses, is that it makes a request to:

http://f15e00af8d79:9864/webhdfs/v1/user/root/file.txt?op=CREATE&namenoderpcaddress=namenode:9000&createflag=&createparent=true&overwrite=false

The host mentioned doesn't exist.

Same issue when trying to create a file with pyhdfs.

fs.create('/user/root/shared/abc.txt','abc')

The above call gives errors too, since it tries to upload the file to the same kind of URL and that host doesn't exist:

requests.exceptions.ConnectionError: HTTPConnectionPool(host='f15e00af8d79', port=9864): Max retries exceeded with url: /webhdfs/v1/user/root/shared/abc.txt?op=CREATE&user.name=root&namenoderpcaddress=namenode:9000&createflag=&createparent=true&overwrite=false (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7fda80aae9a0>: Failed to establish a new connection: [Errno -2] Name or service not known'))
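
The container hostname comes from the WebHDFS two-step CREATE: the namenode answers the first request with a redirect whose Location header points at a datanode, and with this compose file that address is the container ID/hostname, which the host machine cannot resolve. A small sketch to inspect that redirect (path and user are illustrative; 9870 is the namenode UI port mapped in the compose file):

import java.io.IOException;
import java.net.HttpURLConnection;
import java.net.URL;

public class WebHdfsRedirectCheck {
    public static void main(String[] args) throws IOException {
        // Sketch: perform only the first step of the WebHDFS CREATE handshake and
        // print the redirect target instead of following it.
        URL url = new URL("http://localhost:9870/webhdfs/v1/user/root/shared/abc.txt"
                + "?op=CREATE&user.name=root&overwrite=false");
        HttpURLConnection conn = (HttpURLConnection) url.openConnection();
        conn.setRequestMethod("PUT");
        conn.setInstanceFollowRedirects(false);
        // Expect a 307 whose Location host is the datanode container hostname
        // (e.g. f15e00af8d79:9864) -- exactly the hostname pyhdfs then fails to resolve.
        System.out.println(conn.getResponseCode() + " -> " + conn.getHeaderField("Location"));
        conn.disconnect();
    }
}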
BennettChina commented 2 years ago

I used @xrafhue's approach, and it resolved the issue for me.

This is my docker-compose.yml:

version: "3"

services:
  namenode:
    image: bde2020/hadoop-namenode:2.0.0-hadoop3.2.1-java8
    container_name: namenode
    restart: always
    ports:
      - 9870:9870
      - 9000:9000
    volumes:
      - ./hadoop/dfs/name:/hadoop/dfs/name
    environment:
      - CLUSTER_NAME=test
    env_file:
      - ./hadoop.env

  datanode1:
    image: bde2020/hadoop-datanode:2.0.0-hadoop3.2.1-java8
    container_name: datanode1
    hostname: datanode1
    restart: always
    ports:
      - 9866:9866
      - 9864:9864
    depends_on:
      - namenode
    volumes:
      - ./hadoop/dfs/data1:/hadoop/dfs/data
    environment:
      SERVICE_PRECONDITION: "namenode:9870"
    env_file:
      - ./hadoop.env
  datanode2:
    image: bde2020/hadoop-datanode:2.0.0-hadoop3.2.1-java8
    container_name: datanode2
    hostname: datanode2
    restart: always
    ports:
      - 9867:9866
      - 9865:9864
    depends_on:
      - namenode
    volumes:
      - ./hadoop/dfs/data2:/hadoop/dfs/data
    environment:
      SERVICE_PRECONDITION: "namenode:9870"
    env_file:
      - ./hadoop.env
  datanode3:
    image: bde2020/hadoop-datanode:2.0.0-hadoop3.2.1-java8
    container_name: datanode3
    hostname: datanode3
    restart: always
    ports:
      - 9868:9866
      - 9863:9864
    depends_on:
      - namenode
    volumes:
      - ./hadoop/dfs/data3:/hadoop/dfs/data
    environment:
      SERVICE_PRECONDITION: "namenode:9870"
    env_file:
      - ./hadoop.env

  resourcemanager:
    image: bde2020/hadoop-resourcemanager:2.0.0-hadoop3.2.1-java8
    container_name: resourcemanager
    restart: always
    environment:
      SERVICE_PRECONDITION: "namenode:9000 namenode:9870 datanode1:9864 datanode2:9864 datanode3:9864 datanode1:9866 datanode2:9866 datanode3:9866"
    env_file:
      - ./hadoop.env

  nodemanager1:
    image: bde2020/hadoop-nodemanager:2.0.0-hadoop3.2.1-java8
    container_name: nodemanager
    restart: always
    environment:
      SERVICE_PRECONDITION: "namenode:9000 namenode:9870 datanode1:9864 datanode2:9864 datanode3:9864 datanode1:9866 datanode2:9866 datanode3:9866 resourcemanager:8088"
    env_file:
      - ./hadoop.env

  historyserver:
    image: bde2020/hadoop-historyserver:2.0.0-hadoop3.2.1-java8
    container_name: historyserver
    restart: always
    environment:
      SERVICE_PRECONDITION: "namenode:9000 namenode:9870 datanode1:9864 datanode2:9864 datanode3:9864 datanode1:9866 datanode2:9866 datanode3:9866 resourcemanager:8088"
    volumes:
      - ./hadoop/yarn/timeline:/hadoop/yarn/timeline
    env_file:
      - ./hadoop.env

This is my hadoop.env:

CORE_CONF_fs_defaultFS=hdfs://namenode:9000
CORE_CONF_hadoop_http_staticuser_user=root
CORE_CONF_hadoop_proxyuser_hue_hosts=*
CORE_CONF_hadoop_proxyuser_hue_groups=*
CORE_CONF_io_compression_codecs=org.apache.hadoop.io.compress.SnappyCodec
CORE_CONF_hadoop_tmp_dir=/hadoop-data
CORE_CONF_dfs_client_use_datanode_hostname=true
CORE_CONF_dfs_datanode_use_datanode_hostname=true

HDFS_CONF_dfs_webhdfs_enabled=true
HDFS_CONF_dfs_permissions_enabled=false
HDFS_CONF_dfs_namenode_datanode_registration_ip___hostname___check=false
HDFS_CONF_dfs_client_use_datanode_hostname=true
HDFS_CONF_dfs_datanode_use_datanode_hostname=true

YARN_CONF_yarn_log___aggregation___enable=true
YARN_CONF_yarn_log_server_url=http://historyserver:8188/applicationhistory/logs/
YARN_CONF_yarn_resourcemanager_recovery_enabled=true
YARN_CONF_yarn_resourcemanager_store_class=org.apache.hadoop.yarn.server.resourcemanager.recovery.FileSystemRMStateStore
YARN_CONF_yarn_resourcemanager_scheduler_class=org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler
YARN_CONF_yarn_scheduler_capacity_root_default_maximum___allocation___mb=8192
YARN_CONF_yarn_scheduler_capacity_root_default_maximum___allocation___vcores=4
YARN_CONF_yarn_resourcemanager_fs_state___store_uri=/rmstate
YARN_CONF_yarn_resourcemanager_system___metrics___publisher_enabled=true
YARN_CONF_yarn_resourcemanager_hostname=resourcemanager
YARN_CONF_yarn_resourcemanager_address=resourcemanager:8032
YARN_CONF_yarn_resourcemanager_scheduler_address=resourcemanager:8030
YARN_CONF_yarn_resourcemanager_resource__tracker_address=resourcemanager:8031
YARN_CONF_yarn_timeline___service_enabled=true
YARN_CONF_yarn_timeline___service_generic___application___history_enabled=true
YARN_CONF_yarn_timeline___service_hostname=historyserver
YARN_CONF_mapreduce_map_output_compress=true
YARN_CONF_mapred_map_output_compress_codec=org.apache.hadoop.io.compress.SnappyCodec
YARN_CONF_yarn_nodemanager_resource_memory___mb=16384
YARN_CONF_yarn_nodemanager_resource_cpu___vcores=8
YARN_CONF_yarn_nodemanager_disk___health___checker_max___disk___utilization___per___disk___percentage=98.5
YARN_CONF_yarn_nodemanager_remote___app___log___dir=/app-logs
YARN_CONF_yarn_nodemanager_aux___services=mapreduce_shuffle

MAPRED_CONF_mapreduce_framework_name=yarn
MAPRED_CONF_mapred_child_java_opts=-Xmx4096m
MAPRED_CONF_mapreduce_map_memory_mb=4096
MAPRED_CONF_mapreduce_reduce_memory_mb=8192
MAPRED_CONF_mapreduce_map_java_opts=-Xmx3072m
MAPRED_CONF_mapreduce_reduce_java_opts=-Xmx6144m
MAPRED_CONF_yarn_app_mapreduce_am_env=HADOOP_MAPRED_HOME=/Users/bennett/dev/hadoop/
MAPRED_CONF_mapreduce_map_env=HADOOP_MAPRED_HOME=/Users/bennett/dev/hadoop/
MAPRED_CONF_mapreduce_reduce_env=HADOOP_MAPRED_HOME=/Users/bennett/dev/hadoop/

and I added the hostnames to /etc/hosts:

127.0.0.1   localhost   namenode datanode1 datanode2 datanode3
::1             localhost namenode datanode1 datanode2 datanode3

This is my Java code (just a quick test; I don't handle the resources):

Configuration conf = new Configuration();
conf.setBoolean("dfs.client.use.datanode.hostname", true);
conf.setBoolean("dfs.datanode.use.datanode.hostname", true);
conf.set("fs.defaultFS", "hdfs://namenode:9000");