apache / hudi

Upserts, Deletes And Incremental Processing on Big Data.
https://hudi.apache.org/
Apache License 2.0

[SUPPORT] Docker Demo: Failed to Connect to namenode #1483

Closed · malanb5 closed this issue 4 years ago

malanb5 commented 4 years ago

Describe the problem you faced

Failed to connect to server: namenode/172.19.0.5:8020: try once and fail when running the ./setup_demo.sh script.

To Reproduce

Steps to reproduce the behavior:

  1. Follow the setup per the Docker Demo
  2. Run the script ./setup_demo.sh

Expected behavior

Connection to the namenode and the successful startup of Hudi.

Environment Description

MacOS: 10.15.4
Docker: version 19.03.8, build afacb8b

Stacktrace

Creating network "compose_default" with the default driver
Creating zookeeper                 ... done
Creating namenode                  ... done
Creating kafkabroker               ... done
Creating hive-metastore-postgresql ... done
Creating hivemetastore             ... done
Creating historyserver             ... done
Creating datanode1                 ... done
Creating presto-coordinator-1      ... done
Creating hiveserver                ... done
Creating sparkmaster               ... done
Creating presto-worker-1           ... done
Creating spark-worker-1            ... done
Creating adhoc-2                   ... done
Creating adhoc-1                   ... done

Copying spark default config and setting up configs
20/04/03 17:48:13 WARN ipc.Client: Failed to connect to server: namenode/172.19.0.5:8020: try once and fail.
java.net.ConnectException: Connection refused
    at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
    at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:717)
    at org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:206)
    at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:531)
    at org.apache.hadoop.ipc.Client$Connection.setupConnection(Client.java:685)
    at org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:788)
    at org.apache.hadoop.ipc.Client$Connection.access$3500(Client.java:410)
    at org.apache.hadoop.ipc.Client.getConnection(Client.java:1550)
    at org.apache.hadoop.ipc.Client.call(Client.java:1381)
    at org.apache.hadoop.ipc.Client.call(Client.java:1345)
    at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:227)
    at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:116)
    at com.sun.proxy.$Proxy10.getFileInfo(Unknown Source)
    at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.getFileInfo(ClientNamenodeProtocolTranslatorPB.java:796)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:409)
    at org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invokeMethod(RetryInvocationHandler.java:163)
    at org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invoke(RetryInvocationHandler.java:155)
    at org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invokeOnce(RetryInvocationHandler.java:95)
    at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:346)
    at com.sun.proxy.$Proxy11.getFileInfo(Unknown Source)
    at org.apache.hadoop.hdfs.DFSClient.getFileInfo(DFSClient.java:1649)
    at org.apache.hadoop.hdfs.DistributedFileSystem$27.doCall(DistributedFileSystem.java:1440)
    at org.apache.hadoop.hdfs.DistributedFileSystem$27.doCall(DistributedFileSystem.java:1437)
    at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
    at org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:1437)
    at org.apache.hadoop.fs.Globber.getFileStatus(Globber.java:64)
    at org.apache.hadoop.fs.Globber.doGlob(Globber.java:269)
    at org.apache.hadoop.fs.Globber.glob(Globber.java:148)
    at org.apache.hadoop.fs.FileSystem.globStatus(FileSystem.java:1686)
    at org.apache.hadoop.fs.shell.PathData.expandAsGlob(PathData.java:326)
    at org.apache.hadoop.fs.shell.Command.expandArgument(Command.java:245)
    at org.apache.hadoop.fs.shell.Command.expandArguments(Command.java:228)
    at org.apache.hadoop.fs.shell.FsCommand.processRawArguments(FsCommand.java:103)
    at org.apache.hadoop.fs.shell.Command.run(Command.java:175)
    at org.apache.hadoop.fs.FsShell.run(FsShell.java:317)
    at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:76)
    at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:90)
    at org.apache.hadoop.fs.FsShell.main(FsShell.java:380)
mkdir: Call From adhoc-1/172.19.0.14 to namenode:8020 failed on connection exception: java.net.ConnectException: Connection refused; For more details see:  http://wiki.apache.org/hadoop/ConnectionRefused
copyFromLocal: `/var/demo/.': No such file or directory: `hdfs://namenode:8020/var/demo'
Copying spark default config and setting up configs
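Before digging into host-side causes, a failure like the one above can often be narrowed down by checking whether the namenode container actually finished starting before setup_demo.sh tried to create HDFS directories. A minimal sketch, assuming the docker CLI is on PATH and using the container name `namenode` from the compose output above:

```shell
# Sketch: verify the namenode container is up before the demo copies files.
# Assumes the docker CLI is installed; "namenode" is the container name from
# the compose output above. Guarded so it degrades where docker is absent.
if command -v docker >/dev/null 2>&1; then
  # Show the container's status (e.g. "Up 14 seconds (health: starting)").
  docker ps --filter name=namenode --format '{{.Names}} {{.Status}}' || true
  # Tail its logs for startup errors.
  docker logs namenode 2>&1 | tail -n 20 || true
  checked=yes
else
  echo "docker CLI not found; skipping container check"
  checked=no
fi
```

A status of "health: starting" right before the mkdir failure would suggest the script simply raced the namenode's startup rather than a networking problem.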
lamberken commented 4 years ago

Thanks for reporting this issue. What version of Hudi are you using? Can you share the output of docker ps?

malanb5 commented 4 years ago

I'm a bit of a noob to Hudi and the Hadoop ecosystem in general, so thank you for bearing with me.

The Docker Compose YAML file indicates that the version number is 3.3. I also found that many of the services reference apachehudi/hudi-hadoop_2.8.4-history:latest, which I assume is the Hudi version being used in the demo?

Here is the output of docker ps.

CONTAINER ID        IMAGE                                                              COMMAND                  CREATED             STATUS                             PORTS                                                                                                                                                                                           NAMES
c6428f5ed413        apachehudi/hudi-hadoop_2.8.4-prestobase_0.217:latest               "entrypoint.sh worker"   7 seconds ago       Up Less than a second              0-1024/tcp, 4040/tcp, 5000-5100/tcp, 7000-10100/tcp, 50000-50200/tcp, 58042/tcp, 58088/tcp, 58188/tcp                                                                                           presto-worker-1
667e2ffc1cb1        apachehudi/hudi-hadoop_2.8.4-hive_2.3.3-sparkmaster_2.4.4:latest   "entrypoint.sh /bin/…"   7 seconds ago       Up Less than a second              0-1024/tcp, 4040/tcp, 5000-5100/tcp, 6066/tcp, 7000-7076/tcp, 0.0.0.0:7077->7077/tcp, 7078-8079/tcp, 8081-10100/tcp, 50000-50200/tcp, 58042/tcp, 58088/tcp, 58188/tcp, 0.0.0.0:8080->8080/tcp   sparkmaster
31f4f30bbc54        apachehudi/hudi-hadoop_2.8.4-datanode:latest                       "/bin/bash /entrypoi…"   12 seconds ago      Up 6 seconds (health: starting)    0-1024/tcp, 4040/tcp, 5000-5100/tcp, 7000-10100/tcp, 50000-50009/tcp, 0.0.0.0:50010->50010/tcp, 50011-50074/tcp, 50076-50200/tcp, 58042/tcp, 58088/tcp, 58188/tcp, 0.0.0.0:50075->50075/tcp     datanode1
abebc706e3f3        apachehudi/hudi-hadoop_2.8.4-hive_2.3.3:latest                     "entrypoint.sh /bin/…"   12 seconds ago      Up 6 seconds                       0-1024/tcp, 4040/tcp, 5000-5100/tcp, 7000-9999/tcp, 10001-10100/tcp, 50000-50200/tcp, 58042/tcp, 58088/tcp, 58188/tcp, 0.0.0.0:10000->10000/tcp                                                 hiveserver
e1103ccb2097        apachehudi/hudi-hadoop_2.8.4-prestobase_0.217:latest               "entrypoint.sh coord…"   12 seconds ago      Up 7 seconds                       0-1024/tcp, 4040/tcp, 5000-5100/tcp, 7000-8089/tcp, 8091-10100/tcp, 50000-50200/tcp, 58042/tcp, 58088/tcp, 58188/tcp, 0.0.0.0:8090->8090/tcp                                                    presto-coordinator-1
59717fef35ea        apachehudi/hudi-hadoop_2.8.4-history:latest                        "/bin/bash /entrypoi…"   15 seconds ago      Up 11 seconds (health: starting)   0-1024/tcp, 4040/tcp, 5000-5100/tcp, 7000-8187/tcp, 8189-10100/tcp, 50000-50200/tcp, 58042/tcp, 58088/tcp, 58188/tcp, 0.0.0.0:58188->8188/tcp                                                   historyserver
e60354b6d191        apachehudi/hudi-hadoop_2.8.4-hive_2.3.3:latest                     "entrypoint.sh /opt/…"   15 seconds ago      Up 11 seconds (health: starting)   0-1024/tcp, 4040/tcp, 5000-5100/tcp, 7000-9082/tcp, 9084-10100/tcp, 50000-50200/tcp, 58042/tcp, 58088/tcp, 58188/tcp, 0.0.0.0:9083->9083/tcp                                                    hivemetastore
16c4cdce76cf        apachehudi/hudi-hadoop_2.8.4-namenode:latest                       "/bin/bash /entrypoi…"   18 seconds ago      Up 14 seconds (health: starting)   0-1024/tcp, 4040/tcp, 5000-5100/tcp, 7000-8019/tcp, 8021-10100/tcp, 0.0.0.0:8020->8020/tcp, 50000-50069/tcp, 50071-50200/tcp, 58042/tcp, 58088/tcp, 58188/tcp, 0.0.0.0:50070->50070/tcp         namenode
b8811253a180        bde2020/hive-metastore-postgresql:2.3.0                            "/docker-entrypoint.…"   18 seconds ago      Up 14 seconds                      5432/tcp                                                                                                                                                                                        hive-metastore-postgresql
b0ef4a787039        bitnami/zookeeper:3.4.12-r68                                       "/app-entrypoint.sh …"   18 seconds ago      Up 14 seconds                      2888/tcp, 0.0.0.0:2181->2181/tcp, 3888/tcp                                                                                                                                                      zookeeper
0eafd90cb012        bitnami/kafka:2.0.0                                                "/app-entrypoint.sh …"   18 seconds ago      Up 14 seconds                      0.0.0.0:9092->9092/tcp                                                                                                                                                                          kafkabroker
bhasudha commented 4 years ago

Hi @malanb5, I am not able to reproduce this. I am using:

MacOS: 10.15.3
Docker: version 19.03.8

Have you already confirmed that there are no other SSH tunnels running, and that the following entries have been added to /etc/hosts?

127.0.0.1 adhoc-1
127.0.0.1 adhoc-2
127.0.0.1 namenode
127.0.0.1 datanode1
127.0.0.1 hiveserver
127.0.0.1 hivemetastore
127.0.0.1 kafkabroker
127.0.0.1 sparkmaster
127.0.0.1 zookeeper
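The entries above can be verified in one pass rather than by eyeballing the hosts file. A small sketch, using the same hostname list; `getent` is the Linux resolver tool, while on macOS `dscacheutil -q host -a name <hostname>` is the rough equivalent:

```shell
# Sketch: confirm each demo hostname from the /etc/hosts list above resolves.
# Assumes getent (Linux); on macOS substitute "dscacheutil -q host -a name".
missing=0
for h in adhoc-1 adhoc-2 namenode datanode1 hiveserver hivemetastore \
         kafkabroker sparkmaster zookeeper; do
  if ! getent hosts "$h" >/dev/null 2>&1; then
    echo "not resolvable: $h"
    missing=$((missing + 1))
  fi
done
echo "$missing of 9 demo hostnames not resolvable"
```

Any name reported here would make the corresponding `hdfs://namenode:8020`-style URL fail from the host side.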
malanb5 commented 4 years ago

@bhasudha Here's the output of nmap:

sudo nmap -sT -sU 127.0.0.1
Starting Nmap 7.80 ( https://nmap.org ) at 2020-04-03 19:42 PDT
Nmap scan report for localhost (127.0.0.1)
Host is up (0.00026s latency).
Not shown: 1958 closed ports, 30 filtered ports
PORT     STATE         SERVICE
22/tcp   open          ssh
445/tcp  open          microsoft-ds
3031/tcp open          eppc
3283/tcp open          netassistant
3306/tcp open          mysql
5900/tcp open          vnc
8888/tcp open          sun-answerbook
88/udp   open|filtered kerberos-sec
137/udp  open|filtered netbios-ns
138/udp  open|filtered netbios-dgm
3283/udp open          netassistant
5353/udp open          zeroconf

Here's what's in my /etc/hosts

##
# Host Database
#
# localhost is used to configure the loopback interface
# when the system is booting.  Do not change this entry.
##
127.0.0.1   localhost
255.255.255.255 broadcasthost
::1             localhost

127.0.0.1 adhoc-1
127.0.0.1 adhoc-2
127.0.0.1 namenode
127.0.0.1 datanode1
127.0.0.1 hiveserver
127.0.0.1 hivemetastore
127.0.0.1 kafkabroker
127.0.0.1 sparkmaster
127.0.0.1 zookeeper

# Added by Docker Desktop
# To allow the same kube context to work on the host and the container:
127.0.0.1 kubernetes.docker.internal
# End of section
malanb5 commented 4 years ago

I didn't include it in my original description, but I also have local installs of MySQL, Hive, Hadoop, and Spark.

lamberken commented 4 years ago

If so, it may be caused by a port conflict. Can you stop the local Hadoop and try again if possible? :)

[image attachment]
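The port-conflict hypothesis can be checked directly before starting the demo. A hedged sketch that probes the host ports the compose file publishes (the port list below is read off the docker ps output earlier in this thread) using bash's built-in /dev/tcp, so no extra tools are needed:

```shell
# Sketch: probe the host ports the demo publishes (list taken from the
# docker ps output above) and report any that already have a listener,
# e.g. a locally installed Hadoop, Hive, or MySQL. Run this BEFORE
# starting the demo containers, otherwise the containers themselves
# will show up as listeners. Uses bash's /dev/tcp pseudo-device.
conflicts=""
for p in 8020 50070 50010 50075 10000 9083 7077 8080 9092 2181; do
  if (exec 3<>"/dev/tcp/127.0.0.1/$p") 2>/dev/null; then
    conflicts="$conflicts $p"
  fi
done
if [ -n "$conflicts" ]; then
  echo "already in use:$conflicts"
else
  echo "no conflicting listeners found"
fi
```

If 8020 shows up as already in use, the containerized namenode cannot bind its published port, which matches the Connection refused seen in the original report.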

malanb5 commented 4 years ago

Yep that did it! Thanks for the help @lamber-ken and @bhasudha

arunb2w commented 2 years ago

@malanb5 @lamberken I am also facing the same issue; below is the output of docker ps.

╰─ docker ps       
CONTAINER ID   IMAGE                                                              COMMAND                  CREATED       STATUS                 PORTS                                                                                                                                                                                           NAMES
19e87256b38e   apachehudi/hudi-hadoop_2.8.4-hive_2.3.3-sparkadhoc_2.4.4:latest    "entrypoint.sh /bin/…"   2 hours ago   Up 2 hours             0-1024/tcp, 4040/tcp, 5000-5100/tcp, 7000-10100/tcp, 50000-50200/tcp, 58042/tcp, 58088/tcp, 58188/tcp                                                                                           adhoc-2
4c93e072cc5d   apachehudi/hudi-hadoop_2.8.4-hive_2.3.3-sparkadhoc_2.4.4:latest    "entrypoint.sh /bin/…"   2 hours ago   Up 2 hours             0-1024/tcp, 5000-5100/tcp, 7000-10100/tcp, 50000-50200/tcp, 58042/tcp, 58088/tcp, 58188/tcp, 0.0.0.0:4040->4040/tcp                                                                             adhoc-1
5fc03594a511   apachehudi/hudi-hadoop_2.8.4-hive_2.3.3-sparkworker_2.4.4:latest   "entrypoint.sh /bin/…"   2 hours ago   Up 2 hours             0-1024/tcp, 4040/tcp, 5000-5100/tcp, 7000-8080/tcp, 8082-10100/tcp, 50000-50200/tcp, 58042/tcp, 58088/tcp, 58188/tcp, 0.0.0.0:8081->8081/tcp                                                    spark-worker-1
bde1234aa587   apachehudi/hudi-hadoop_2.8.4-hive_2.3.3-sparkmaster_2.4.4:latest   "entrypoint.sh /bin/…"   2 hours ago   Up 2 hours             0-1024/tcp, 4040/tcp, 5000-5100/tcp, 6066/tcp, 7000-7076/tcp, 0.0.0.0:7077->7077/tcp, 7078-8079/tcp, 8081-10100/tcp, 50000-50200/tcp, 58042/tcp, 58088/tcp, 58188/tcp, 0.0.0.0:8080->8080/tcp   sparkmaster
e2c6132395f9   apachehudi/hudi-hadoop_2.8.4-trinoworker_368:latest                "./scripts/trino.sh …"   2 hours ago   Up 2 hours             0-1024/tcp, 4040/tcp, 5000-5100/tcp, 7000-8091/tcp, 8093-10100/tcp, 50000-50200/tcp, 58042/tcp, 58088/tcp, 58188/tcp, 0.0.0.0:8092->8092/tcp                                                    trino-worker-1
b9f15382e25e   apachehudi/hudi-hadoop_2.8.4-trinocoordinator_368:latest           "./scripts/trino.sh …"   2 hours ago   Up 2 hours             0-1024/tcp, 4040/tcp, 5000-5100/tcp, 7000-8090/tcp, 8092-10100/tcp, 50000-50200/tcp, 58042/tcp, 58088/tcp, 58188/tcp, 0.0.0.0:8091->8091/tcp                                                    trino-coordinator-1
5ce4159c3a8e   apachehudi/hudi-hadoop_2.8.4-datanode:latest                       "/bin/bash /entrypoi…"   2 hours ago   Up 2 hours (healthy)   0-1024/tcp, 4040/tcp, 5000-5100/tcp, 7000-10100/tcp, 50000-50009/tcp, 0.0.0.0:50010->50010/tcp, 50011-50074/tcp, 50076-50200/tcp, 58042/tcp, 58088/tcp, 58188/tcp, 0.0.0.0:50075->50075/tcp     datanode1
f5236676e754   apachehudi/hudi-hadoop_2.8.4-history:latest                        "/bin/bash /entrypoi…"   2 hours ago   Up 2 hours (healthy)   0-1024/tcp, 4040/tcp, 5000-5100/tcp, 7000-8187/tcp, 8189-10100/tcp, 50000-50200/tcp, 58042/tcp, 58088/tcp, 58188/tcp, 0.0.0.0:58188->8188/tcp                                                   historyserver
d27e2087636c   apachehudi/hudi-hadoop_2.8.4-namenode:latest                       "/bin/bash /entrypoi…"   2 hours ago   Up 2 hours (healthy)   0-1024/tcp, 4040/tcp, 5000-5100/tcp, 7000-8019/tcp, 8021-10100/tcp, 0.0.0.0:8020->8020/tcp, 50000-50069/tcp, 50071-50200/tcp, 58042/tcp, 58088/tcp, 58188/tcp, 0.0.0.0:50070->50070/tcp         namenode
8ec736d28ba0   bitnami/kafka:2.0.0                                                "/app-entrypoint.sh …"   2 hours ago   Up 2 hours             0.0.0.0:9092->9092/tcp                                                                                                                                                                          kafkabroker
2b18623cb771   graphiteapp/graphite-statsd                                        "/entrypoint"            2 hours ago   Up 2 hours             0.0.0.0:80->80/tcp, 2013-2014/tcp, 2023-2024/tcp, 8080/tcp, 0.0.0.0:2003-2004->2003-2004/tcp, 0.0.0.0:8126->8126/tcp, 8125/tcp, 8125/udp                                                        graphite
5975817234ba   bitnami/zookeeper:3.4.12-r68                                       "/app-entrypoint.sh …"   2 hours ago   Up 2 hours             2888/tcp, 0.0.0.0:2181->2181/tcp, 3888/tcp                                                                                                                                                      zookeeper

I saw the comment above from @lamberken about stopping the local Hadoop, but I am not sure whether anything is running on my local machine.

Can you please guide me on how to resolve this?

Thanks
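One way to answer the "is anything running locally" question is to list candidate daemon processes. A sketch that prefers jps (which ships with the JDK and names JVM main classes like NameNode directly) and falls back to ps; the process-name patterns are illustrative, not exhaustive:

```shell
# Sketch: look for locally installed big-data daemons that typically own the
# demo's ports. Prefers jps (ships with the JDK); falls back to ps. The
# patterns below are illustrative examples, not an exhaustive list.
if command -v jps >/dev/null 2>&1; then
  found=$(jps 2>/dev/null \
    | grep -cE 'NameNode|DataNode|ResourceManager|HiveServer2|Master|Worker' \
    || true)
else
  found=$(ps ax 2>/dev/null \
    | grep -cE '[N]ameNode|[D]ataNode|[H]iveServer2|[S]parkSubmit' \
    || true)
fi
echo "$found candidate local daemon(s) found"
```

A non-zero count here would point at the same port-conflict cause that resolved the original report.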