sequenceiq / hadoop-docker

Hadoop docker image
https://registry.hub.docker.com/u/sequenceiq/hadoop-docker/

Port 9000 (namenode IPC) not exposed #48

Open thiagofigueiro opened 8 years ago

thiagofigueiro commented 8 years ago

If one wants to access HDFS from another container, port 9000 (the namenode IPC port) needs to be exposed.
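As a workaround, the port can be published explicitly when starting the container. A minimal sketch, assuming the image tag and bootstrap command shown in the docker ps output below:

  docker run -itd --name hadoop -p 9000:9000 sequenceiq/hadoop-docker:2.7.0 /etc/bootstrap.sh -d

Note that -p publishes the port on the host whether or not the image EXPOSEs it; the EXPOSE list only matters for -P and for documentation.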

Output of docker ps:

CONTAINER ID        IMAGE                            COMMAND                  CREATED             STATUS              PORTS                                                                                                                                                                                                                                                                                                                                                                                          NAMES
3835390f5f2a        flume                            "start-flume"            15 minutes ago      Up 15 minutes                                                                                                                                                                                                                                                                                                                                                                                                      flume-example-a1
2ff5c8467ddc        sequenceiq/hadoop-docker:2.7.0   "/etc/bootstrap.sh -d"   4 hours ago         Up 4 hours          0.0.0.0:32828->2122/tcp, 0.0.0.0:32827->8030/tcp, 0.0.0.0:32826->8031/tcp, 0.0.0.0:32825->8032/tcp, 0.0.0.0:32824->8033/tcp, 0.0.0.0:32823->8040/tcp, 0.0.0.0:32822->8042/tcp, 0.0.0.0:32821->8088/tcp, 0.0.0.0:32820->19888/tcp, 0.0.0.0:32819->49707/tcp, 0.0.0.0:32818->50010/tcp, 0.0.0.0:32817->50020/tcp, 0.0.0.0:32816->50070/tcp, 0.0.0.0:32815->50075/tcp, 0.0.0.0:32814->50090/tcp   hadoop

Output of docker port 2ff5c8467ddc | sort -n

2122/tcp -> 0.0.0.0:32828
8030/tcp -> 0.0.0.0:32827
8031/tcp -> 0.0.0.0:32826
8032/tcp -> 0.0.0.0:32825
8033/tcp -> 0.0.0.0:32824
8040/tcp -> 0.0.0.0:32823
8042/tcp -> 0.0.0.0:32822
8088/tcp -> 0.0.0.0:32821
19888/tcp -> 0.0.0.0:32820
49707/tcp -> 0.0.0.0:32819
50010/tcp -> 0.0.0.0:32818
50020/tcp -> 0.0.0.0:32817
50070/tcp -> 0.0.0.0:32816
50075/tcp -> 0.0.0.0:32815
50090/tcp -> 0.0.0.0:32814

Contents of /usr/local/hadoop/etc/hadoop/core-site.xml in the container:

  <configuration>
      <property>
          <name>fs.defaultFS</name>
          <value>hdfs://2ff5c8467ddc:9000</value>
      </property>
  </configuration>

I believe the above should be localhost or 127.0.0.1. Running netstat inside the container confirms that the port is bound to a different address, the container's bridge IP:

bash-4.1# netstat -antp | grep LISTEN | grep :9000
tcp        0      0 172.17.0.2:9000             0.0.0.0:*                   LISTEN      130/java
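One way to check reachability from a second container, sketched under the assumption that the Hadoop container is named hadoop (as in the docker ps output above) and that the image's HDFS client runs standalone:

  docker run --rm --link hadoop:hadoop sequenceiq/hadoop-docker:2.7.0 /usr/local/hadoop/bin/hdfs dfs -ls hdfs://hadoop:9000/

If this lists the HDFS root, the namenode IPC port is reachable container-to-container even without a host port mapping.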
thiagofigueiro commented 8 years ago

This is odd. I built the image locally from HEAD of master instead of using the image from hub.docker.com, and port 9000 is now exposed.

I can write to HDFS using another container:

2016-03-28 09:45:22,120 (hdfs-k2-call-runner-8) [INFO - org.apache.flume.sink.hdfs.BucketWriter$8.call(BucketWriter.java:629)] Renaming hdfs://hadoop:9000/flume/events/16-03-28/0940/00/FlumeData.1459158311274.tmp to hdfs://hadoop:9000/flume/events/16-03-28/0940/00/FlumeData.1459158311274

I'll delete the images and pull from hub.docker.com again to confirm the problem is only with this source.

thiagofigueiro commented 8 years ago

Confirming that the image from hub.docker.com doesn't expose port 9000. This time I used tag 2.7.1:

CONTAINER ID        IMAGE                            COMMAND                  CREATED              STATUS              PORTS                                                                                                                                                                                                                                                                                                                                                                                          NAMES
8dc1024736fa        sequenceiq/hadoop-docker:2.7.1   "/etc/bootstrap.sh -d"   About a minute ago   Up About a minute   0.0.0.0:32877->2122/tcp, 0.0.0.0:32876->8030/tcp, 0.0.0.0:32875->8031/tcp, 0.0.0.0:32874->8032/tcp, 0.0.0.0:32873->8033/tcp, 0.0.0.0:32872->8040/tcp, 0.0.0.0:32871->8042/tcp, 0.0.0.0:32870->8088/tcp, 0.0.0.0:32869->19888/tcp, 0.0.0.0:32868->49707/tcp, 0.0.0.0:32867->50010/tcp, 0.0.0.0:32866->50020/tcp, 0.0.0.0:32865->50070/tcp, 0.0.0.0:32864->50075/tcp, 0.0.0.0:32863->50090/tcp   hadoop
$ docker port hadoop | sort -n
2122/tcp -> 0.0.0.0:32877
8030/tcp -> 0.0.0.0:32876
8031/tcp -> 0.0.0.0:32875
8032/tcp -> 0.0.0.0:32874
8033/tcp -> 0.0.0.0:32873
8040/tcp -> 0.0.0.0:32872
8042/tcp -> 0.0.0.0:32871
8088/tcp -> 0.0.0.0:32870
19888/tcp -> 0.0.0.0:32869
49707/tcp -> 0.0.0.0:32868
50010/tcp -> 0.0.0.0:32867
50020/tcp -> 0.0.0.0:32866
50070/tcp -> 0.0.0.0:32865
50075/tcp -> 0.0.0.0:32864
50090/tcp -> 0.0.0.0:32863
thiagofigueiro commented 8 years ago

It looks like port 9000 was added by 18d968cc3d39e427357de0bb1878c5c8d0c528c7 and the 2.7.1 build doesn't have it.

There are no builds more recent than 2.7.1, so if someone hits the Trigger button on hub.docker.com this should be fixed.
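For context, the change in that commit presumably amounts to adding 9000 to the image's EXPOSE list, along these lines (a sketch; the remaining ports are taken from the docker ps output above):

  # Dockerfile: expose the namenode IPC port alongside the ports already published
  EXPOSE 9000 2122 8030 8031 8032 8033 8040 8042 8088 19888 49707 50010 50020 50070 50075 50090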

Starofall commented 7 years ago

I just had the same problem. The image binds the port to the container hostname rather than to all interfaces. The solution was to change core-site.xml:

  <configuration>
      <property>
          <name>fs.defaultFS</name>
          <value>hdfs://0.0.0.0:9000</value>
      </property>
  </configuration>
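The same edit can be scripted against a running container; a sketch, assuming the container is named hadoop and that the image's sed supports -i:

  docker exec hadoop sed -i 's#hdfs://[^:<]*:9000#hdfs://0.0.0.0:9000#' /usr/local/hadoop/etc/hadoop/core-site.xml

The namenode then has to be restarted for the new bind address to take effect (see the stop/start sequence in the comment below).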
thiagofigueiro commented 7 years ago

@Starofall, thanks for the contribution. One can also build locally using a later hash (HEAD worked for me).
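For anyone taking that route, a minimal sketch of a local build (repository URL assumed from the project page):

  git clone https://github.com/sequenceiq/hadoop-docker.git
  cd hadoop-docker
  docker build -t sequenceiq/hadoop-docker:local .
  docker run -itd --name hadoop sequenceiq/hadoop-docker:local /etc/bootstrap.sh -d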

eximius313 commented 6 years ago

After hours of struggle I finally found this issue, and the following sequence:

/usr/local/hadoop/sbin/stop-yarn.sh
/usr/local/hadoop/sbin/stop-dfs.sh
vi /usr/local/hadoop/etc/hadoop/core-site.xml

  <configuration>
      <property>
          <name>fs.defaultFS</name>
          <value>hdfs://0.0.0.0:9000</value>
      </property>
  </configuration>
/usr/local/hadoop/sbin/start-dfs.sh
/usr/local/hadoop/sbin/start-yarn.sh

finally worked!
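Note that an edit made inside a running container is lost if the container is recreated from the image; one way to keep it (a sketch, using the container name from earlier comments) is to snapshot the fixed container:

  docker commit hadoop hadoop-docker:9000-fixed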

Could you please change it in your image? You could save people a lot of wasted hours...

thiagofigueiro commented 6 years ago

@eximius313 this project looks abandoned. Maybe you could fork it and do the update? I haven't been using this, so I'd be a poor maintainer.