Trex-Group / trex-bigdata


[Docker] Question about Hadoop_docker.pdf #27

Closed LiuMing5489 closed 7 years ago

LiuMing5489 commented 7 years ago

Host environment: macOS Sierra 10.12.3, Docker Community Edition 17.03

Following the procedure below, I ran wordcount successfully on macOS with Docker (not a VM): https://github.com/trex-group/Big-Data/blob/master/01_Guide/environment/Manual/Hadoop_docker.pdf

Question: following that procedure creates two Hadoop images:

trex/hadoop-slave    latest              4cf860b5f9fd        35 hours ago        997 MB
trex/hadoop-master   latest              922dbfb1a249        35 hours ago        997 MB

Three containers are started: master, slave1, and slave2.


d095f779090e        trex/hadoop-slave:latest    "/bin/sh -c ''/roo..."   30 minutes ago      Up 30 minutes       0.0.0.0:32824->22/tcp, 0.0.0.0:32823->7373/tcp, 0.0.0.0:32822->7946/tcp, 0.0.0.0:32821->8030/tcp, 0.0.0.0:32820->8031/tcp, 0.0.0.0:32819->8032/tcp, 0.0.0.0:32818->8033/tcp, 0.0.0.0:32817->8040/tcp, 0.0.0.0:32816->8042/tcp, 0.0.0.0:32815->8060/tcp, 0.0.0.0:32814->8088/tcp, 0.0.0.0:32813->9000/tcp, 0.0.0.0:32812->50010/tcp, 0.0.0.0:32811->50020/tcp, 0.0.0.0:32810->50060/tcp, 0.0.0.0:32809->50070/tcp, 0.0.0.0:32808->50075/tcp, 0.0.0.0:32807->50090/tcp, 0.0.0.0:32806->50475/tcp   slave2
5e7347663a41        trex/hadoop-slave:latest    "/bin/sh -c ''/roo..."   31 minutes ago      Up 30 minutes       0.0.0.0:32805->22/tcp, 0.0.0.0:32804->7373/tcp, 0.0.0.0:32803->7946/tcp, 0.0.0.0:32802->8030/tcp, 0.0.0.0:32801->8031/tcp, 0.0.0.0:32800->8032/tcp, 0.0.0.0:32799->8033/tcp, 0.0.0.0:32798->8040/tcp, 0.0.0.0:32797->8042/tcp, 0.0.0.0:32796->8060/tcp, 0.0.0.0:32795->8088/tcp, 0.0.0.0:32794->9000/tcp, 0.0.0.0:32793->50010/tcp, 0.0.0.0:32792->50020/tcp, 0.0.0.0:32791->50060/tcp, 0.0.0.0:32790->50070/tcp, 0.0.0.0:32789->50075/tcp, 0.0.0.0:32788->50090/tcp, 0.0.0.0:32787->50475/tcp   slave1
66894cdcc7c6        trex/hadoop-master:latest   "/bin/sh -c ''/roo..."   31 minutes ago      Up 30 minutes       0.0.0.0:32786->22/tcp, 0.0.0.0:32785->7373/tcp, 0.0.0.0:32784->7946/tcp, 0.0.0.0:32783->8030/tcp, 0.0.0.0:32782->8031/tcp, 0.0.0.0:32781->8032/tcp, 0.0.0.0:32780->8033/tcp, 0.0.0.0:32779->8040/tcp, 0.0.0.0:32778->8042/tcp, 0.0.0.0:32777->8060/tcp, 0.0.0.0:32776->8088/tcp, 0.0.0.0:32775->9000/tcp, 0.0.0.0:32774->50010/tcp, 0.0.0.0:32773->50020/tcp, 0.0.0.0:32772->50060/tcp, 0.0.0.0:32771->50070/tcp, 0.0.0.0:32770->50075/tcp, 0.0.0.0:32769->50090/tcp, 0.0.0.0:32768->50475/tcp   master

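When I run wordcount inside the master container, the processes in the two slave containers do not change. Also, the master container has no master file, and its slaves file contains only localhost: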

root@master:~# ll  $HADOOP_HOME/etc/hadoop/master
ls: cannot access /opt/hadoop/etc/hadoop/master: No such file or directory
root@master:~# more  $HADOOP_HOME/etc/hadoop/slaves
localhost

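The three containers seem to have identical contents. Could it be that, although there are three containers, this is not fully distributed mode, and each container is running in standalone (pseudo-distributed) mode?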
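One way to check which mode a container is configured for is to see whether its slaves file names any host other than localhost. A minimal sketch, assuming the slaves file lists one worker hostname per line; `lists_remote_workers` is a hypothetical helper name, not part of Hadoop or the course scripts:

```shell
# Hypothetical helper: succeeds (exit 0) if the given slaves file names at
# least one host other than localhost, i.e. it points at remote workers.
lists_remote_workers() {
  awk '
    /^[ \t]*#/ { next }                     # skip comment lines
    /^[ \t]*$/ { next }                     # skip blank lines
    { gsub(/[ \t\r]/, "") }                 # trim stray whitespace
    $0 != "localhost" && $0 != "127.0.0.1" { remote = 1 }
    END { exit (remote ? 0 : 1) }
  ' "$1"
}

# Example call inside the master container (path taken from the output above):
#   if lists_remote_workers "$HADOOP_HOME/etc/hadoop/slaves"; then
#     echo "slaves file lists remote workers"
#   else
#     echo "slaves file lists only localhost"
#   fi
```

With the file shown above, which contains only `localhost`, the helper returns a non-zero status.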

PS: the startup script https://github.com/trex-group/Big-Data/blob/master/01_Guide/environment/docker/Hadoop_Ubuntu_Bin/start-hadoop-container.sh does start different images:

# delete old master container and start new master container
# delete old slave containers and start new slave containers
LiuMing5489 commented 7 years ago

I was misled by the names and had not fully understood the instructor's intent.

After reading the Dockerfile and related files in detail, I see that the Docker image layer structure is:

ubuntu:14.04 - base-dnsmasq - hadoop-base - hadoop-master
ubuntu:14.04 - base-dnsmasq - hadoop-base - hadoop-slave

So, to configure Hadoop in fully distributed mode, I would modify (or add) the configuration files in these two folders:

Big-Data/01_Guide/environment/docker/Hadoop_Ubuntu_Bin/hadoop-master/files/hadoop/
Big-Data/01_Guide/environment/docker/Hadoop_Ubuntu_Bin/hadoop-slave/files/hadoop/

Then rebuild the Docker images: trex/hadoop-master and trex/hadoop-slave.

Is this the right approach?
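For the slaves file, the configuration change described above could look roughly like this. A minimal sketch, assuming the worker hostnames follow the `slave1`, `slave2`, ... container names shown earlier and that the master image picks up a `slaves` file from the `hadoop-master/files/hadoop/` folder; `generate_slaves` is a hypothetical helper, and the other configuration files a fully distributed setup needs are not covered:

```shell
# Hypothetical helper: print one worker hostname per line (slave1..slaveN).
# Assumes hostnames match the container names used in this thread.
generate_slaves() {
  count="$1"
  i=1
  while [ "$i" -le "$count" ]; do
    echo "slave$i"
    i=$((i + 1))
  done
}

# Example (the output path is an assumption based on the folder named above):
#   generate_slaves 2 > hadoop-master/files/hadoop/slaves
```

The images would then need to be rebuilt so that the new file is included in them.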

xenron commented 7 years ago

The course slides include a diagram of the Docker image layer hierarchy:

ubuntu:14.04 - base-dnsmasq - hadoop-base - hadoop-master
ubuntu:14.04 - base-dnsmasq - hadoop-base - hadoop-slave

ubuntu:14.04 - base-dnsmasq - hadoop-base - hbase-base - hbase-master
ubuntu:14.04 - base-dnsmasq - hadoop-base - hbase-base - hbase-slave
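Because each image builds on the one before it, a parent image has to be available before its child can be built. The hierarchy above can be written down as a small lookup, sketched here; `image_chain` is a hypothetical helper, not something from the course repository:

```shell
# Hypothetical helper: print the image chain for a leaf image, base first,
# exactly as listed in the hierarchy above.
image_chain() {
  case "$1" in
    hadoop-master|hadoop-slave)
      echo "ubuntu:14.04 base-dnsmasq hadoop-base $1" ;;
    hbase-master|hbase-slave)
      echo "ubuntu:14.04 base-dnsmasq hadoop-base hbase-base $1" ;;
    *)
      echo "unknown image: $1" >&2
      return 1 ;;
  esac
}

# Example:
#   for image in $(image_chain hbase-master); do echo "build or pull: $image"; done
```

Reading the output left to right gives an order in which the images can be built.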

hadoop: build one hadoop-master and two (or N) hadoop-slave instances. Once you have confirmed that they can communicate, start hadoop as described in README.md.

hbase: build one hbase-master and two (or N) hbase-slave instances. Once you have confirmed that they can communicate, start hadoop and hbase as described in README.md.
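The "can communicate" check could be sketched as a loop over the expected hostnames. This is only an illustration: `failed_hosts` is a hypothetical helper, the hostnames are the container names from this thread, and the probe command in the example (`getent hosts`, which tests name resolution only) is an assumption about what is available inside the containers:

```shell
# Hypothetical helper: run a probe command against each host and print the
# hosts for which the probe fails. Prints nothing if every probe succeeds.
failed_hosts() {
  probe="$1"
  shift
  for host in "$@"; do
    # $probe is left unquoted on purpose so "getent hosts" splits into
    # a command and its argument.
    $probe "$host" >/dev/null 2>&1 || echo "$host"
  done
}

# Example, run inside the master container:
#   failed_hosts "getent hosts" master slave1 slave2
```

An empty result means every name passed the probe. It does not show that the Hadoop daemons themselves can reach each other.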

LiuMing5489 commented 7 years ago

On with the Docker challenge ♪( ´θ`)ノ