Closed minkyuSnow closed 1 year ago
Hello,
It seems that the worker node is not detected by the master node. By default, the master node listens on the IP address of your hostname. In this case, you can try changing the worker's `--master-ip` parameter to the IP address of the hostname and see whether it works.
Thank you for your response.
I'm sorry, but I don't understand.
On the worker node I ran:

```
$ docker run -d --net multi-net --name data-slave01 cloudsuite/data-analytics --slave --master-ip=x.x.x.x
```

I'm not sure what you want me to put in the `--master-ip=x.x.x.x` part. I checked the master container's IP with `docker network inspect` on the overlay network `multi-net` and entered it there, and I also tried entering the actual IP of the master node.
I'm trying several things to fix this, including removing the `--master-ip` part and just entering the IP. But I don't quite understand, so I'm asking.
- master container IP: 10.0.1.2
- master node IP: 172.30.1.40, hostname: pi1
- worker node IP: 172.30.1.41, hostname: pi2
Hello,
Sorry if I caused any confusion. My hypothesis is that the master node may not be listening on the correct NIC. By default, the master node uses its hostname as its address, which in your case is `pi1`. You can try to ping `pi1` from the worker node to see if they have a connection.
If not, you can explicitly make the master node listen on an IP address by passing `--master-ip=<x.x.x.x>` to the master node. To be more specific, if you use the container's network, it would be `--master-ip=10.0.1.2`; if you are using the host network, it would be `--master-ip=172.30.1.40`.
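As a concrete sketch (the IPs and flags are the ones discussed in this thread; container names and the dataset volume follow the original report, so adjust to your setup), the host-network variant could look like this, with the command lines built as strings so they are easy to inspect before running:

```shell
# Sketch: start the master and worker with an explicit --master-ip so the
# master does not fall back to whatever its hostname resolves to.
# 172.30.1.40 is the host-network address from this thread; on the overlay
# network you would use the container IP 10.0.1.2 instead.
MASTER_IP=172.30.1.40

master_cmd="docker run -d --net host --volumes-from wikimedia-dataset \
--name data-master cloudsuite/data-analytics --master --master-ip=$MASTER_IP"
worker_cmd="docker run -d --net host --name data-slave01 \
cloudsuite/data-analytics --slave --master-ip=$MASTER_IP"

echo "$master_cmd"
echo "$worker_cmd"
```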
Thank you for your response.
We performed a ping test between pi1 and pi2 and found that they can reach each other.
We also tested from inside the containers: after updating packages and installing a ping tool, we confirmed that the master container and the worker container can reach each other.
Before the documentation and version changes, we ran with the overlay network and it worked normally.
Hello,
> We performed a ping test between PI1 and PI2 and found that the ping is connected.

May I know how you ping `pi1` and `pi2`? My intention is to check whether both `pi1` and `pi2` can resolve the name `pi1` to its correct address, because on some distributions like Ubuntu, a machine resolves its own hostname to `127.0.0.1`, which causes the server to start locally instead of listening on the network.
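One way to check for this (a sketch, not part of the benchmark; `is_loopback` is a hypothetical helper) is to resolve the machine's own hostname and see whether the answer is a loopback address:

```shell
# is_loopback: succeeds when the given IPv4 address is in 127.0.0.0/8,
# i.e. the hostname would make the server bind to the local machine only
is_loopback() {
    case "$1" in
        127.*) return 0 ;;
        *)     return 1 ;;
    esac
}

# Example: resolve this machine's hostname and classify the result
addr=$(getent hosts "$(hostname)" | awk '{print $1; exit}')
if is_loopback "$addr"; then
    echo "hostname resolves to loopback ($addr): fix /etc/hosts or pass --master-ip"
else
    echo "hostname resolves to $addr"
fi
```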
Moreover, you are always encouraged to use the `host` network when possible. I see you didn't make forward progress when using the `host` network; this might be because your worker node does not have enough disk space: Hadoop requires the worker node to have less than 90% disk usage. You can use `df` to check the disk usage.
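The 90% rule can be checked with a small filter over `df` output (a sketch; `over_limit` is a hypothetical helper, and the threshold is the one mentioned above):

```shell
# over_limit: reads `df -P` output on stdin and prints every mount point
# whose usage is at or above Hadoop's 90% threshold
over_limit() {
    awk 'NR > 1 && $5 + 0 >= 90 { print $6 }'
}

# Typical use on a worker node: list any filesystem that would block Hadoop
df -P | over_limit
```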
Best,
Thank you for your response.
For the ping test, I went into the container, ran `apt-get update && apt-get install -y iputils-ping`, and pinged each node's IP (172.30.1.x).
Do you mean I need to modify my hosts file? That is, modify the hosts file of pi1 as follows: `127.0.0.1 pi1`, and the hosts file of pi2 likewise: `127.0.0.1 pi2`?
pi1 -> `cat /etc/hosts` output
pi2 -> `cat /etc/hosts` output
And there is plenty of space on the disk:
pi1 disk usage
pi2 disk usage
Hello,
Thanks for your reply.
Yes, my intention is to have you check which IP is mapped to the hostname. By default, the data analytics server listens on the IP mapped to the hostname, which in your case is `127.0.0.1`. This means the server only accepts clients from the local machine.
The way to fix this is to force the Hadoop server to listen on an active IP address, which in your case is `172.30.1.40` (using the host network). You can pass `--master-ip=172.30.1.40` to the server to achieve this, then retry and see if there is any problem.
Best,
Thank you for your reply.
You mentioned that Hadoop's IP mapping is not working.
So I searched for Hadoop IP mapping, and it seemed the problem was that I needed to put the IPs and hostnames of the nodes I want to cluster into the `/etc/hosts` file, so I did.
pi1 (node 1) hosts file
pi2 (node 2) hosts file
The benchmark appears to be working fine.
Master and slave seem to be communicating. After modifying the hosts file, are the master and slave communicating normally? I would like to know if the benchmark is working properly. And whenever I add a node, do I just keep adding that node's IP and hostname to the hosts file and run it?
So: I keep adding the IPs and hostnames of the master and slaves to the hosts file on each node, and the contents of the hosts file should be identical on every node. Is that correct?
For example, the hosts file on each node would be set like this:

```
# Identical on pi1 (node 1), pi2 (node 2), and pi3 (node 3)
127.0.0.1   localhost
172.30.1.40 pi1
172.30.1.41 pi2
172.30.1.42 pi3   # added when node 3 joins
```
Hello,
Glad to see it works.
It is also quite strange that overriding the master IP address does not work. Currently, I think you have to manually modify the hosts file to make the master aware of the worker nodes. We will check whether we can fix this bug by forcing Hadoop to use IP addresses for communication.
Best,
It now runs as normal. Thank you very much for your help.
Hello
The Data Analytics benchmark application has changed recently, so I thought I'd give it a try. I ran it once on a single node and then tried it on multiple nodes.
However, while it worked fine on a single node, it did not work on multiple nodes, so I am asking here.
I ran it on an Arm64 environment.
```
$ docker run -d --net single-net --volumes-from wikimedia-dataset --name data-master cloudsuite/data-analytics --master
$ docker run -d --net single-net --name data-slave01 cloudsuite/data-analytics --slave --master-ip=<data-master container IP>
$ docker exec data-master benchmark
```
Master Node
Worker Node
Execution errors
To summarize: before the change, data analytics ran with the overlay network, but now it fails with an error message; and if I change to the host network and test, it also stops running.