Closed zbears closed 9 years ago
What do you mean by "on multiple computers on multiple networks"? 2.0.0 (the serf-based version) should be able to create clusters on one physical host. Although you can actually make it work on multiple hosts, we don't support that with serf, only with the consul-based branches. Unfortunately those are not meant to work with the script (we did not have time to update the shell and script), but we use them from Cloudbreak.
Sometime in the future we will get rid of all serf-based Docker images and use consul, as we do with Cloudbreak, and update the scripts/shell.
I just quickly installed a 2-node cluster and I don't have any alerts. Can you exec into the amb0 container and try to ping amb1.mycorp.kom?
I apologize for the confusion. I meant that I had tried the serf-based approach on a single computer. When that didn't work, I tried again on a few others.
If you click on the hosts tab, do you get anything next to amb1.mycorp.kom and amb2.mycorp.kom?
It appeared fine from the main screen but the problem emerged in the hosts area
I don't have any; you can log in here (admin/admin): http://2eb016cf.ngrok.com
Can you briefly describe your environment? Host OS, Docker version, etc.
The ping works fine, but ping uses ICMP, whereas the failing connections use TCP, so a successful ping doesn't prove that the TCP ports are reachable.
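Since ping only exercises ICMP, a direct TCP connection attempt is a closer match for what the alerts are checking. Here is a minimal sketch of such a check using only the Python standard library. The function name `tcp_reachable` is made up for illustration, and the port in the usage comment is a placeholder, not something confirmed in this thread; substitute the port of whichever service is raising the alert.

```python
import socket


def tcp_reachable(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout.

    Name-resolution failures, refused connections, and timeouts all
    surface as OSError subclasses, so they are reported as False.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


# Example usage from inside the amb0 container (the port is an assumption):
#   print(tcp_reachable("amb1.mycorp.kom", 8080))
```

If this returns False while ping succeeds, the hostname resolves and the host is up, but nothing is accepting connections on that port or something is blocking it.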
I'm running Docker 1.7.0 on Ubuntu 14.04.
I've also tried on Ubuntu 12.04.5 with no luck
What was the command that you ran to get your 2 node blueprint? It shouldn't affect anything but the default creates a 3 node cluster.
amb-deploy-cluster 2
As a quick fix you can try writing the hostnames to /etc/hosts in the container; otherwise we need to be able to reproduce it to see what's wrong.
Strangely enough, the hostname is in the /etc/hosts file of the container. I'm trying a 2 node cluster now to see if that changes anything. If ping is working but TCP is not, it makes me think that it has something to do with Docker's treatment of ports. I'll update you as soon as the cluster is installed.
It appears that the problem is not with the slaves connecting to the master, but rather with the slaves connecting to their own locally running services.
Did you check whether the services are actually running? Also check ambari-server.log to see whether it's just an alert issue or not.
The Ambari dashboard says they are running. Should I be looking elsewhere? I just ran the 2-node system. Everything looks great for about a minute then I start getting the same TCP errors.
Okay. Turns out it was a master-slave communication thing. After adding the slave manually to the master's /etc/hosts file and restarting the service, it appears everything has been fixed. Is this a configuration step that must be run every time the script is started or was it something weird with my particular experience?
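For anyone wanting to script the workaround above, here is a rough sketch of adding a host entry idempotently, so that repeated runs don't duplicate lines. It works on the text of a hosts file instead of writing to /etc/hosts directly. The helper name `add_hosts_entry` and the IP address in the usage comment are placeholders, not values from this thread, and the service still needs to be restarted afterwards as described.

```python
def add_hosts_entry(hosts_text, ip, hostname):
    """Return hosts_text with an 'ip hostname' line appended.

    If hostname is already mapped on any non-comment line, the text is
    returned unchanged, so calling this repeatedly is safe.
    """
    for line in hosts_text.splitlines():
        # Drop trailing comments, then split into address and names.
        fields = line.split("#", 1)[0].split()
        if len(fields) >= 2 and hostname in fields[1:]:
            return hosts_text
    if hosts_text and not hosts_text.endswith("\n"):
        hosts_text += "\n"
    return hosts_text + f"{ip} {hostname}\n"


# Example usage on the master container (the IP is a placeholder):
#   with open("/etc/hosts") as f:
#       current = f.read()
#   updated = add_hosts_entry(current, "172.17.0.3", "amb1.mycorp.kom")
#   if updated != current:
#       with open("/etc/hosts", "w") as f:
#           f.write(updated)
```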
This version has been out for a while and no one has reported such issues. That doesn't mean the problem doesn't exist, but it could be environment related.
Okay. Thank you so much for your prompt help. I'll try to get this fix incorporated into the script just in case anyone else has the same problem.
I have tried your one-step script for cluster creation on multiple computers on multiple networks. Although cluster creation works great, when looking at the nodes in the cluster from the ambari manager page, the nodes appear to have issues connecting to their own services. Is there a fix for this?