docker-archive / classicswarm

Swarm Classic: a container clustering system. Not to be confused with Docker Swarm which is at https://github.com/docker/swarmkit
Apache License 2.0

docker swarm manager doesn't join on proper ip:port #2975

Closed lalith-b closed 4 years ago

lalith-b commented 4 years ago

I set up a simple 3-node cluster on VirtualBox/Vagrant across 2 different physical machines, with ping and port forwarding all set up and using IP aliases. Each Vagrant/VirtualBox machine has an IP_ALIAS:PORT combination.

docker version info:

Client: Docker Engine - Community
 Version:           19.03.8
 API version:       1.40
 Go version:        go1.12.17
 Git commit:        afacb8b7f0
 Built:             Wed Mar 11 01:25:58 2020
 OS/Arch:           linux/amd64
 Experimental:      false

Server: Docker Engine - Community
 Engine:
  Version:          19.03.8
  API version:      1.40 (minimum version 1.12)
  Go version:       go1.12.17
  Git commit:       afacb8b7f0
  Built:            Wed Mar 11 01:24:30 2020
  OS/Arch:          linux/amd64
  Experimental:     false
 containerd:
  Version:          1.2.13
  GitCommit:        7ad184331fa3e55e52b890ea95e65ba581ae3429
 runc:
  Version:          1.0.0-rc10
  GitCommit:        dc9208a3303feef5b3839f4323d9beb36df0a9dd
 docker-init:
  Version:          0.18.0
  GitCommit:        fec3683

I initialize the cluster with docker swarm init --advertise-addr {{adv_addr}} --listen-addr {{adv_addr}} on the Master node.

The cluster is meant to have 1 master, 1 manager and 1 worker.

The master gave two tokens, one for joining as a worker and another for joining as a manager. Joining as a worker works fine in this setup.

If I run docker swarm leave on the worker and then try to join as a manager, it fails with the error below.

FROM: MANAGER NODE
root@105-145-38-166:/home/vagrant#
root@105-145-38-166:/home/vagrant# docker swarm join --token SWMTKN-1-0tfqyzi1zh47yzvidw5pekqky5qzuhzbaopos87xrwb6rll9tf-1v64snh1z9fadset0e6n4kpqa 105.145.38.75:2377
Error response from daemon: manager stopped: can't initialize raft node: rpc error: code = Unknown desc = could not connect to prospective new cluster member using its advertised address: rpc error: code = Unavailable desc = all SubConns are in TransientFailure, latest connection error: connection error: desc = "transport: Error while dialing dial tcp 10.0.2.2:2377: connect: connection refused"
root@105-145-38-166:/home/vagrant#
root@105-145-38-166:/home/vagrant# docker swarm join --token SWMTKN-1-0tfqyzi1zh47yzvidw5pekqky5qzuhzbaopos87xrwb6rll9tf-0x5nkgmb0k0bgjoh7esi8cjqk 105.145.38.75:2377
This node joined a swarm as a worker.
root@105-145-38-166:/home/vagrant#
root@105-145-38-166:/home/vagrant#

Why is it dialing 10.0.2.2:2377? Shouldn't it be trying 105.145.38.131:2377, and binding to the local port on 0.0.0.0:2377 as a manager? Is there any way to configure or control this? (10.0.2.2 is the gateway inside the Vagrant/VirtualBox machine.)

justincormack commented 4 years ago

This repo is not for Docker Swarm; you are looking for https://github.com/docker/swarmkit
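
For readers who land here with the same error: docker swarm join accepts --advertise-addr and --listen-addr, just as docker swarm init does. A plausible reading of the log above (not confirmed anywhere in this thread) is that the join was run without an advertise address, so the existing manager dialed back the NAT gateway address it observed (10.0.2.2) instead of the node's alias. The sketch below only builds and prints a join command with an explicit advertise address; it does not run Docker. The build_join_cmd helper is hypothetical, and 105.145.38.166 is an assumption inferred from the hostname in the shell prompt; substitute whatever address the other managers can actually reach.

```shell
#!/bin/sh
# Dry-run helper (hypothetical, not part of Docker): prints a manager join
# command that sets the advertise and listen addresses explicitly.
#   $1 = manager join token
#   $2 = address of THIS node as reachable from the other managers (assumption)
#   $3 = address:port of an existing manager
build_join_cmd() {
  printf 'docker swarm join --token %s --advertise-addr %s:2377 --listen-addr 0.0.0.0:2377 %s\n' \
    "$1" "$2" "$3"
}

# Example values: the token is a placeholder; 105.145.38.166 is assumed from
# the prompt hostname; 105.145.38.75:2377 is the manager address from the log.
build_join_cmd "MANAGER_JOIN_TOKEN" "105.145.38.166" "105.145.38.75:2377"
```

Review the printed command before running it on the joining node. Port 2377 on the advertised address must also be forwarded into the VM, otherwise the dial-back would still be refused.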