moby / swarmkit

A toolkit for orchestrating distributed systems at any scale. It includes primitives for node discovery, raft-based consensus, task scheduling and more.
Apache License 2.0

Docker swarm on 3 freshly installed VMs initialises properly but port 2377 does not open => workers or managers can't join the cluster #3157

Open cazacubogdan opened 10 months ago

cazacubogdan commented 10 months ago

Description

Architecture: 3 hosts running Proxmox; simple networking, not even VLANs. 3 VMs running Ubuntu 22.04.2 (one per Proxmox host). On any of the 3 VMs, if I initialise a Docker swarm, it initialises properly but port 2377 is not opened.

LE: no firewalls, no proxies, no IDS/IPS/Snort or sniffers, no WiFi; just a simple LAN with all hosts in one network. LE2: a Python web server starts fine on 2377 even while Docker is still running.

```
kuber@kuber-pl-01:~$ sudo netstat -tulpn
Active Internet connections (only servers)
Proto Recv-Q Send-Q Local Address           Foreign Address         State       PID/Program name
tcp        0      0 0.0.0.0:22              0.0.0.0:*               LISTEN      -
tcp        0      0 127.0.0.53:53           0.0.0.0:*               LISTEN      -
tcp        0      0 127.0.0.1:61209         0.0.0.0:*               LISTEN      -
tcp6       0      0 :::22                   :::*                    LISTEN      -
udp        0      0 127.0.0.53:53           0.0.0.0:*                           -
udp        0      0 192.168.0.30:68         0.0.0.0:*                           -
```

```
kuber@kuber-pl-02:/etc/systemd/system$ nc -vz 192.168.0.30 2377
nc: connect to 192.168.0.30 port 2377 (tcp) failed: Connection refused
```
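The `nc -vz` probe above is just a plain TCP connect. For reference, the same reachability check can be sketched in Python; the demo below runs against a local throwaway listener so it is self-contained (the real check would substitute the manager's address and 2377):

```python
import socket

def port_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connect to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

# Self-contained demo: open a local listener on an ephemeral port,
# confirm the probe sees it, then close it and confirm the refusal.
srv = socket.socket()
srv.bind(("127.0.0.1", 0))
srv.listen(1)
port = srv.getsockname()[1]
print(port_open("127.0.0.1", port))   # True while the listener is up
srv.close()
print(port_open("127.0.0.1", port))   # False once it is closed
```

"Connection refused", as seen here, means the host answered but nothing is listening on that port, which matches the netstat output above showing no listener on 2377.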

```
kuber@kuber-pl-01:~$ docker swarm init --listen-addr=192.168.0.30:2377
Error response from daemon: manager stopped: failed to listen on remote API address: listen tcp 192.168.0.30:2377: bind: cannot assign requested address
```
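"bind: cannot assign requested address" is the kernel's `EADDRNOTAVAIL`: the process tried to bind an IP that is not configured on any interface visible to it. A minimal sketch reproducing the same errno (192.0.2.1 is an RFC 5737 documentation address, assumed not to be configured on any local interface):

```python
import errno
import socket

def bind_result(addr: str, port: int = 0) -> str:
    """Try to bind a TCP socket to addr; report how it fails."""
    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        s.bind((addr, port))
        return "bound"
    except OSError as e:
        # "cannot assign requested address" corresponds to EADDRNOTAVAIL
        if e.errno == errno.EADDRNOTAVAIL:
            return "EADDRNOTAVAIL"
        return f"errno {e.errno}"
    finally:
        s.close()

print(bind_result("127.0.0.1"))  # loopback always exists, so this binds
print(bind_result("192.0.2.1"))  # documentation address, not local -> EADDRNOTAVAIL
```

So whichever context the daemon binds in here evidently does not own 192.168.0.30 as a local address, even though the VM's LAN interface does.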

But! If I try with:

```
kuber@kuber-pl-01:~$ docker swarm init --advertise-addr=192.168.0.30:2377
Swarm initialized: current node (s9s7qjsddyr8015ionvdvzzdt) is now a manager.
```

To add a worker to this swarm, run the following command:

```
docker swarm join --token SWMTKN-1-496gjmns84wrz9mz7zw7qwoygrk7isw1am3xipj6oe0lvamtqy-d605pswgvnox1bzx48cahapxq 192.168.0.30:2377
```

To add a manager to this swarm, run 'docker swarm join-token manager' and follow the instructions.

Results on any of the VMs:

```
kuber@kuber-pl-03:~$ docker swarm join --token SWMTKN-1-496gjmns84wrz9mz7zw7qwoygrk7isw1am3xipj6oe0lvamtqy-b4dxahq23o422zzlzlqw6g5qz 192.168.0.30:2377
Error response from daemon: rpc error: code = Unavailable desc = connection error: desc = "transport: Error while dialing dial tcp 192.168.0.30:2377: connect: connection refused"
```

```
kuber@kuber-pl-01:~$ docker node ls
ID                            HOSTNAME      STATUS    AVAILABILITY   MANAGER STATUS   ENGINE VERSION
s9s7qjsddyr8015ionvdvzzdt *   kuber-pl-01   Ready     Active         Leader           24.0.7
```

Same Docker version on all VMs.

Reproduce

See above.

Expected behavior

Completion of adding a manager to the swarm.

docker version

```
kuber@kuber-pl-01:~$ docker version
Client: Docker Engine - Community
 Version:           24.0.7
 API version:       1.43
 Go version:        go1.20.10
 Git commit:        afdd53b
 Built:             Thu Oct 26 09:07:41 2023
 OS/Arch:           linux/amd64
 Context:           rootless

Server: Docker Engine - Community
 Engine:
  Version:          24.0.7
  API version:      1.43 (minimum version 1.12)
  Go version:       go1.20.10
  Git commit:       311b9ff
  Built:            Thu Oct 26 09:07:41 2023
  OS/Arch:          linux/amd64
  Experimental:     false
 containerd:
  Version:          1.6.24
  GitCommit:        61f9fd88f79f081d64d6fa3bb1a0dc71ec870523
 runc:
  Version:          1.1.9
  GitCommit:        v1.1.9-0-gccaecfc
 docker-init:
  Version:          0.19.0
  GitCommit:        de40ad0
 rootlesskit:
  Version:          1.1.1
  ApiVersion:       1.1.1
  NetworkDriver:    slirp4netns
  PortDriver:       builtin
  StateDir:         /tmp/rootlesskit1430401958
 slirp4netns:
  Version:          1.0.1
  GitCommit:        6a7b16babc95b6a3056b33fb45b74a6f62262dd4
```

docker info

```
kuber@kuber-pl-01:~$ docker info
Client: Docker Engine - Community
 Version:    24.0.7
 Context:    rootless
 Debug Mode: false
 Plugins:
  buildx: Docker Buildx (Docker Inc.)
   Version:  v0.11.2
   Path:     /usr/libexec/docker/cli-plugins/docker-buildx
  compose: Docker Compose (Docker Inc.)
   Version:  v2.21.0
   Path:     /usr/libexec/docker/cli-plugins/docker-compose

Server:
 Containers: 0
  Running: 0
  Paused: 0
  Stopped: 0
 Images: 0
 Server Version: 24.0.7
 Storage Driver: overlay2
  Backing Filesystem: extfs
  Supports d_type: true
  Using metacopy: false
  Native Overlay Diff: false
  userxattr: true
 Logging Driver: json-file
 Cgroup Driver: systemd
 Cgroup Version: 2
 Plugins:
  Volume: local
  Network: bridge host ipvlan macvlan null overlay
  Log: awslogs fluentd gcplogs gelf journald json-file local logentries splunk syslog
 Swarm: active
  NodeID: s9s7qjsddyr8015ionvdvzzdt
  Is Manager: true
  ClusterID: fy0um5tvoqif1j81e4p8ind2e
  Managers: 1
  Nodes: 1
  Default Address Pool: 10.0.0.0/8
  SubnetSize: 24
  Data Path Port: 4789
  Orchestration:
   Task History Retention Limit: 5
  Raft:
   Snapshot Interval: 10000
   Number of Old Snapshots to Retain: 0
   Heartbeat Tick: 1
   Election Tick: 10
  Dispatcher:
   Heartbeat Period: 5 seconds
  CA Configuration:
   Expiry Duration: 3 months
   Force Rotate: 0
  Autolock Managers: false
  Root Rotation In Progress: false
  Node Address: 192.168.0.30
  Manager Addresses:
   192.168.0.30:2377
 Runtimes: io.containerd.runc.v2 runc
 Default Runtime: runc
 Init Binary: docker-init
 containerd version: 61f9fd88f79f081d64d6fa3bb1a0dc71ec870523
 runc version: v1.1.9-0-gccaecfc
 init version: de40ad0
 Security Options:
  seccomp
   Profile: builtin
  rootless
  cgroupns
 Kernel Version: 5.15.0-88-generic
 Operating System: Ubuntu 22.04.3 LTS
 OSType: linux
 Architecture: x86_64
 CPUs: 4
 Total Memory: 4.394GiB
 Name: kuber-pl-01
 ID: 4c21e420-eb02-4288-8347-1afc5fe6663e
 Docker Root Dir: /home/kuber/.local/share/docker
 Debug Mode: false
 Experimental: false
 Insecure Registries:
  127.0.0.0/8
 Live Restore Enabled: false

WARNING: No cpu cfs quota support
WARNING: No cpu cfs period support
WARNING: No cpu shares support
WARNING: No cpuset support
WARNING: No io.weight support
WARNING: No io.weight (per device) support
WARNING: No io.max (rbps) support
WARNING: No io.max (wbps) support
WARNING: No io.max (riops) support
WARNING: No io.max (wiops) support
```

Diagnostics ID lll

Additional Info

```
kuber@kuber-pl-01:/opt/containerd$ sudo service ufw status
○ ufw.service - Uncomplicated firewall
     Loaded: loaded (/lib/systemd/system/ufw.service; enabled; vendor preset: enabled)
     Active: inactive (dead) since Wed 2023-11-08 18:23:13 UTC; 27min ago
       Docs: man:ufw(8)
   Main PID: 621 (code=exited, status=0/SUCCESS)
        CPU: 2ms

Nov 08 18:16:29 kuber-pl-01 systemd[1]: Starting Uncomplicated firewall...
Nov 08 18:16:29 kuber-pl-01 systemd[1]: Finished Uncomplicated firewall.
Nov 08 18:23:13 kuber-pl-01 systemd[1]: Stopping Uncomplicated firewall...
Nov 08 18:23:13 kuber-pl-01 ufw-init[2570]: Skip stopping firewall: ufw (not enabled)
Nov 08 18:23:13 kuber-pl-01 systemd[1]: ufw.service: Deactivated successfully.
Nov 08 18:23:13 kuber-pl-01 systemd[1]: Stopped Uncomplicated firewall.
```

So it's not the firewall either...

cazacubogdan commented 9 months ago

Update: the issue is present only on Ubuntu 22.04.2; 22.04 and 20.04 don't have this issue.

s4ke commented 8 months ago

Which base image are you using? I think I recently saw some people having issues with a KVM-specific image on Proxmox with Swarm.

cazacubogdan commented 7 months ago

I'm using the server install image from Ubuntu...