moby / moby

The Moby Project - a collaborative project for the container ecosystem to assemble container-based systems
https://mobyproject.org/
Apache License 2.0

Swarm: "failed to remove node" and "error creating cluster object" on daemon stop/start on 18.09.0 #38189

Open nscheer opened 5 years ago

nscheer commented 5 years ago

After a fresh install of 18.09.0 on a CentOS 7.5 box, the following errors appear in the log:

On daemon stop:

Nov 12 11:21:48 zbox.home.scheer.it systemd[1]: Stopping Docker Application Container Engine...
Nov 12 11:21:48 zbox.home.scheer.it dockerd[14665]: time="2018-11-12T11:21:48.290521093+01:00" level=error msg="failed to remove node" error="rpc error: code = Aborted desc = dispatcher is stopped" method="(*Dispatcher).Session" node.id=uenu4ti6d2kblu64mj7c5wzd8 node.session=p1jtrug9ak8syomkpt8o70xg9
Nov 12 11:21:48 zbox.home.scheer.it dockerd[14665]: time="2018-11-12T11:21:48.291171406+01:00" level=warning msg="grpc: addrConn.createTransport failed to connect to {/var/run/docker/swarm/control.sock 0  <nil>}. Err :connection error: desc = \"transport: Error while dialing dial unix /var/run/docker/swarm/control.sock: connect: no such file or directory\". Reconnecting..." module=grpc
Nov 12 11:21:48 zbox.home.scheer.it dockerd[14665]: time="2018-11-12T11:21:48.295434341+01:00" level=warning msg="grpc: addrConn.transportMonitor exits due to: grpc: the connection is closing" module=grpc
Nov 12 11:21:48 zbox.home.scheer.it dockerd[14665]: time="2018-11-12T11:21:48.295171450+01:00" level=error msg="failed to receive changes from store watch API" error="rpc error: code = Unknown desc = context canceled"
Nov 12 11:21:48 zbox.home.scheer.it dockerd[14665]: time="2018-11-12T11:21:48.514692693+01:00" level=warning msg="Error (Unable to complete atomic operation, key modified) deleting object [endpoint iie3upm3zov7wvipz23wv6x6n 2f1560fa4e909d2a89ac80882ba4e37471c0735ed174086bcafc2dda7e739f96], retrying...."
Nov 12 11:21:48 zbox.home.scheer.it systemd[1]: Stopped Docker Application Container Engine.

On daemon start:

Nov 12 11:22:04 zbox.home.scheer.it systemd[1]: Starting Docker Application Container Engine...
Nov 12 11:22:06 zbox.home.scheer.it dockerd[14910]: time="2018-11-12T11:22:06.204174317+01:00" level=error msg="error creating cluster object" error="name conflicts with an existing object" module=node node.id=uenu4ti6d2kblu64mj7c5wzd8
Nov 12 11:22:06 zbox.home.scheer.it systemd[1]: Started Docker Application Container Engine.

Steps to reproduce the issue:

  1. Fresh install of Docker 18.09.0 from the yum repo for CentOS
  2. systemctl start docker
  3. docker swarm init
  4. systemctl stop docker
  5. systemctl start docker
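
For repeated runs, steps 2 to 5 above can be scripted. This is a minimal sketch, assuming a systemd host with Docker already installed and sufficient privileges; `REPRO_STEPS`, `run_repro`, and the `runner` parameter are hypothetical names chosen for illustration, not part of any Docker tooling.

```python
import subprocess

# Steps 2-5 from the list above, as argument vectors.
REPRO_STEPS = [
    ["systemctl", "start", "docker"],
    ["docker", "swarm", "init"],
    ["systemctl", "stop", "docker"],
    ["systemctl", "start", "docker"],
]


def run_repro(runner=subprocess.run):
    """Run each step in order, stopping at the first failing command.

    `runner` is injectable (an assumption of this sketch) so the sequence
    can be dry-run without touching the host.
    """
    for cmd in REPRO_STEPS:
        runner(cmd, check=True)


# Dry run: print the commands instead of executing them.
run_repro(runner=lambda cmd, check: print(" ".join(cmd)))
```

Calling `run_repro()` with the default runner performs the actual stop/start cycle; the daemon's messages can then be read with `journalctl -u docker`.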

Describe the results you received:

Error messages as depicted above.

Describe the results you expected:

No errors :)

Additional information you deem important (e.g. issue happens only occasionally):

Output of docker version:

Client:
 Version:           18.09.0
 API version:       1.39
 Go version:        go1.10.4
 Git commit:        4d60db4
 Built:             Wed Nov  7 00:48:22 2018
 OS/Arch:           linux/amd64
 Experimental:      false

Server: Docker Engine - Community
 Engine:
  Version:          18.09.0
  API version:      1.39 (minimum version 1.12)
  Go version:       go1.10.4
  Git commit:       4d60db4
  Built:            Wed Nov  7 00:19:08 2018
  OS/Arch:          linux
 Experimental:      false

Output of docker info:

Containers: 1
 Running: 1
 Paused: 0
 Stopped: 0
Images: 1
Server Version: 18.09.0
Storage Driver: overlay2
 Backing Filesystem: xfs
 Supports d_type: true
 Native Overlay Diff: true
Logging Driver: json-file
Cgroup Driver: cgroupfs
Plugins:
 Volume: local
 Network: bridge host macvlan null overlay
 Log: awslogs fluentd gcplogs gelf journald json-file local logentries splunk syslog
Swarm: active
 NodeID: uenu4ti6d2kblu64mj7c5wzd8
 Is Manager: true
 ClusterID: rumqgnxxukr3qfk7t9sg58iz1
 Managers: 1
 Nodes: 1
 Default Address Pool: 10.0.0.0/8
 SubnetSize: 24
 Orchestration:
  Task History Retention Limit: 5
 Raft:
  Snapshot Interval: 10000
  Number of Old Snapshots to Retain: 0
  Heartbeat Tick: 1
  Election Tick: 10
 Dispatcher:
  Heartbeat Period: 5 seconds
 CA Configuration:
  Expiry Duration: 3 months
  Force Rotate: 0
 Autolock Managers: false
 Root Rotation In Progress: false
 Node Address: 10.0.0.8
 Manager Addresses:
  10.0.0.8:2377
Runtimes: runc
Default Runtime: runc
Init Binary: docker-init
containerd version: d6de12e2f362cb9dc49ad957911996d3de59b338
runc version: 4fc53a81fb7c994640722ac585fa9ca548971871
init version: fec3683
Security Options:
 seccomp
  Profile: default
Kernel Version: 3.10.0-862.14.4.el7.x86_64
Operating System: CentOS Linux 7 (Core)
OSType: linux
Architecture: x86_64
CPUs: 2
Total Memory: 7.564GiB
Name: zbox.home.scheer.it
ID: 2ARX:ECAY:USKX:PWG3:4XDU:V564:Z5D4:ZVCY:VNIV:3ETA:Q3JX:7Y7P
Docker Root Dir: /var/lib/docker
Debug Mode (client): false
Debug Mode (server): false
Registry: https://index.docker.io/v1/
Labels:
Experimental: false
Insecure Registries:
 127.0.0.0/8
Live Restore Enabled: false
Product License: Community Engine

Note: These errors are emitted on every start/stop, not just once.

thaJeztah commented 5 years ago

Hm; feels like a race condition: the SwarmKit API shuts down before the daemon does (so some cleanup doesn't complete), and isn't up yet when the daemon starts (so the daemon attempts to create the object, then discovers it already exists).

Reading this from my phone, but I'll give it a try and see if I can reproduce with your steps.

Thanks for reporting!
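
The ordering hypothesis in the comment above can be illustrated with a toy model. This is only a sketch of the hypothesis, not SwarmKit code: `ToyStore`, `shutdown`, `startup`, and the `api_first` / `api_ready` flags are all hypothetical, and whether the real components behave this way is not established in this thread. The error strings are copied from the logs purely to make the mapping visible.

```python
class ToyStore:
    """Toy model only; none of these names exist in SwarmKit."""

    def __init__(self):
        self.objects = set()      # state that survives a daemon restart
        self.api_running = True

    def remove(self, name):
        if not self.api_running:
            raise RuntimeError("dispatcher is stopped")
        self.objects.discard(name)

    def create(self, name):
        if name in self.objects:
            raise RuntimeError("name conflicts with an existing object")
        self.objects.add(name)


def shutdown(store, node, api_first):
    """Return the error a shutdown in the given order would log, or None."""
    try:
        if api_first:
            store.api_running = False   # API stops before cleanup runs
            store.remove(node)
        else:
            store.remove(node)          # cleanup runs while the API is up
            store.api_running = False
    except RuntimeError as err:
        return str(err)
    return None


def startup(store, name, api_ready):
    """Return the error a start would log, or None."""
    store.api_running = True
    if api_ready and name in store.objects:
        return None                     # existence check succeeded: skip create
    try:
        store.create(name)              # blind create when the check is unavailable
    except RuntimeError as err:
        return str(err)
    return None
```

In this model, `shutdown(..., api_first=True)` yields the "dispatcher is stopped" message, and `startup(..., api_ready=False)` on a store that already holds the object yields the "name conflicts" message; with the opposite ordering both return `None`.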

thaJeztah commented 5 years ago

Confirmed that I see the same errors/warnings.

I'd have to check with the SwarmKit team to see if these are actual issues, or just a red herring.

Nov 12 20:29:19 centos-test dockerd: time="2018-11-12T20:29:19.412163426Z" level=error msg="error reading the kernel parameter net.ipv4.vs.expire_nodest_conn" error="open /proc/sys/net/ipv4/vs/expire_nodest_conn: no such file or directory"
Nov 12 20:29:27 centos-test dockerd: time="2018-11-12T20:29:27.619515003Z" level=error msg="agent: session failed" backoff=100ms error="context canceled" module=node/agent node.id=q3h8ro34wxykrdehisqxepicb
Nov 12 20:29:27 centos-test dockerd: time="2018-11-12T20:29:27.619715427Z" level=error msg="failed to remove node" error="rpc error: code = Aborted desc = dispatcher is stopped" method="(*Dispatcher).Session" node.id=q3h8ro34wxykrdehisqxepicb node.session=gg89p0ragegtuby78jqpyf51r
Nov 12 20:29:27 centos-test dockerd: time="2018-11-12T20:29:27.620426607Z" level=error msg="failed to receive changes from store watch API" error="rpc error: code = Unknown desc = context canceled"
Nov 12 20:29:27 centos-test dockerd: time="2018-11-12T20:29:27.620700335Z" level=warning msg="grpc: addrConn.createTransport failed to connect to {/var/run/docker/swarm/control.sock 0  <nil>}. Err :connection error: desc = \"transport: Error while dialing dial unix /var/run/docker/swarm/control.sock: connect: no such file or directory\". Reconnecting..." module=grpc
Nov 12 20:29:27 centos-test dockerd: time="2018-11-12T20:29:27.621016081Z" level=warning msg="grpc: addrConn.transportMonitor exits due to: context canceled" module=grpc
Nov 12 20:29:27 centos-test dockerd: time="2018-11-12T20:29:27.750759696Z" level=warning msg="Error (Unable to complete atomic operation, key modified) deleting object [endpoint abj6s1gt9dqvd15ouqnxmaqdu 1273fcc5aa566ebe4410a67c8e25e27e9e5c0c9e5539d112706bb552bbe6eb9e], retrying...."
Nov 12 20:29:32 centos-test dockerd: time="2018-11-12T20:29:32.254580485Z" level=error msg="error creating cluster object" error="name conflicts with an existing object" module=node node.id=q3h8ro34wxykrdehisqxepicb
Nov 12 20:29:32 centos-test dockerd: time="2018-11-12T20:29:32.373213186Z" level=warning msg="Could not register builder git source: failed to find git binary: exec: \"git\": executable file not found in $PATH"
Nov 12 20:29:32 centos-test dockerd: time="2018-11-12T20:29:32.491401983Z" level=error msg="error reading the kernel parameter net.ipv4.vs.expire_nodest_conn" error="open /proc/sys/net/ipv4/vs/expire_nodest_conn: no such file or directory"
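
An excerpt like the one above, keeping only `level=error` and `level=warning` entries, can be pulled from a longer daemon log with a small filter. This is a minimal sketch, assuming the `key=value` line layout seen in these logs with properly escaped quotes; `parse_fields` and `problems` are hypothetical helper names, and lines whose inner quotes are not escaped may parse only partially.

```python
import re

# key=value pairs where the value is either a double-quoted string
# (with backslash escapes) or a bare token without spaces.
_FIELD = re.compile(r'([\w.]+)=("(?:[^"\\]|\\.)*"|\S+)')


def parse_fields(line):
    """Return the key=value fields of one log line as a dict."""
    fields = {}
    for key, raw in _FIELD.findall(line):
        if raw.startswith('"'):
            # Strip the surrounding quotes, then unescape \" and \\.
            raw = re.sub(r'\\(["\\])', r'\1', raw[1:-1])
        fields[key] = raw
    return fields


def problems(lines, levels=("error", "warning")):
    """Yield (level, msg) for entries whose level is in `levels`."""
    for line in lines:
        fields = parse_fields(line)
        if fields.get("level") in levels:
            yield fields["level"], fields.get("msg", "")
```

The input can be any iterable of lines, for example the text produced by `journalctl -u docker --no-pager` saved to a file and opened for reading.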

Full logs below:

``` Nov 12 20:28:34 centos-test dockerd: time="2018-11-12T20:28:34.080851933Z" level=info msg="libcontainerd: started new containerd process" pid=9764 Nov 12 20:28:34 centos-test dockerd: time="2018-11-12T20:28:34.081273245Z" level=info msg="parsed scheme: \"unix\"" module=grpc Nov 12 20:28:34 centos-test dockerd: time="2018-11-12T20:28:34.081285715Z" level=info msg="scheme \"unix\" not registered, fallback to default scheme" module=grpc Nov 12 20:28:34 centos-test dockerd: time="2018-11-12T20:28:34.081336229Z" level=info msg="ccResolverWrapper: sending new addresses to cc: [{unix:///var/run/docker/containerd/containerd.sock 0 }]" module=grpc Nov 12 20:28:34 centos-test dockerd: time="2018-11-12T20:28:34.081361018Z" level=info msg="ClientConn switching balancer to \"pick_first\"" module=grpc Nov 12 20:28:34 centos-test dockerd: time="2018-11-12T20:28:34.081417701Z" level=info msg="pickfirstBalancer: HandleSubConnStateChange: 0xc42070d040, CONNECTING" module=grpc Nov 12 20:28:34 centos-test dockerd: time="2018-11-12T20:28:34.123925846Z" level=info msg="starting containerd" revision=c4446665cb9c30056f4998ed953e6d4ff22c7c39 version=1.2.0 Nov 12 20:28:34 centos-test dockerd: time="2018-11-12T20:28:34.124606595Z" level=info msg="loading plugin "io.containerd.content.v1.content"..." type=io.containerd.content.v1 Nov 12 20:28:34 centos-test dockerd: time="2018-11-12T20:28:34.124746058Z" level=info msg="loading plugin "io.containerd.snapshotter.v1.btrfs"..." type=io.containerd.snapshotter.v1 Nov 12 20:28:34 centos-test dockerd: time="2018-11-12T20:28:34.125105006Z" level=warning msg="failed to load plugin io.containerd.snapshotter.v1.btrfs" error="path /var/lib/docker/containerd/daemon/io.containerd.snapshotter.v1.btrfs must be a btrfs filesystem to be used with the btrfs snapshotter" Nov 12 20:28:34 centos-test dockerd: time="2018-11-12T20:28:34.125130817Z" level=info msg="loading plugin "io.containerd.snapshotter.v1.aufs"..." 
type=io.containerd.snapshotter.v1 Nov 12 20:28:34 centos-test dockerd: time="2018-11-12T20:28:34.129771285Z" level=warning msg="failed to load plugin io.containerd.snapshotter.v1.aufs" error="modprobe aufs failed: "modprobe: FATAL: Module aufs not found.\n": exit status 1" Nov 12 20:28:34 centos-test dockerd: time="2018-11-12T20:28:34.129788247Z" level=info msg="loading plugin "io.containerd.snapshotter.v1.native"..." type=io.containerd.snapshotter.v1 Nov 12 20:28:34 centos-test dockerd: time="2018-11-12T20:28:34.129894930Z" level=info msg="loading plugin "io.containerd.snapshotter.v1.overlayfs"..." type=io.containerd.snapshotter.v1 Nov 12 20:28:34 centos-test dockerd: time="2018-11-12T20:28:34.130092784Z" level=info msg="loading plugin "io.containerd.snapshotter.v1.zfs"..." type=io.containerd.snapshotter.v1 Nov 12 20:28:34 centos-test dockerd: time="2018-11-12T20:28:34.130240226Z" level=warning msg="failed to load plugin io.containerd.snapshotter.v1.zfs" error="path /var/lib/docker/containerd/daemon/io.containerd.snapshotter.v1.zfs must be a zfs filesystem to be used with the zfs snapshotter" Nov 12 20:28:34 centos-test dockerd: time="2018-11-12T20:28:34.130248906Z" level=info msg="loading plugin "io.containerd.metadata.v1.bolt"..." 
type=io.containerd.metadata.v1 Nov 12 20:28:34 centos-test dockerd: time="2018-11-12T20:28:34.130283811Z" level=warning msg="could not use snapshotter zfs in metadata plugin" error="path /var/lib/docker/containerd/daemon/io.containerd.snapshotter.v1.zfs must be a zfs filesystem to be used with the zfs snapshotter" Nov 12 20:28:34 centos-test dockerd: time="2018-11-12T20:28:34.130289698Z" level=warning msg="could not use snapshotter btrfs in metadata plugin" error="path /var/lib/docker/containerd/daemon/io.containerd.snapshotter.v1.btrfs must be a btrfs filesystem to be used with the btrfs snapshotter" Nov 12 20:28:34 centos-test dockerd: time="2018-11-12T20:28:34.130295552Z" level=warning msg="could not use snapshotter aufs in metadata plugin" error="modprobe aufs failed: "modprobe: FATAL: Module aufs not found.\n": exit status 1" Nov 12 20:28:34 centos-test dockerd: time="2018-11-12T20:28:34.136756639Z" level=info msg="loading plugin "io.containerd.differ.v1.walking"..." type=io.containerd.differ.v1 Nov 12 20:28:34 centos-test dockerd: time="2018-11-12T20:28:34.136786870Z" level=info msg="loading plugin "io.containerd.gc.v1.scheduler"..." type=io.containerd.gc.v1 Nov 12 20:28:34 centos-test dockerd: time="2018-11-12T20:28:34.137313764Z" level=info msg="loading plugin "io.containerd.service.v1.containers-service"..." type=io.containerd.service.v1 Nov 12 20:28:34 centos-test dockerd: time="2018-11-12T20:28:34.137338671Z" level=info msg="loading plugin "io.containerd.service.v1.content-service"..." type=io.containerd.service.v1 Nov 12 20:28:34 centos-test dockerd: time="2018-11-12T20:28:34.137350070Z" level=info msg="loading plugin "io.containerd.service.v1.diff-service"..." type=io.containerd.service.v1 Nov 12 20:28:34 centos-test dockerd: time="2018-11-12T20:28:34.137363883Z" level=info msg="loading plugin "io.containerd.service.v1.images-service"..." 
type=io.containerd.service.v1 Nov 12 20:28:34 centos-test dockerd: time="2018-11-12T20:28:34.137378915Z" level=info msg="loading plugin "io.containerd.service.v1.leases-service"..." type=io.containerd.service.v1 Nov 12 20:28:34 centos-test dockerd: time="2018-11-12T20:28:34.137392834Z" level=info msg="loading plugin "io.containerd.service.v1.namespaces-service"..." type=io.containerd.service.v1 Nov 12 20:28:34 centos-test dockerd: time="2018-11-12T20:28:34.137412339Z" level=info msg="loading plugin "io.containerd.service.v1.snapshots-service"..." type=io.containerd.service.v1 Nov 12 20:28:34 centos-test dockerd: time="2018-11-12T20:28:34.137424818Z" level=info msg="loading plugin "io.containerd.runtime.v1.linux"..." type=io.containerd.runtime.v1 Nov 12 20:28:34 centos-test dockerd: time="2018-11-12T20:28:34.137620955Z" level=info msg="loading plugin "io.containerd.runtime.v2.task"..." type=io.containerd.runtime.v2 Nov 12 20:28:34 centos-test dockerd: time="2018-11-12T20:28:34.137724985Z" level=info msg="loading plugin "io.containerd.monitor.v1.cgroups"..." type=io.containerd.monitor.v1 Nov 12 20:28:34 centos-test dockerd: time="2018-11-12T20:28:34.138123698Z" level=info msg="loading plugin "io.containerd.service.v1.tasks-service"..." type=io.containerd.service.v1 Nov 12 20:28:34 centos-test dockerd: time="2018-11-12T20:28:34.138143596Z" level=info msg="loading plugin "io.containerd.internal.v1.restart"..." type=io.containerd.internal.v1 Nov 12 20:28:34 centos-test dockerd: time="2018-11-12T20:28:34.138196384Z" level=info msg="loading plugin "io.containerd.grpc.v1.containers"..." type=io.containerd.grpc.v1 Nov 12 20:28:34 centos-test dockerd: time="2018-11-12T20:28:34.138208787Z" level=info msg="loading plugin "io.containerd.grpc.v1.content"..." type=io.containerd.grpc.v1 Nov 12 20:28:34 centos-test dockerd: time="2018-11-12T20:28:34.138219251Z" level=info msg="loading plugin "io.containerd.grpc.v1.diff"..." 
type=io.containerd.grpc.v1 Nov 12 20:28:34 centos-test dockerd: time="2018-11-12T20:28:34.138228895Z" level=info msg="loading plugin "io.containerd.grpc.v1.events"..." type=io.containerd.grpc.v1 Nov 12 20:28:34 centos-test dockerd: time="2018-11-12T20:28:34.138240281Z" level=info msg="loading plugin "io.containerd.grpc.v1.healthcheck"..." type=io.containerd.grpc.v1 Nov 12 20:28:34 centos-test dockerd: time="2018-11-12T20:28:34.138250491Z" level=info msg="loading plugin "io.containerd.grpc.v1.images"..." type=io.containerd.grpc.v1 Nov 12 20:28:34 centos-test dockerd: time="2018-11-12T20:28:34.138259926Z" level=info msg="loading plugin "io.containerd.grpc.v1.leases"..." type=io.containerd.grpc.v1 Nov 12 20:28:34 centos-test dockerd: time="2018-11-12T20:28:34.138269506Z" level=info msg="loading plugin "io.containerd.grpc.v1.namespaces"..." type=io.containerd.grpc.v1 Nov 12 20:28:34 centos-test dockerd: time="2018-11-12T20:28:34.138278784Z" level=info msg="loading plugin "io.containerd.internal.v1.opt"..." type=io.containerd.internal.v1 Nov 12 20:28:34 centos-test dockerd: time="2018-11-12T20:28:34.138305687Z" level=info msg="loading plugin "io.containerd.grpc.v1.snapshots"..." type=io.containerd.grpc.v1 Nov 12 20:28:34 centos-test dockerd: time="2018-11-12T20:28:34.138319289Z" level=info msg="loading plugin "io.containerd.grpc.v1.tasks"..." type=io.containerd.grpc.v1 Nov 12 20:28:34 centos-test dockerd: time="2018-11-12T20:28:34.138329113Z" level=info msg="loading plugin "io.containerd.grpc.v1.version"..." type=io.containerd.grpc.v1 Nov 12 20:28:34 centos-test dockerd: time="2018-11-12T20:28:34.138338632Z" level=info msg="loading plugin "io.containerd.grpc.v1.introspection"..." type=io.containerd.grpc.v1 Nov 12 20:28:34 centos-test dockerd: time="2018-11-12T20:28:34.138500562Z" level=info msg=serving... address="/var/run/docker/containerd/containerd-debug.sock" Nov 12 20:28:34 centos-test dockerd: time="2018-11-12T20:28:34.138545887Z" level=info msg=serving... 
address="/var/run/docker/containerd/containerd.sock" Nov 12 20:28:34 centos-test dockerd: time="2018-11-12T20:28:34.138555424Z" level=info msg="containerd successfully booted in 0.015747s" Nov 12 20:28:34 centos-test dockerd: time="2018-11-12T20:28:34.140615179Z" level=info msg="pickfirstBalancer: HandleSubConnStateChange: 0xc42070d040, READY" module=grpc Nov 12 20:28:34 centos-test dockerd: time="2018-11-12T20:28:34.153589480Z" level=info msg="parsed scheme: \"unix\"" module=grpc Nov 12 20:28:34 centos-test dockerd: time="2018-11-12T20:28:34.153605490Z" level=info msg="scheme \"unix\" not registered, fallback to default scheme" module=grpc Nov 12 20:28:34 centos-test dockerd: time="2018-11-12T20:28:34.153694446Z" level=info msg="parsed scheme: \"unix\"" module=grpc Nov 12 20:28:34 centos-test dockerd: time="2018-11-12T20:28:34.153701448Z" level=info msg="scheme \"unix\" not registered, fallback to default scheme" module=grpc Nov 12 20:28:34 centos-test dockerd: time="2018-11-12T20:28:34.153694408Z" level=info msg="ccResolverWrapper: sending new addresses to cc: [{unix:///var/run/docker/containerd/containerd.sock 0 }]" module=grpc Nov 12 20:28:34 centos-test dockerd: time="2018-11-12T20:28:34.153714395Z" level=info msg="ClientConn switching balancer to \"pick_first\"" module=grpc Nov 12 20:28:34 centos-test dockerd: time="2018-11-12T20:28:34.153746991Z" level=info msg="pickfirstBalancer: HandleSubConnStateChange: 0xc42070d2b0, CONNECTING" module=grpc Nov 12 20:28:34 centos-test dockerd: time="2018-11-12T20:28:34.153864196Z" level=info msg="pickfirstBalancer: HandleSubConnStateChange: 0xc42070d2b0, READY" module=grpc Nov 12 20:28:34 centos-test dockerd: time="2018-11-12T20:28:34.153907107Z" level=info msg="ccResolverWrapper: sending new addresses to cc: [{unix:///var/run/docker/containerd/containerd.sock 0 }]" module=grpc Nov 12 20:28:34 centos-test dockerd: time="2018-11-12T20:28:34.153916315Z" level=info msg="ClientConn switching balancer to \"pick_first\"" 
module=grpc Nov 12 20:28:34 centos-test dockerd: time="2018-11-12T20:28:34.153933991Z" level=info msg="pickfirstBalancer: HandleSubConnStateChange: 0xc42070d590, CONNECTING" module=grpc Nov 12 20:28:34 centos-test dockerd: time="2018-11-12T20:28:34.154027225Z" level=info msg="pickfirstBalancer: HandleSubConnStateChange: 0xc42070d590, READY" module=grpc Nov 12 20:28:34 centos-test dockerd: time="2018-11-12T20:28:34.188711378Z" level=info msg="Graph migration to content-addressability took 0.00 seconds" Nov 12 20:28:34 centos-test dockerd: time="2018-11-12T20:28:34.189581020Z" level=info msg="Loading containers: start." Nov 12 20:28:34 centos-test dockerd: time="2018-11-12T20:28:34.343610605Z" level=info msg="Default bridge (docker0) is assigned with an IP address 172.17.0.0/16. Daemon option --bip can be used to set a preferred IP address" Nov 12 20:28:34 centos-test dockerd: time="2018-11-12T20:28:34.397317940Z" level=info msg="Loading containers: done." Nov 12 20:28:34 centos-test dockerd: time="2018-11-12T20:28:34.420308255Z" level=info msg="Docker daemon" commit=4d60db4 graphdriver(s)=overlay2 version=18.09.0 Nov 12 20:28:34 centos-test dockerd: time="2018-11-12T20:28:34.420503507Z" level=info msg="Daemon has completed initialization" Nov 12 20:28:34 centos-test dockerd: time="2018-11-12T20:28:34.434554358Z" level=warning msg="Could not register builder git source: failed to find git binary: exec: \"git\": executable file not found in $PATH" Nov 12 20:28:34 centos-test dockerd: time="2018-11-12T20:28:34.443203696Z" level=info msg="API listen on /var/run/docker.sock" Nov 12 20:28:58 centos-test dockerd: time="2018-11-12T20:28:58.965358192Z" level=error msg="Error initializing swarm: could not choose an IP address to advertise since this system has multiple addresses on interface eth0 (142.93.139.68 and 10.18.0.10)" Nov 12 20:29:18 centos-test dockerd: time="2018-11-12T20:29:18.607754106Z" level=info msg="parsed scheme: \"\"" module=grpc Nov 12 20:29:18 
centos-test dockerd: time="2018-11-12T20:29:18.607907134Z" level=info msg="scheme \"\" not registered, fallback to default scheme" module=grpc Nov 12 20:29:18 centos-test dockerd: time="2018-11-12T20:29:18.608054184Z" level=info msg="ccResolverWrapper: sending new addresses to cc: [{/var/run/docker/swarm/control.sock 0 }]" module=grpc Nov 12 20:29:18 centos-test dockerd: time="2018-11-12T20:29:18.608089507Z" level=info msg="ClientConn switching balancer to \"pick_first\"" module=grpc Nov 12 20:29:18 centos-test dockerd: time="2018-11-12T20:29:18.608244869Z" level=info msg="pickfirstBalancer: HandleSubConnStateChange: 0xc4201f2410, CONNECTING" module=grpc Nov 12 20:29:18 centos-test dockerd: time="2018-11-12T20:29:18.611830676Z" level=info msg="Listening for connections" addr="[::]:2377" module=node node.id=q3h8ro34wxykrdehisqxepicb proto=tcp Nov 12 20:29:18 centos-test dockerd: time="2018-11-12T20:29:18.611931139Z" level=info msg="Listening for local connections" addr=/var/run/docker/swarm/control.sock module=node node.id=q3h8ro34wxykrdehisqxepicb proto=unix Nov 12 20:29:18 centos-test dockerd: time="2018-11-12T20:29:18.613287229Z" level=info msg="pickfirstBalancer: HandleSubConnStateChange: 0xc4201f2410, READY" module=grpc Nov 12 20:29:18 centos-test dockerd: time="2018-11-12T20:29:18.629587184Z" level=info msg="42d4f3e764b86479 became follower at term 0" module=raft node.id=q3h8ro34wxykrdehisqxepicb Nov 12 20:29:18 centos-test dockerd: time="2018-11-12T20:29:18.629698193Z" level=info msg="newRaft 42d4f3e764b86479 [peers: [], term: 0, commit: 0, applied: 0, lastindex: 0, lastterm: 0]" module=raft node.id=q3h8ro34wxykrdehisqxepicb Nov 12 20:29:18 centos-test dockerd: time="2018-11-12T20:29:18.629728578Z" level=info msg="42d4f3e764b86479 became follower at term 1" module=raft node.id=q3h8ro34wxykrdehisqxepicb Nov 12 20:29:18 centos-test dockerd: time="2018-11-12T20:29:18.631492078Z" level=info msg="42d4f3e764b86479 is starting a new election at term 1" module=raft 
node.id=q3h8ro34wxykrdehisqxepicb Nov 12 20:29:18 centos-test dockerd: time="2018-11-12T20:29:18.631680286Z" level=info msg="42d4f3e764b86479 became candidate at term 2" module=raft node.id=q3h8ro34wxykrdehisqxepicb Nov 12 20:29:18 centos-test dockerd: time="2018-11-12T20:29:18.631758365Z" level=info msg="42d4f3e764b86479 received MsgVoteResp from 42d4f3e764b86479 at term 2" module=raft node.id=q3h8ro34wxykrdehisqxepicb Nov 12 20:29:18 centos-test dockerd: time="2018-11-12T20:29:18.631793207Z" level=info msg="42d4f3e764b86479 became leader at term 2" module=raft node.id=q3h8ro34wxykrdehisqxepicb Nov 12 20:29:18 centos-test dockerd: time="2018-11-12T20:29:18.631831931Z" level=info msg="raft.node: 42d4f3e764b86479 elected leader 42d4f3e764b86479 at term 2" module=raft node.id=q3h8ro34wxykrdehisqxepicb Nov 12 20:29:18 centos-test dockerd: time="2018-11-12T20:29:18.632905177Z" level=info msg="Creating default ingress network" module=node node.id=q3h8ro34wxykrdehisqxepicb Nov 12 20:29:18 centos-test dockerd: time="2018-11-12T20:29:18.634991223Z" level=info msg="leadership changed from not yet part of a raft cluster to q3h8ro34wxykrdehisqxepicb" module=node node.id=q3h8ro34wxykrdehisqxepicb Nov 12 20:29:18 centos-test dockerd: time="2018-11-12T20:29:18.635160411Z" level=info msg="dispatcher starting" module=dispatcher node.id=q3h8ro34wxykrdehisqxepicb Nov 12 20:29:19 centos-test dockerd: time="2018-11-12T20:29:19.145090725Z" level=info msg="manager selected by agent for new session: { }" module=node/agent node.id=q3h8ro34wxykrdehisqxepicb Nov 12 20:29:19 centos-test dockerd: time="2018-11-12T20:29:19.145224547Z" level=info msg="waiting 0s before registering session" module=node/agent node.id=q3h8ro34wxykrdehisqxepicb Nov 12 20:29:19 centos-test dockerd: time="2018-11-12T20:29:19.242573304Z" level=info msg="worker q3h8ro34wxykrdehisqxepicb was successfully registered" method="(*Dispatcher).register" Nov 12 20:29:19 centos-test dockerd: 
time="2018-11-12T20:29:19.249005260Z" level=info msg="Initializing Libnetwork Agent Listen-Addr=0.0.0.0 Local-addr=142.93.139.68 Adv-addr=142.93.139.68 Data-addr= Remote-addr-list=[] MTU=1500" Nov 12 20:29:19 centos-test dockerd: time="2018-11-12T20:29:19.249374982Z" level=info msg="New memberlist node - Node:centos-test will use memberlist nodeID:85ff8c0b1b59 with config:&{NodeID:85ff8c0b1b59 Hostname:centos-test BindAddr:0.0.0.0 AdvertiseAddr:142.93.139.68 BindPort:0 Keys:[[24 66 133 216 153 72 232 248 159 220 152 159 23 36 62 205] [160 65 152 222 181 3 57 12 251 102 145 211 49 2 227 67] [118 144 125 99 54 156 130 33 142 134 30 84 179 222 229 103]] PacketBufferSize:1400 reapEntryInterval:1800000000000 reapNetworkInterval:1825000000000 StatsPrintPeriod:5m0s HealthPrintPeriod:1m0s}" Nov 12 20:29:19 centos-test dockerd: time="2018-11-12T20:29:19.252855548Z" level=info msg="Node 85ff8c0b1b59/142.93.139.68, joined gossip cluster" Nov 12 20:29:19 centos-test dockerd: time="2018-11-12T20:29:19.253012931Z" level=info msg="Node 85ff8c0b1b59/142.93.139.68, added to nodes list" Nov 12 20:29:19 centos-test dockerd: time="2018-11-12T20:29:19.412163426Z" level=error msg="error reading the kernel parameter net.ipv4.vs.expire_nodest_conn" error="open /proc/sys/net/ipv4/vs/expire_nodest_conn: no such file or directory" Nov 12 20:29:27 centos-test dockerd: time="2018-11-12T20:29:27.616249044Z" level=info msg="Processing signal 'terminated'" Nov 12 20:29:27 centos-test dockerd: time="2018-11-12T20:29:27.619322873Z" level=info msg="Stopping manager" module=node node.id=q3h8ro34wxykrdehisqxepicb Nov 12 20:29:27 centos-test dockerd: time="2018-11-12T20:29:27.619390009Z" level=info msg="dispatcher stopping" method="(*Dispatcher).Stop" module=dispatcher node.id=q3h8ro34wxykrdehisqxepicb Nov 12 20:29:27 centos-test dockerd: time="2018-11-12T20:29:27.619466671Z" level=info msg="shutting down certificate renewal routine" module=node/tls node.id=q3h8ro34wxykrdehisqxepicb 
node.role=swarm-manager Nov 12 20:29:27 centos-test dockerd: time="2018-11-12T20:29:27.619515003Z" level=error msg="agent: session failed" backoff=100ms error="context canceled" module=node/agent node.id=q3h8ro34wxykrdehisqxepicb Nov 12 20:29:27 centos-test dockerd: time="2018-11-12T20:29:27.619688440Z" level=info msg="dispatcher session dropped, marking node q3h8ro34wxykrdehisqxepicb down" method="(*Dispatcher).Session" node.id=q3h8ro34wxykrdehisqxepicb node.session=gg89p0ragegtuby78jqpyf51r Nov 12 20:29:27 centos-test dockerd: time="2018-11-12T20:29:27.619715427Z" level=error msg="failed to remove node" error="rpc error: code = Aborted desc = dispatcher is stopped" method="(*Dispatcher).Session" node.id=q3h8ro34wxykrdehisqxepicb node.session=gg89p0ragegtuby78jqpyf51r Nov 12 20:29:27 centos-test dockerd: time="2018-11-12T20:29:27.620426607Z" level=error msg="failed to receive changes from store watch API" error="rpc error: code = Unknown desc = context canceled" Nov 12 20:29:27 centos-test dockerd: time="2018-11-12T20:29:27.620700335Z" level=warning msg="grpc: addrConn.createTransport failed to connect to {/var/run/docker/swarm/control.sock 0 }. Err :connection error: desc = \"transport: Error while dialing dial unix /var/run/docker/swarm/control.sock: connect: no such file or directory\". Reconnecting..." 
module=grpc Nov 12 20:29:27 centos-test dockerd: time="2018-11-12T20:29:27.620793607Z" level=info msg="pickfirstBalancer: HandleSubConnStateChange: 0xc4201f2410, TRANSIENT_FAILURE" module=grpc Nov 12 20:29:27 centos-test dockerd: time="2018-11-12T20:29:27.620813689Z" level=info msg="pickfirstBalancer: HandleSubConnStateChange: 0xc4201f2410, CONNECTING" module=grpc Nov 12 20:29:27 centos-test dockerd: time="2018-11-12T20:29:27.620828057Z" level=info msg="pickfirstBalancer: HandleSubConnStateChange: 0xc4201f2410, TRANSIENT_FAILURE" module=grpc Nov 12 20:29:27 centos-test dockerd: time="2018-11-12T20:29:27.620901288Z" level=info msg="Manager shut down" module=node node.id=q3h8ro34wxykrdehisqxepicb Nov 12 20:29:27 centos-test dockerd: time="2018-11-12T20:29:27.621016081Z" level=warning msg="grpc: addrConn.transportMonitor exits due to: context canceled" module=grpc Nov 12 20:29:27 centos-test dockerd: time="2018-11-12T20:29:27.622094435Z" level=info msg="Node 85ff8c0b1b59/142.93.139.68, left gossip cluster" Nov 12 20:29:27 centos-test dockerd: time="2018-11-12T20:29:27.622128508Z" level=info msg="Node 85ff8c0b1b59 change state NodeActive --> NodeFailed" Nov 12 20:29:27 centos-test dockerd: time="2018-11-12T20:29:27.622149723Z" level=info msg="Node 85ff8c0b1b59/142.93.139.68, added to failed nodes list" Nov 12 20:29:27 centos-test dockerd: time="2018-11-12T20:29:27.750759696Z" level=warning msg="Error (Unable to complete atomic operation, key modified) deleting object [endpoint abj6s1gt9dqvd15ouqnxmaqdu 1273fcc5aa566ebe4410a67c8e25e27e9e5c0c9e5539d112706bb552bbe6eb9e], retrying...." 
Nov 12 20:29:27 centos-test dockerd: time="2018-11-12T20:29:27.765347837Z" level=info msg="stopping event stream following graceful shutdown" error="" module=libcontainerd namespace=moby Nov 12 20:29:27 centos-test dockerd: time="2018-11-12T20:29:27.766006959Z" level=info msg="stopping event stream following graceful shutdown" error="context canceled" module=libcontainerd namespace=plugins.moby Nov 12 20:29:27 centos-test dockerd: time="2018-11-12T20:29:27.766050883Z" level=info msg="stopping healthcheck following graceful shutdown" module=libcontainerd Nov 12 20:29:27 centos-test dockerd: time="2018-11-12T20:29:27.767924849Z" level=info msg="pickfirstBalancer: HandleSubConnStateChange: 0xc42070d590, TRANSIENT_FAILURE" module=grpc Nov 12 20:29:27 centos-test dockerd: time="2018-11-12T20:29:27.767982296Z" level=info msg="pickfirstBalancer: HandleSubConnStateChange: 0xc42070d590, CONNECTING" module=grpc Nov 12 20:29:31 centos-test dockerd: time="2018-11-12T20:29:31.950951126Z" level=info msg="parsed scheme: \"unix\"" module=grpc Nov 12 20:29:31 centos-test dockerd: time="2018-11-12T20:29:31.951189351Z" level=info msg="scheme \"unix\" not registered, fallback to default scheme" module=grpc Nov 12 20:29:31 centos-test dockerd: time="2018-11-12T20:29:31.951333359Z" level=info msg="parsed scheme: \"unix\"" module=grpc Nov 12 20:29:31 centos-test dockerd: time="2018-11-12T20:29:31.951348778Z" level=info msg="scheme \"unix\" not registered, fallback to default scheme" module=grpc Nov 12 20:29:31 centos-test dockerd: time="2018-11-12T20:29:31.951377848Z" level=info msg="ccResolverWrapper: sending new addresses to cc: [{unix:///run/containerd/containerd.sock 0 }]" module=grpc Nov 12 20:29:31 centos-test dockerd: time="2018-11-12T20:29:31.951407428Z" level=info msg="ClientConn switching balancer to \"pick_first\"" module=grpc Nov 12 20:29:31 centos-test dockerd: time="2018-11-12T20:29:31.951467604Z" level=info msg="pickfirstBalancer: HandleSubConnStateChange: 0xc42015b5d0, 
CONNECTING" module=grpc Nov 12 20:29:31 centos-test dockerd: time="2018-11-12T20:29:31.952046134Z" level=info msg="pickfirstBalancer: HandleSubConnStateChange: 0xc42015b5d0, READY" module=grpc Nov 12 20:29:31 centos-test dockerd: time="2018-11-12T20:29:31.952111355Z" level=info msg="ccResolverWrapper: sending new addresses to cc: [{unix:///run/containerd/containerd.sock 0 }]" module=grpc Nov 12 20:29:31 centos-test dockerd: time="2018-11-12T20:29:31.952121301Z" level=info msg="ClientConn switching balancer to \"pick_first\"" module=grpc Nov 12 20:29:31 centos-test dockerd: time="2018-11-12T20:29:31.952146441Z" level=info msg="pickfirstBalancer: HandleSubConnStateChange: 0xc42015b8c0, CONNECTING" module=grpc Nov 12 20:29:31 centos-test dockerd: time="2018-11-12T20:29:31.952234885Z" level=info msg="pickfirstBalancer: HandleSubConnStateChange: 0xc42015b8c0, READY" module=grpc Nov 12 20:29:31 centos-test dockerd: time="2018-11-12T20:29:31.965177614Z" level=info msg="[graphdriver] using prior storage driver: overlay2" Nov 12 20:29:31 centos-test dockerd: time="2018-11-12T20:29:31.968230533Z" level=info msg="Graph migration to content-addressability took 0.00 seconds" Nov 12 20:29:31 centos-test dockerd: time="2018-11-12T20:29:31.969232854Z" level=info msg="Loading containers: start." Nov 12 20:29:32 centos-test dockerd: time="2018-11-12T20:29:32.134700187Z" level=info msg="Default bridge (docker0) is assigned with an IP address 172.17.0.0/16. Daemon option --bip can be used to set a preferred IP address" Nov 12 20:29:32 centos-test dockerd: time="2018-11-12T20:29:32.178632995Z" level=info msg="Loading containers: done." 
Nov 12 20:29:32 centos-test dockerd: time="2018-11-12T20:29:32.209946229Z" level=info msg="Docker daemon" commit=4d60db4 graphdriver(s)=overlay2 version=18.09.0 Nov 12 20:29:32 centos-test dockerd: time="2018-11-12T20:29:32.229191461Z" level=info msg="parsed scheme: \"\"" module=grpc Nov 12 20:29:32 centos-test dockerd: time="2018-11-12T20:29:32.229249711Z" level=info msg="scheme \"\" not registered, fallback to default scheme" module=grpc Nov 12 20:29:32 centos-test dockerd: time="2018-11-12T20:29:32.232538579Z" level=info msg="ccResolverWrapper: sending new addresses to cc: [{/var/run/docker/swarm/control.sock 0 }]" module=grpc Nov 12 20:29:32 centos-test dockerd: time="2018-11-12T20:29:32.232570829Z" level=info msg="ClientConn switching balancer to \"pick_first\"" module=grpc Nov 12 20:29:32 centos-test dockerd: time="2018-11-12T20:29:32.233257916Z" level=info msg="pickfirstBalancer: HandleSubConnStateChange: 0xc420b114f0, CONNECTING" module=grpc Nov 12 20:29:32 centos-test dockerd: time="2018-11-12T20:29:32.242922918Z" level=info msg="42d4f3e764b86479 became follower at term 2" module=raft node.id=q3h8ro34wxykrdehisqxepicb Nov 12 20:29:32 centos-test dockerd: time="2018-11-12T20:29:32.242980034Z" level=info msg="newRaft 42d4f3e764b86479 [peers: [], term: 2, commit: 11, applied: 0, lastindex: 11, lastterm: 2]" module=raft node.id=q3h8ro34wxykrdehisqxepicb Nov 12 20:29:32 centos-test dockerd: time="2018-11-12T20:29:32.244845624Z" level=info msg="Listening for connections" addr="[::]:2377" module=node node.id=q3h8ro34wxykrdehisqxepicb proto=tcp Nov 12 20:29:32 centos-test dockerd: time="2018-11-12T20:29:32.244900672Z" level=info msg="Listening for local connections" addr=/var/run/docker/swarm/control.sock module=node node.id=q3h8ro34wxykrdehisqxepicb proto=unix Nov 12 20:29:32 centos-test dockerd: time="2018-11-12T20:29:32.248592234Z" level=info msg="parsed scheme: \"\"" module=grpc Nov 12 20:29:32 centos-test dockerd: time="2018-11-12T20:29:32.248613555Z" 
level=info msg="scheme \"\" not registered, fallback to default scheme" module=grpc Nov 12 20:29:32 centos-test dockerd: time="2018-11-12T20:29:32.248732796Z" level=info msg="manager selected by agent for new session: {q3h8ro34wxykrdehisqxepicb 142.93.139.68:2377}" module=node/agent node.id=q3h8ro34wxykrdehisqxepicb Nov 12 20:29:32 centos-test dockerd: time="2018-11-12T20:29:32.250062792Z" level=info msg="pickfirstBalancer: HandleSubConnStateChange: 0xc420b114f0, READY" module=grpc Nov 12 20:29:32 centos-test dockerd: time="2018-11-12T20:29:32.251769553Z" level=info msg="42d4f3e764b86479 is starting a new election at term 2" module=raft node.id=q3h8ro34wxykrdehisqxepicb Nov 12 20:29:32 centos-test dockerd: time="2018-11-12T20:29:32.251873245Z" level=info msg="42d4f3e764b86479 became candidate at term 3" module=raft node.id=q3h8ro34wxykrdehisqxepicb Nov 12 20:29:32 centos-test dockerd: time="2018-11-12T20:29:32.251920818Z" level=info msg="42d4f3e764b86479 received MsgVoteResp from 42d4f3e764b86479 at term 3" module=raft node.id=q3h8ro34wxykrdehisqxepicb Nov 12 20:29:32 centos-test dockerd: time="2018-11-12T20:29:32.251954289Z" level=info msg="42d4f3e764b86479 became leader at term 3" module=raft node.id=q3h8ro34wxykrdehisqxepicb Nov 12 20:29:32 centos-test dockerd: time="2018-11-12T20:29:32.251979148Z" level=info msg="raft.node: 42d4f3e764b86479 elected leader 42d4f3e764b86479 at term 3" module=raft node.id=q3h8ro34wxykrdehisqxepicb Nov 12 20:29:32 centos-test dockerd: time="2018-11-12T20:29:32.251885432Z" level=info msg="ccResolverWrapper: sending new addresses to cc: [{142.93.139.68:2377 0 }]" module=grpc Nov 12 20:29:32 centos-test dockerd: time="2018-11-12T20:29:32.252023150Z" level=info msg="ClientConn switching balancer to \"pick_first\"" module=grpc Nov 12 20:29:32 centos-test dockerd: time="2018-11-12T20:29:32.252093743Z" level=info msg="pickfirstBalancer: HandleSubConnStateChange: 0xc42015be60, CONNECTING" module=grpc Nov 12 20:29:32 centos-test dockerd: 
time="2018-11-12T20:29:32.253242947Z" level=info msg="waiting 0s before registering session" module=node/agent node.id=q3h8ro34wxykrdehisqxepicb Nov 12 20:29:32 centos-test dockerd: time="2018-11-12T20:29:32.254580485Z" level=error msg="error creating cluster object" error="name conflicts with an existing object" module=node node.id=q3h8ro34wxykrdehisqxepicb Nov 12 20:29:32 centos-test dockerd: time="2018-11-12T20:29:32.254678284Z" level=info msg="leadership changed from no cluster leader to q3h8ro34wxykrdehisqxepicb" module=node node.id=q3h8ro34wxykrdehisqxepicb Nov 12 20:29:32 centos-test dockerd: time="2018-11-12T20:29:32.254774915Z" level=info msg="dispatcher starting" module=dispatcher node.id=q3h8ro34wxykrdehisqxepicb Nov 12 20:29:32 centos-test dockerd: time="2018-11-12T20:29:32.258095821Z" level=info msg="pickfirstBalancer: HandleSubConnStateChange: 0xc42015be60, READY" module=grpc Nov 12 20:29:32 centos-test dockerd: time="2018-11-12T20:29:32.362643291Z" level=info msg="worker q3h8ro34wxykrdehisqxepicb was successfully registered" method="(*Dispatcher).register" Nov 12 20:29:32 centos-test dockerd: time="2018-11-12T20:29:32.366266123Z" level=info msg="Initializing Libnetwork Agent Listen-Addr=0.0.0.0 Local-addr=142.93.139.68 Adv-addr=142.93.139.68 Data-addr= Remote-addr-list=[] MTU=1500" Nov 12 20:29:32 centos-test dockerd: time="2018-11-12T20:29:32.366431637Z" level=info msg="New memberlist node - Node:centos-test will use memberlist nodeID:f059da8ebefd with config:&{NodeID:f059da8ebefd Hostname:centos-test BindAddr:0.0.0.0 AdvertiseAddr:142.93.139.68 BindPort:0 Keys:[[24 66 133 216 153 72 232 248 159 220 152 159 23 36 62 205] [160 65 152 222 181 3 57 12 251 102 145 211 49 2 227 67] [118 144 125 99 54 156 130 33 142 134 30 84 179 222 229 103]] PacketBufferSize:1400 reapEntryInterval:1800000000000 reapNetworkInterval:1825000000000 StatsPrintPeriod:5m0s HealthPrintPeriod:1m0s}" Nov 12 20:29:32 centos-test dockerd: time="2018-11-12T20:29:32.367299231Z" 
level=info msg="Daemon has completed initialization"
Nov 12 20:29:32 centos-test dockerd: time="2018-11-12T20:29:32.371804911Z" level=info msg="Node f059da8ebefd/142.93.139.68, joined gossip cluster"
Nov 12 20:29:32 centos-test dockerd: time="2018-11-12T20:29:32.371890624Z" level=info msg="Node f059da8ebefd/142.93.139.68, added to nodes list"
Nov 12 20:29:32 centos-test dockerd: time="2018-11-12T20:29:32.373213186Z" level=warning msg="Could not register builder git source: failed to find git binary: exec: \"git\": executable file not found in $PATH"
Nov 12 20:29:32 centos-test dockerd: time="2018-11-12T20:29:32.381305477Z" level=info msg="API listen on /var/run/docker.sock"
Nov 12 20:29:32 centos-test dockerd: time="2018-11-12T20:29:32.491401983Z" level=error msg="error reading the kernel parameter net.ipv4.vs.expire_nodest_conn" error="open /proc/sys/net/ipv4/vs/expire_nodest_conn: no such file or directory"
```

dperny commented 5 years ago

I have a strong suspicion it's just garbage messages (swarmkit is notoriously bad about them), but I'll go code-diving to check.

dperny commented 5 years ago

Yeah, it's garbage. The line that produces it is here:

https://github.com/docker/swarmkit/blob/0503e17893a2ceafc43782b5c0668f2748fdea86/manager/manager.go#L967-L969

We're not supposed to emit that error, but something got changed at some point, because the return value of `CreateCluster` is actually `ErrNameConflict`, which the check around the log call doesn't account for.

https://github.com/docker/swarmkit/blob/0503e17893a2ceafc43782b5c0668f2748fdea86/manager/state/store/clusters.go#L70-L79

I'll open a quick PR, it's just a one-line fix. Sorry about the noise.

dperny commented 5 years ago

Also, I'm 94% sure that `failed to remove node` is an expected part of the swarmkit shutdown process, and I seem to recall that fixing it and errors of its ilk is nontrivial.

Macbeth-byx commented 4 weeks ago

Hello, my environment is as follows: Docker version 18.09.0, three nodes. I have a question about Docker Swarm: the Docker daemon exited after I ran `docker swarm leave -f` on node 0001. The detailed log is as follows:

Jun 18 16:50:12 moss-0001 dockerd[4047]: time="2024-06-18T16:50:12.646486713+08:00" level=error msg="error sending message to peer" error="rpc error: code = Unavailable desc = all SubConns are in TransientFailure, latest connection error: connection error: desc = "transport: Error while dialing dial tcp 172.78.78.10:2377: connect: connection refused""
Jun 18 16:50:13 moss-0001 dockerd[4047]: time="2024-06-18T16:50:13.586876613+08:00" level=info msg="pickfirstBalancer: HandleSubConnStateChange: 0xc001896010, CONNECTING" module=grpc
Jun 18 16:50:13 moss-0001 dockerd[4047]: time="2024-06-18T16:50:13.587991196+08:00" level=info msg="pickfirstBalancer: HandleSubConnStateChange: 0xc001896010, READY" module=grpc
Jun 18 16:54:04 moss-0001 dockerd[4047]: time="2024-06-18T16:54:04.084206553+08:00" level=info msg="shutting down certificate renewal routine" module=node/tls node.id=n09yo5hghhwxuq1ub26spp1at node.role=swarm-manager
Jun 18 16:54:04 moss-0001 dockerd[4047]: time="2024-06-18T16:54:04.084208854+08:00" level=info msg="Stopping manager" module=node node.id=n09yo5hghhwxuq1ub26spp1at
Jun 18 16:54:04 moss-0001 dockerd[4047]: time="2024-06-18T16:54:04.084273668+08:00" level=info msg="dispatcher stopping" method="(Dispatcher).Stop" module=dispatcher node.id=n09yo5hghhwxuq1ub26spp1at
Jun 18 16:54:04 moss-0001 dockerd[4047]: time="2024-06-18T16:54:04.084317588+08:00" level=info msg="dispatcher session dropped, marking node q0keq55q291o8trumxbob2lhg down" method="(Dispatcher).Session" node.id=q0keq55q291o8trumxbob2lhg node.session=cqkmb6ad2cc91yjkxv4qofg6u
Jun 18 16:54:04 moss-0001 dockerd[4047]: time="2024-06-18T16:54:04.084328711+08:00" level=error msg="failed to remove node" error="rpc error: code = Aborted desc = dispatcher is stopped" 
method="(Dispatcher).Session" node.id=q0keq55q291o8trumxbob2lhg node.session=cqkmb6ad2cc91yjkxv4qofg6u Jun 18 16:54:04 moss-0001 dockerd[4047]: time="2024-06-18T16:54:04.084343918+08:00" level=info msg="pickfirstBalancer: HandleSubConnStateChange: 0xc000d15500, TRANSIENT_FAILURE" module=grpc Jun 18 16:54:04 moss-0001 dockerd[4047]: time="2024-06-18T16:54:04.084368139+08:00" level=info msg="pickfirstBalancer: HandleSubConnStateChange: 0xc000d15500, CONNECTING" module=grpc Jun 18 16:54:04 moss-0001 dockerd[4047]: time="2024-06-18T16:54:04.084356930+08:00" level=info msg="dispatcher session dropped, marking node m7p80vzt9krzz9o9tia7972a4 down" method="(Dispatcher).Session" node.id=m7p80vzt9krzz9o9tia7972a4 node.session=a6ch3h812r23c5wahjw12e4l0 Jun 18 16:54:04 moss-0001 dockerd[4047]: time="2024-06-18T16:54:04.084393572+08:00" level=warning msg="grpc: addrConn.createTransport failed to connect to {/var/run/docker/swarm/control.sock 0 }. Err :connection error: desc = "transport: Error while dialing dial unix /var/run/docker/swarm/control.sock: connect: no such file or directory". Reconnecting..." 
module=grpc Jun 18 16:54:04 moss-0001 dockerd[4047]: time="2024-06-18T16:54:04.084389703+08:00" level=error msg="failed to remove node" error="rpc error: code = Aborted desc = dispatcher is stopped" method="(Dispatcher).Session" node.id=m7p80vzt9krzz9o9tia7972a4 node.session=a6ch3h812r23c5wahjw12e4l0 Jun 18 16:54:04 moss-0001 dockerd[4047]: time="2024-06-18T16:54:04.084423587+08:00" level=info msg="pickfirstBalancer: HandleSubConnStateChange: 0xc000d15500, TRANSIENT_FAILURE" module=grpc Jun 18 16:54:04 moss-0001 dockerd[4047]: time="2024-06-18T16:54:04.084614208+08:00" level=error msg="failed to receive changes from store watch API" error="rpc error: code = Unknown desc = context canceled" Jun 18 16:54:04 moss-0001 dockerd[4047]: time="2024-06-18T16:54:04.085089050+08:00" level=info msg="Manager shut down" module=node node.id=n09yo5hghhwxuq1ub26spp1at Jun 18 16:54:04 moss-0001 dockerd[4047]: time="2024-06-18T16:54:04.085142553+08:00" level=warning msg="grpc: addrConn.transportMonitor exits due to: context canceled" module=grpc Jun 18 16:54:04 moss-0001 dockerd[4047]: time="2024-06-18T16:54:04.361178552+08:00" level=info msg="Node 9041e40f9e98 change state NodeActive --> NodeLeft" Jun 18 16:54:04 moss-0001 dockerd[4047]: time="2024-06-18T16:54:04.361226540+08:00" level=info msg="moss-0001(9041e40f9e98): Node leave event for 9041e40f9e98/172.78.78.8" Jun 18 16:54:04 moss-0001 dockerd[4047]: panic: runtime error: invalid memory address or nil pointer dereference Jun 18 16:54:04 moss-0001 dockerd[4047]: [signal SIGSEGV: segmentation violation code=0x1 addr=0x10 pc=0x56408cb68044] Jun 18 16:54:04 moss-0001 dockerd[4047]: goroutine 828 [running]: Jun 18 16:54:04 moss-0001 dockerd[4047]: panic({0x56408d8cf5c0, 0x56408ea6b160}) Jun 18 16:54:04 moss-0001 dockerd[4047]: /usr/lib/golang/src/runtime/panic.go:1147 +0x3a8 fp=0xc0021d5d60 sp=0xc0021d5ca0 pc=0x56408bfeb208 Jun 18 16:54:04 moss-0001 dockerd[4047]: runtime.panicmem(...) 
Jun 18 16:54:04 moss-0001 dockerd[4047]: /usr/lib/golang/src/runtime/panic.go:221 Jun 18 16:54:04 moss-0001 dockerd[4047]: runtime.sigpanic() Jun 18 16:54:04 moss-0001 dockerd[4047]: /usr/lib/golang/src/runtime/signal_unix.go:735 +0x327 fp=0xc0021d5db0 sp=0xc0021d5d60 pc=0x56408c0020e7 Jun 18 16:54:04 moss-0001 dockerd[4047]: github.com/docker/docker/vendor/github.com/docker/libnetwork/networkdb.(NetworkDB).rejoinClusterBootStrap(0xc002116a20) Jun 18 16:54:04 moss-0001 dockerd[4047]: /root/rpmbuild/BUILD/components/engine/.gopath/src/github.com/docker/docker/vendor/github.com/docker/libnetwork/networkdb/cluster.go:305 +0x4c4 fp=0xc0021d5f00 sp=0xc0021d5db0 pc=0x56408cb68044 Jun 18 16:54:04 moss-0001 dockerd[4047]: github.com/docker/docker/vendor/github.com/docker/libnetwork/networkdb.(NetworkDB).rejoinClusterBootStrap-fm() Jun 18 16:54:04 moss-0001 dockerd[4047]: /root/rpmbuild/BUILD/components/engine/.gopath/src/github.com/docker/docker/vendor/github.com/docker/libnetwork/networkdb/cluster.go:284 +0x26 fp=0xc0021d5f18 sp=0xc0021d5f00 pc=0x56408cb8bb46 Jun 18 16:54:04 moss-0001 dockerd[4047]: github.com/docker/docker/vendor/github.com/docker/libnetwork/networkdb.(NetworkDB).triggerFunc(0xc002116a20, 0xdf8475800, 0xc001513140, 0xc001281460) Jun 18 16:54:04 moss-0001 dockerd[4047]: /root/rpmbuild/BUILD/components/engine/.gopath/src/github.com/docker/docker/vendor/github.com/docker/libnetwork/networkdb/cluster.go:256 +0x112 fp=0xc0021d5fb0 sp=0xc0021d5f18 pc=0x56408cb67832 Jun 18 16:54:04 moss-0001 dockerd[4047]: github.com/docker/docker/vendor/github.com/docker/libnetwork/networkdb.(NetworkDB).clusterInit·dwrap·4() Jun 18 16:54:04 moss-0001 dockerd[4047]: /root/rpmbuild/BUILD/components/engine/.gopath/src/github.com/docker/docker/vendor/github.com/docker/libnetwork/networkdb/cluster.go:178 +0x32 fp=0xc0021d5fe0 sp=0xc0021d5fb0 pc=0x56408cb67012 Jun 18 16:54:04 moss-0001 dockerd[4047]: runtime.goexit() Jun 18 16:54:04 moss-0001 dockerd[4047]: 
/usr/lib/golang/src/runtime/asm_amd64.s:1581 +0x1 fp=0xc0021d5fe8 sp=0xc0021d5fe0 pc=0x56408c01faa1 Jun 18 16:54:04 moss-0001 dockerd[4047]: created by github.com/docker/docker/vendor/github.com/docker/libnetwork/networkdb.(NetworkDB).clusterInit Jun 18 16:54:04 moss-0001 dockerd[4047]: /root/rpmbuild/BUILD/components/engine/.gopath/src/github.com/docker/docker/vendor/github.com/docker/libnetwork/networkdb/cluster.go:178 +0x7e5 Jun 18 16:54:04 moss-0001 dockerd[4047]: goroutine 1 [chan receive, 5 minutes]: Jun 18 16:54:04 moss-0001 dockerd[4047]: runtime.gopark(0xc000f21918, 0x56408bfc1c14, 0x18, 0x0, 0x7fc622c6bd08) Jun 18 16:54:04 moss-0001 dockerd[4047]: /usr/lib/golang/src/runtime/proc.go:366 +0xd6 fp=0xc0022af898 sp=0xc0022af878 pc=0x56408bfee136 Jun 18 16:54:04 moss-0001 dockerd[4047]: runtime.chanrecv(0xc000b9a000, 0xc000f21bf8, 0x1) Jun 18 16:54:04 moss-0001 dockerd[4047]: /usr/lib/golang/src/runtime/chan.go:576 +0x56c fp=0xc0022af928 sp=0xc0022af898 pc=0x56408bfbaf2c Jun 18 16:54:04 moss-0001 dockerd[4047]: runtime.chanrecv1(0xc000b33740, 0xc001481a10) Jun 18 16:54:04 moss-0001 dockerd[4047]: /usr/lib/golang/src/runtime/chan.go:439 +0x18 fp=0xc0022af950 sp=0xc0022af928 pc=0x56408bfba958 Jun 18 16:54:04 moss-0001 dockerd[4047]: main.(DaemonCli).start(0xc000b33740, 0xc0002a3da0) Jun 18 16:54:04 moss-0001 dockerd[4047]: src/github.com/docker/docker/cmd/dockerd/daemon.go:338 +0x148b fp=0xc0022afd78 sp=0xc0022af950 pc=0x56408d3f416b Jun 18 16:54:04 moss-0001 dockerd[4047]: main.runDaemon(...) 
Jun 18 16:54:04 moss-0001 dockerd[4047]: src/github.com/docker/docker/cmd/dockerd/docker_unix.go:7 Jun 18 16:54:04 moss-0001 dockerd[4047]: main.newDaemonCommand.func1(0xc000a9fb80, {0xc000293e00, 0x8, 0x8}) Jun 18 16:54:04 moss-0001 dockerd[4047]: src/github.com/docker/docker/cmd/dockerd/docker.go:29 +0x5c fp=0xc0022afda0 sp=0xc0022afd78 pc=0x56408d3f8dfc Jun 18 16:54:04 moss-0001 dockerd[4047]: github.com/docker/docker/vendor/github.com/spf13/cobra.(Command).execute(0xc000a9fb80, {0xc00013c010, 0x8, 0x8}) Jun 18 16:54:04 moss-0001 dockerd[4047]: /root/rpmbuild/BUILD/components/engine/.gopath/src/github.com/docker/docker/vendor/github.com/spf13/cobra/command.go:762 +0x60e fp=0xc0022afe60 sp=0xc0022afda0 pc=0x56408d3e93ce Jun 18 16:54:04 moss-0001 dockerd[4047]: github.com/docker/docker/vendor/github.com/spf13/cobra.(Command).ExecuteC(0xc000a9fb80) Jun 18 16:54:04 moss-0001 dockerd[4047]: /root/rpmbuild/BUILD/components/engine/.gopath/src/github.com/docker/docker/vendor/github.com/spf13/cobra/command.go:852 +0x2dc fp=0xc0022aff18 sp=0xc0022afe60 pc=0x56408d3e98dc Jun 18 16:54:04 moss-0001 dockerd[4047]: github.com/docker/docker/vendor/github.com/spf13/cobra.(Command).Execute(...) Jun 18 16:54:04 moss-0001 dockerd[4047]: /root/rpmbuild/BUILD/components/engine/.gopath/src/github.com/docker/docker/vendor/github.com/spf13/cobra/command.go:800 Jun 18 16:54:04 moss-0001 dockerd[4047]: main.main() Jun 18 16:54:04 moss-0001 dockerd[4047]: src/github.com/docker/docker/cmd/dockerd/docker.go:70 +0xca fp=0xc0022aff80 sp=0xc0022aff18 pc=0x56408d3f8f0a ....... Jun 18 16:54:12 moss-0001 systemd[1]: docker.service: Main process exited, code=killed, status=6/ABRT Jun 18 16:54:12 moss-0001 systemd[1]: docker.service: Failed with result 'signal'. Jun 18 16:54:12 moss-0001 systemd[1]: docker.service: Unit process 4070 (containerd) remains running after unit stopped. 
Jun 18 16:54:12 moss-0001 systemd[1]: docker.service: Unit process 4401 (containerd-shim) remains running after unit stopped. Jun 18 16:54:12 moss-0001 systemd[1]: docker.service: Unit process 4466 (containerd-shim) remains running after unit stopped.

At the same moment, the Linux system message log shows: `error during connect: Post "http://%2Fvar%2Frun8%2Fdocker.sock/v1.39/swarm/leave?force=1": EOF`