moby / moby

The Moby Project - a collaborative project for the container ecosystem to assemble container-based systems
https://mobyproject.org/
Apache License 2.0

restarting service makes it stuck [swarm mode] #25536

Open rubycut opened 8 years ago

rubycut commented 8 years ago

Output of docker version:

Client:
 Version:      1.12.0
 API version:  1.24
 Go version:   go1.6.3
 Git commit:   8eab29e
 Built:        Thu Jul 28 22:00:36 2016
 OS/Arch:      linux/amd64

Server:
 Version:      1.12.0
 API version:  1.24
 Go version:   go1.6.3
 Git commit:   8eab29e
 Built:        Thu Jul 28 22:00:36 2016
 OS/Arch:      linux/amd64

Output of docker info:

Containers: 55
 Running: 3
 Paused: 0
 Stopped: 52
Images: 360
Server Version: 1.12.0
Storage Driver: aufs
 Root Dir: /var/lib/docker/aufs
 Backing Filesystem: extfs
 Dirs: 465
 Dirperm1 Supported: false
Logging Driver: json-file
Cgroup Driver: cgroupfs
Plugins:
 Volume: local
 Network: overlay bridge host null
Swarm: active
 NodeID: ec05f8df5k21l3hzur5e97hoi
 Is Manager: true
 ClusterID: 8q
 Managers: 1
 Nodes: 2
 Orchestration:
  Task History Retention Limit: 5
 Raft:
  Snapshot interval: 10000
  Heartbeat tick: 1
  Election tick: 3
 Dispatcher:
  Heartbeat period: 5 seconds
 CA configuration:
  Expiry duration: 3 months
 Node Address: xxx
Runtimes: runc
Default Runtime: runc
Security Options: apparmor
Kernel Version: 3.13.0-86-generic
Operating System: Ubuntu 14.04.4 LTS
OSType: linux
Architecture: x86_64
CPUs: 8
Total Memory: 5.82 GiB
Name: xxx1133
ID: CNIF:MIYJ:HJYM:PWBX:UWKE:2GVR:QDBO:VDPG:4633:YOVV:P23I:IXCS
Docker Root Dir: /var/lib/docker
Debug Mode (client): false
Debug Mode (server): false
Registry: https://index.docker.io/v1/
WARNING: No swap limit support
Insecure Registries:
 docker-staging.sysmon.xxx.net
 docker.xxx.net
 127.0.0.0/8

Additional environment details (AWS, VirtualBox, physical, etc.):

Physical

Steps to reproduce the issue:

  1. docker service rm web ; docker service create --name web --replicas 3 --publish 3000:3100 --restart-condition any ...

Describe the results you received:

The service is stuck; port 3000 is no longer working.

Describe the results you expected:

Everything should work as it did before the restart.

Additional information you deem important (e.g. issue happens only occasionally):

If you put a 10-second pause between docker service rm and docker service create, everything works as expected.
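As a possible alternative to a fixed pause, the script could poll until the old service's containers are gone before recreating it. This is only a sketch under stated assumptions: `wait_until_empty` and `redeploy_web` are hypothetical helper names, the check relies on swarm task containers carrying the `com.docker.swarm.service.name` label, and `docker ps` only sees containers on the local node, so it may not be a sufficient signal on a multi-node swarm. It has not been confirmed in this thread to avoid the stuck state.

```shell
#!/bin/sh
# Sketch only: wait_until_empty and redeploy_web are hypothetical helpers,
# not part of the docker CLI.

# Poll a command until it prints nothing; give up after MAX_TRIES polls.
# POLL_INTERVAL is the delay between polls in seconds (default 1).
wait_until_empty() {
    tries=0
    while [ -n "$("$@" 2>/dev/null)" ]; do
        tries=$((tries + 1))
        if [ "$tries" -ge "${MAX_TRIES:-30}" ]; then
            return 1
        fi
        sleep "${POLL_INTERVAL:-1}"
    done
    return 0
}

# Remove the service, wait for its running containers on this node to
# disappear, then recreate it. The image and any remaining arguments are
# passed through by the caller.
redeploy_web() {
    docker service rm web
    # Assumption: no running container with this label means cleanup is done
    # on this node. Other nodes are not checked.
    wait_until_empty docker ps -q \
        --filter label=com.docker.swarm.service.name=web || return 1
    docker service create --name web --replicas 3 --publish 3000:3100 \
        --restart-condition any "$@"
}
```

The functions are only defined here, not invoked; a deploy script would call `redeploy_web` followed by the image and whatever arguments the original create command used.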

thaJeztah commented 8 years ago

Thanks for reporting! I guess this may be partly because the docker service commands are asynchronous and can return before the actual action has completed.

Does the service eventually work after running those commands?

Also, just out of interest: why are you removing the service instead of updating the existing one? Updating would take advantage of the swarm mode scheduler (e.g. taking rolling updates into account).

rubycut commented 8 years ago

@thaJeztah, the service never starts working unless you add a pause in between.

As for updating the existing service: we have a fully automated deployment system that injects environment variables into the image. These variables might change between deploys, so we start a new service every time.
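For reference, changing environment variables does not by itself require removing the service: `docker service update` accepts `--env-add`. A minimal sketch, assuming the variables are available as `KEY=VALUE` strings at deploy time; `update_service_env` is a hypothetical wrapper name, and whether this fits the deployment system described above is not established in this thread.

```shell
#!/bin/sh
# Sketch only: update_service_env is a hypothetical helper, not a docker command.
# Usage: update_service_env SERVICE KEY=VALUE [KEY=VALUE ...]
update_service_env() {
    service=$1
    shift
    n=$#
    # Append "--env-add KEY=VALUE" for each pair, then drop the original pairs.
    for kv in "$@"; do
        set -- "$@" --env-add "$kv"
    done
    shift "$n"
    docker service update "$@" "$service"
}
```

For example, `update_service_env web RAILS_ENV=production` would run `docker service update --env-add RAILS_ENV=production web` (the variable name is illustrative only).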

sheerun commented 8 years ago

@thaJeztah Could docker queue service commands and run them one after another by default?