
Docker can hang indefinitely waiting for a nonexistent process to pull an image. #12823

Closed. tfoote closed this issue 4 years ago.

tfoote commented 9 years ago

Running docker pull will simply hang, waiting for a non-existent process to download the repository.

root@ip-172-31-18-106:~# docker pull ubuntu:trusty
Repository ubuntu already being pulled by another client. Waiting.

This is the same behavior as #3115; however, there is no other docker process running.

The list of running docker containers:

# docker ps
CONTAINER ID        IMAGE               COMMAND             CREATED             STATUS              PORTS               NAMES

See here for a full process tree: https://gist.github.com/tfoote/c8a30e569c911f1977e2

When this happens, my process monitor fails the job after 120 minutes; this is happening regularly.

An strace of the docker instance can be found here: https://gist.github.com/tfoote/1dc3905eb9c235cb5c53

It is stuck on an epoll_wait call.
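
For reference, the strace above was attached to the running daemon roughly like this (a sketch; the exact flags I used may have differed slightly):

sudo strace -f -tt -o /tmp/docker-strace.log -p $(pidof docker)   # follow threads, timestamp each call, log to a file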

Here's all the standard info.

root@ip-172-31-18-106:~# docker version
Client version: 1.5.0
Client API version: 1.17
Go version (client): go1.4.1
Git commit (client): a8a31ef
OS/Arch (client): linux/amd64
Server version: 1.5.0
Server API version: 1.17
Go version (server): go1.4.1
Git commit (server): a8a31ef

root@ip-172-31-18-106:~# docker -D info
Containers: 132
Images: 6667
Storage Driver: aufs
 Root Dir: /var/lib/docker/aufs
 Backing Filesystem: extfs
 Dirs: 6953
Execution Driver: native-0.2
Kernel Version: 3.13.0-44-generic
Operating System: Ubuntu 14.04.1 LTS
CPUs: 4
Total Memory: 14.69 GiB
Name: ip-172-31-18-106
ID: SZWS:VD6O:CLP2:WRAM:KWIL:47HZ:HOEY:SR6R:ZOWR:E3HG:PS7P:TCZP
Debug mode (server): false
Debug mode (client): true
Fds: 27
Goroutines: 32
EventsListeners: 0
Init Path: /usr/bin/docker
Docker Root Dir: /var/lib/docker
WARNING: No swap limit support

root@ip-172-31-18-106:~# uname -a
Linux ip-172-31-18-106 3.13.0-44-generic #73-Ubuntu SMP Tue Dec 16 00:22:43 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux

It's running on AWS.

I'm running an instance of the ROS buildfarm, which reproduces this bad state once every couple of days when fully loaded running Debian package builds at ~100% CPU load. This happens when we are preparing a major release.

I have not been able to isolate the cause in a smaller example; it has happened on multiple different repositories. Sometimes it's the official Ubuntu repository, sometimes it's one of our own custom repositories. We've tracked a few instances of this error recently here. When one repository is failing to pull, others work fine. All the repositories are hosted on the public Docker Hub.

Here's an example of one hanging while another passes.

root@ip-172-31-18-106:~# docker pull ubuntu:saucy
Pulling repository ubuntu
^Croot@ip-172-31-18-106:~# docker pull ubuntu:saucy^C
root@ip-172-31-18-106:~# docker pull osrf/ubuntu_32bit
Pulling repository osrf/ubuntu_32bit
FATA[0000] Tag latest not found in repository osrf/ubuntu_32bit 
root@ip-172-31-18-106:~# docker pull osrf/ubuntu_32bit:saucy
Pulling repository osrf/ubuntu_32bit
d6a6e4bd19d5: Download complete 
Status: Image is up to date for osrf/ubuntu_32bit:saucy

As determined in #3115, this can be fixed by restarting docker. However, that issue suggests this should no longer happen, so I think there has been a regression or we've found another edge case.

I will keep the machine online for a few days if anyone has suggestions on what I can run to debug the issue. Otherwise I'll have to wait for it to recur before I can test any debugging.

tfoote commented 9 years ago

I've caught 2 more instances of this issue. In one of them I was able to catch the first process, the one actually trying to pull the image; it terminated without cleaning up before it timed out.

Here's the console output of me finding the processes and attaching strace: https://gist.github.com/tfoote/807399caba8fd1d8601c The repeated iterations within the trace were about a minute apart; I didn't time them precisely. I stopped the strace briefly to try attaching gdb, but that did not give useful information.

From the console output of that job you can see that this one was actually pulling the docker image. https://gist.github.com/tfoote/700b74935967078213ed

All subsequent jobs hang waiting for this job to finish despite the fact that it's been terminated.

From the command line I can confirm it stays hung after the above process was terminated.

root@ip-172-31-18-10:~# docker pull ubuntu:vivid
Repository ubuntu already being pulled by another client. Waiting.
^Croot@ip-172-31-18-10:~# 

Though I did find that it's specific to the image: this failed on vivid, while trusty can still be pulled.

root@ip-172-31-18-10:~# docker pull ubuntu:trusty
Pulling repository ubuntu
b7cf8f0d9e82: Download complete 
706766fe1019: Download complete 
a62a42e77c9c: Download complete 
2c014f14d3d9: Download complete 
Status: Image is up to date for ubuntu:trusty
tfoote commented 9 years ago

This is happening about once every 6-8 hours across 9 machines.

I caught the failing job again. Interestingly, the first pull to fail and cause the lockup actually got some way through the pull and then hung.

13:29:23 Check docker base image for updates: docker pull ubuntu:trusty
13:29:26 Pulling repository ubuntu
13:29:50 07f8e8c5e660: Pulling image (trusty) from ubuntu
13:29:50 07f8e8c5e660: Pulling image (trusty) from ubuntu, endpoint: https://registry-1.docker.io/v1/
13:29:53 07f8e8c5e660: Pulling dependent layers
13:30:05 e9e06b06e14c: Download complete
13:30:05 a82efea989f9: Download complete
13:30:05 37bea4ee0c81: Download complete
13:30:05 07f8e8c5e660: Download complete
15:29:17 Build timed out (after 120 minutes). Marking the build as failed.

The process tree straces and console outputs of each of the 4 docker pull processes can be found at: https://gist.github.com/tfoote/e583ee47ef21007fb465

rohansingh commented 9 years ago

We have been seeing a problem with identical symptoms. Most failed builds of Helios have tests that fail due to this issue.

tfoote commented 9 years ago

I had originally thought that this was a load-based issue; however, one of my machines with a very low load average has also hit this problem. Process tree, strace, and docker info dump from that machine.

LK4D4 commented 9 years ago

Looks like pull just hanging :/

tfoote commented 9 years ago

I upgraded to 1.6 and, after a week or two of heavy building, thought that this was resolved. However, it just reappeared running Docker 1.6.

Excerpted from the build log here; the timestamps are elapsed running time. I canceled this job after 12 hours of running. Subsequent builds trying to use that image now block waiting for another client until I restart docker.

# BEGIN SECTION: docker version
00:00:00.644 + docker version
00:00:00.663 Client version: 1.6.0
00:00:00.663 Client API version: 1.18
00:00:00.663 Go version (client): go1.4.2
00:00:00.663 Git commit (client): 4749651
00:00:00.663 OS/Arch (client): linux/amd64
00:00:00.664 Server version: 1.6.0
00:00:00.664 Server API version: 1.18
00:00:00.664 Go version (server): go1.4.2
00:00:00.664 Git commit (server): 4749651
00:00:00.664 OS/Arch (server): linux/amd64
00:00:00.664 + echo # END SECTION
00:00:00.664 # END SECTION
00:00:00.664 + echo # BEGIN SECTION: docker info
00:00:00.664 
...
# BEGIN SECTION: docker info
00:00:00.664 + docker info
00:00:01.252 Containers: 226
00:00:01.252 Images: 3971
00:00:01.252 Storage Driver: aufs
00:00:01.252  Root Dir: /var/lib/docker/aufs
00:00:01.252  Backing Filesystem: extfs
00:00:01.252  Dirs: 4423
00:00:01.252  Dirperm1 Supported: false
00:00:01.252 Execution Driver: native-0.2
00:00:01.252 Kernel Version: 3.13.0-36-generic
00:00:01.252 Operating System: Ubuntu 14.04.2 LTS
00:00:01.253 CPUs: 4
00:00:01.253 Total Memory: 14.69 GiB
00:00:01.253 Name: ip-172-31-5-54
00:00:01.253 ID: CTP4:VRZL:44X2:7A2V:GACU:ON3Y:CLPZ:K5EH:ZSS6:JRJL:SDWN:N3TY
00:00:01.253 WARNING: No swap limit support
00:00:01.253 + echo # END SECTION
00:00:01.253 # END SECTION
00:00:01.260 [Jrel_arm_trigger-jobs] $ /bin/sh -xe /tmp/hudson3896159289535642191.sh
00:00:01.261 + echo # BEGIN SECTION: Write PGP repository keys
00:00:01.261 
...
# BEGIN SECTION: Build Dockerfile - reconfigure jobs
00:00:02.396 + cd /home/jenkins-slave/workspace/Jrel_arm_trigger-jobs/docker_trigger_jobs
00:00:02.396 + python3 -u /home/jenkins-slave/workspace/Jrel_arm_trigger-jobs/ros_buildfarm/scripts/misc/docker_pull_baseimage.py
00:00:02.420 Get base image name from Dockerfile 'Dockerfile': ubuntu:trusty
00:00:02.421 Check docker base image for updates: docker pull ubuntu:trusty
00:00:02.834 Pulling repository ubuntu
00:00:12.141 07f8e8c5e660: Pulling image (trusty) from ubuntu
00:00:12.141 07f8e8c5e660: Pulling image (trusty) from ubuntu, endpoint: https://registry-1.docker.io/v1/
00:00:12.597 07f8e8c5e660: Pulling dependent layers
00:00:12.597 e9e06b06e14c: Download complete
00:00:12.597 a82efea989f9: Download complete
00:00:12.597 37bea4ee0c81: Download complete
00:00:12.598 07f8e8c5e660: Download complete
12:35:23.694 Build was aborted
12:35:23.699 Aborted by Admin
12:35:23.701 Finished: ABORTED
xied75 commented 9 years ago

After this long, not even a response from the team? It can't be only us suffering from this problem.

ruffsl commented 9 years ago

I'd like to chime in and say that I've also experienced this behavior.

smetj commented 9 years ago

Same here (restarting docker daemon solves the issue however)

thaJeztah commented 9 years ago

@jlhawn is this in some way related to https://github.com/docker/docker/pull/15353?

jlhawn commented 9 years ago

@thaJeztah It's certainly related, but #15353 addresses a separate issue. However, a side effect of that change might be that stuck download goroutines would be killed, perhaps fixing this issue in some cases.

chollier commented 9 years ago

still experiencing the same with 1.8 on a newly created host via docker-machine create -d virtualbox dev

jolks commented 9 years ago

Experiencing the same problem after installing Docker Toolbox on Mac. Restarting VM via $ docker-machine restart default did not work. Problem solved by restarting the computer.

threedliteguy commented 9 years ago

Same problem today: fresh CentOS 7 minimal install, fully updated, rebooted. Both hello-world and "docker pull centos" hung multiple times at "Layer already being pulled by another client". Restarting the docker daemon lets it get further, only to hang again. After multiple restarts of docker it finally completed the pull.
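
For reference, "restarting the docker daemon" here means roughly the following (a sketch; the service name assumes the stock packages):

sudo systemctl restart docker   # systemd, e.g. CentOS 7
sudo service docker restart     # upstart, e.g. Ubuntu 14.04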

itzg commented 9 years ago

:+1: on fresh install. NOTE: I am behind a corporate transparent proxy and I have added their SSL certs via update-ca-certificates.

dev@ubuntu:~$ docker version
Client:
 Version:      1.8.1
 API version:  1.20
 Go version:   go1.4.2
 Git commit:   d12ea79
 Built:        Thu Aug 13 02:35:49 UTC 2015
 OS/Arch:      linux/amd64

Server:
 Version:      1.8.1
 API version:  1.20
 Go version:   go1.4.2
 Git commit:   d12ea79
 Built:        Thu Aug 13 02:35:49 UTC 2015
 OS/Arch:      linux/amd64
dev@ubuntu:~$ docker info
Containers: 0
Images: 0
Storage Driver: aufs
 Root Dir: /var/lib/docker/aufs
 Backing Filesystem: extfs
 Dirs: 0
 Dirperm1 Supported: true
Execution Driver: native-0.2
Logging Driver: json-file
Kernel Version: 3.19.0-26-generic
Operating System: Ubuntu 14.04.3 LTS
CPUs: 1
Total Memory: 1.954 GiB
Name: ubuntu
ID: 3FRG:FGBN:UCP3:BF3K:5TDF:YJ4S:V5BN:Q5C2:3EDC:JJOL:QBM3:IKEY
WARNING: No swap limit support
guyellis commented 9 years ago

I'm also experiencing this

$ docker version
Client:
 Version:      1.8.1
 API version:  1.20
 Go version:   go1.4.2
 Git commit:   d12ea79
 Built:        Thu Aug 13 02:35:49 UTC 2015
 OS/Arch:      linux/amd64

Server:
 Version:      1.8.1
 API version:  1.20
 Go version:   go1.4.2
 Git commit:   d12ea79
 Built:        Thu Aug 13 02:35:49 UTC 2015
 OS/Arch:      linux/amd64

$ docker info
Containers: 3
Images: 24
Storage Driver: aufs
 Root Dir: /media/guy/BE282B37282AEDD91/docker/aufs
 Backing Filesystem: extfs
 Dirs: 30
 Dirperm1 Supported: true
Execution Driver: native-0.2
Logging Driver: json-file
Kernel Version: 3.16.0-46-generic
Operating System: Ubuntu 14.04.3 LTS
CPUs: 8
Total Memory: 15.59 GiB
Name: MSI-GS60
ID: WYWR:JH2L:DU5O:VLQJ:HPNV:KRRT:6U7J:ZN5S:JP2O:QBEW:PH7C:KEEI
Username: guyellis
Registry: https://index.docker.io/v1/
WARNING: No swap limit support

A reboot "solved" the problem inasmuch as I could execute the problem command right away, before anything else had "tainted" the Docker environment. This has happened multiple times, and I think I can reproduce it if you need it reproduced.

travisperson commented 9 years ago

Same here (restarting docker daemon solves the issue however)

I ran into this issue, but a restart (of docker) did not solve it. This "error" can also be returned if behind a corporate firewall. I had to add their SSL cert, much like @itzg mentioned he did.

joshuarichard commented 9 years ago

It's not like this issue needs more people saying it, but I'm running into it as well with a corporate registry. Pulls stall when fetching an fs layer (probably unrelated to this issue), either at the beginning or halfway through the download. Ctrl+C and rebuilding leads to:

Layer already being pulled by another client. Waiting.

Restarting Kitematic doesn't solve the problem, and obviously I can't restart the registry server. A reboot does get rid of it, though.

anuaimi commented 9 years ago

seeing this as well.

dansondergaard commented 9 years ago

Also experiencing this issue.

beginrescueend commented 9 years ago

Seeing it right now. Corporate firewall issue? Docker Hub slow?

euank commented 9 years ago

I'll throw in some info around the times I've seen this issue as well, though mine are mostly older and I don't have a solid repro either. When I was running into it fairly frequently in 1.6.2, I got this stack trace. I ran into something similar in 1.8.0-dev / commit c986f85 related to the error channel, which was fixed by #15545. The stack trace for this can be found here.

The second stack trace definitely isn't relevant, and the first, being from 1.6.2, is probably not all that relevant either, but it would be great if others who run into this issue on recent Docker versions could run sudo kill -SIGUSR1 $(pidof docker) and post the stack trace written to Docker's logfile.
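
On an upstart-based Ubuntu 14.04 host that amounts to something like this (a sketch; the log location differs on other init systems):

sudo kill -SIGUSR1 $(pidof docker)            # ask the daemon to dump its goroutine stacks
sudo tail -n 500 /var/log/upstart/docker.log  # the dump is appended to the daemon log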

itzg commented 9 years ago

Here's the stack trace on the system where I consistently see this:

INFO[0138] === BEGIN goroutine stack dump ===
goroutine 12 [running]:
github.com/docker/docker/pkg/signal.DumpStacks()
        /usr/src/docker/.gopath/src/github.com/docker/docker/pkg/signal/trap.go:60 +0x7a
github.com/docker/docker/daemon.func·025()
        /usr/src/docker/.gopath/src/github.com/docker/docker/daemon/debugtrap_unix.go:18 +0x6d
created by github.com/docker/docker/daemon.setupDumpStackTrap
        /usr/src/docker/.gopath/src/github.com/docker/docker/daemon/debugtrap_unix.go:20 +0x18e

goroutine 1 [chan receive]:
main.(*DaemonCli).CmdDaemon(0xc20808de10, 0xc20800a020, 0x2, 0x2, 0x0, 0x0)
        /usr/src/docker/docker/daemon.go:292 +0x1811
reflect.callMethod(0xc2080c9da0, 0xc2081f1ce0)
        /usr/local/go/src/reflect/value.go:605 +0x179
reflect.methodValueCall(0xc20800a020, 0x2, 0x2, 0x1, 0xc2080c9da0, 0x0, 0x0, 0xc2080c9da0, 0x0, 0x45128f, ...)
        /usr/local/go/src/reflect/asm_amd64.s:29 +0x36
github.com/docker/docker/cli.(*Cli).Run(0xc2080c9ce0, 0xc20800a010, 0x3, 0x3, 0x0, 0x0)
        /usr/src/docker/.gopath/src/github.com/docker/docker/cli/cli.go:89 +0x38e
main.main()
        /usr/src/docker/docker/docker.go:69 +0x428

goroutine 5 [syscall]:
os/signal.loop()
        /usr/local/go/src/os/signal/signal_unix.go:21 +0x1f
created by os/signal.init·1
        /usr/local/go/src/os/signal/signal_unix.go:27 +0x35

goroutine 17 [syscall, 2 minutes, locked to thread]:
runtime.goexit()
        /usr/local/go/src/runtime/asm_amd64.s:2232 +0x1

goroutine 11 [IO wait]:
net.(*pollDesc).Wait(0xc208195b80, 0x72, 0x0, 0x0)
        /usr/local/go/src/net/fd_poll_runtime.go:84 +0x47
net.(*pollDesc).WaitRead(0xc208195b80, 0x0, 0x0)
        /usr/local/go/src/net/fd_poll_runtime.go:89 +0x43
net.(*netFD).accept(0xc208195b20, 0x0, 0x7f0222495340, 0xc2081b8950)
        /usr/local/go/src/net/fd_unix.go:419 +0x40b
net.(*UnixListener).AcceptUnix(0xc2080681e0, 0xc2081606c0, 0x0, 0x0)
        /usr/local/go/src/net/unixsock_posix.go:282 +0x56
net.(*UnixListener).Accept(0xc2080681e0, 0x0, 0x0, 0x0, 0x0)
        /usr/local/go/src/net/unixsock_posix.go:293 +0x4c
github.com/docker/docker/pkg/listenbuffer.(*defaultListener).Accept(0xc208068220, 0x0, 0x0, 0x0, 0x0)
        /usr/src/docker/.gopath/src/github.com/docker/docker/pkg/listenbuffer/buffer.go:71 +0x67
net/http.(*Server).Serve(0xc20806c5a0, 0x7f022249d398, 0xc208068220, 0x0, 0x0)
        /usr/local/go/src/net/http/server.go:1728 +0x92
github.com/docker/docker/api/server.(*HttpServer).Serve(0xc208068a20, 0x0, 0x0)
        /usr/src/docker/.gopath/src/github.com/docker/docker/api/server/server.go:126 +0x4d
github.com/docker/docker/api/server.func·002(0x7f022249d3f8, 0xc208068a20)
        /usr/src/docker/.gopath/src/github.com/docker/docker/api/server/server.go:102 +0x40
created by github.com/docker/docker/api/server.(*Server).ServeApi
        /usr/src/docker/.gopath/src/github.com/docker/docker/api/server/server.go:106 +0x6ac

goroutine 10 [chan receive, 2 minutes]:
github.com/docker/docker/api/server.(*Server).ServeApi(0xc20806b940, 0xc2080aa3e0, 0x1, 0x1, 0x0, 0x0)
        /usr/src/docker/.gopath/src/github.com/docker/docker/api/server/server.go:111 +0x74f
main.func·007()
        /usr/src/docker/docker/daemon.go:242 +0x5b
created by main.(*DaemonCli).CmdDaemon
        /usr/src/docker/docker/daemon.go:248 +0xd51

goroutine 14 [IO wait]:
net.(*pollDesc).Wait(0xc2081084c0, 0x72, 0x0, 0x0)
        /usr/local/go/src/net/fd_poll_runtime.go:84 +0x47
net.(*pollDesc).WaitRead(0xc2081084c0, 0x0, 0x0)
        /usr/local/go/src/net/fd_poll_runtime.go:89 +0x43
net.(*netFD).readMsg(0xc208108460, 0xc2081da0c0, 0x10, 0x10, 0xc208210020, 0x1000, 0x1000, 0xffffffffffffffff, 0x0, 0x0, ...)
        /usr/local/go/src/net/fd_unix.go:296 +0x54e
net.(*UnixConn).ReadMsgUnix(0xc208038430, 0xc2081da0c0, 0x10, 0x10, 0xc208210020, 0x1000, 0x1000, 0x51, 0xc20820de64, 0x4, ...)
        /usr/local/go/src/net/unixsock_posix.go:147 +0x167
github.com/godbus/dbus.(*oobReader).Read(0xc208210000, 0xc2081da0c0, 0x10, 0x10, 0xc208210000, 0x0, 0x0)
        /usr/src/docker/vendor/src/github.com/godbus/dbus/transport_unix.go:21 +0xc5
io.ReadAtLeast(0x7f022249eb78, 0xc208210000, 0xc2081da0c0, 0x10, 0x10, 0x10, 0x0, 0x0, 0x0)
        /usr/local/go/src/io/io.go:298 +0xf1
io.ReadFull(0x7f022249eb78, 0xc208210000, 0xc2081da0c0, 0x10, 0x10, 0x51, 0x0, 0x0)
        /usr/local/go/src/io/io.go:316 +0x6d
github.com/godbus/dbus.(*unixTransport).ReadMessage(0xc20820c3f0, 0xc20820e390, 0x0, 0x0)
        /usr/src/docker/vendor/src/github.com/godbus/dbus/transport_unix.go:85 +0x1bf
github.com/godbus/dbus.(*Conn).inWorker(0xc208075680)
        /usr/src/docker/vendor/src/github.com/godbus/dbus/conn.go:248 +0x58
created by github.com/godbus/dbus.(*Conn).Auth
        /usr/src/docker/vendor/src/github.com/godbus/dbus/auth.go:118 +0xe84

goroutine 15 [chan receive]:
github.com/godbus/dbus.(*Conn).outWorker(0xc208075680)
        /usr/src/docker/vendor/src/github.com/godbus/dbus/conn.go:370 +0x58
created by github.com/godbus/dbus.(*Conn).Auth
        /usr/src/docker/vendor/src/github.com/godbus/dbus/auth.go:119 +0xea1

goroutine 16 [chan receive]:
github.com/docker/libnetwork/iptables.signalHandler()
        /usr/src/docker/vendor/src/github.com/docker/libnetwork/iptables/firewalld.go:92 +0x57
created by github.com/docker/libnetwork/iptables.FirewalldInit
        /usr/src/docker/vendor/src/github.com/docker/libnetwork/iptables/firewalld.go:48 +0x185

goroutine 51 [chan receive]:
database/sql.(*DB).connectionOpener(0xc2081aeb40)
        /usr/local/go/src/database/sql/sql.go:589 +0x4c
created by database/sql.Open
        /usr/local/go/src/database/sql/sql.go:452 +0x31c

goroutine 55 [chan receive]:
github.com/docker/docker/daemon.(*Daemon).execCommandGC(0xc20807d6c0)
        /usr/src/docker/.gopath/src/github.com/docker/docker/daemon/exec.go:256 +0x8c
created by github.com/docker/docker/daemon.NewDaemon
        /usr/src/docker/.gopath/src/github.com/docker/docker/daemon/daemon.go:742 +0x2518

goroutine 54 [chan receive]:
github.com/docker/docker/daemon.(*statsCollector).run(0xc208243e60)
        /usr/src/docker/.gopath/src/github.com/docker/docker/daemon/stats_collector_unix.go:91 +0xb2
created by github.com/docker/docker/daemon.newStatsCollector
        /usr/src/docker/.gopath/src/github.com/docker/docker/daemon/stats_collector_unix.go:31 +0x116

goroutine 58 [chan receive]:
github.com/docker/docker/pkg/signal.func·002()
        /usr/src/docker/.gopath/src/github.com/docker/docker/pkg/signal/trap.go:29 +0x8f
created by github.com/docker/docker/pkg/signal.Trap
        /usr/src/docker/.gopath/src/github.com/docker/docker/pkg/signal/trap.go:55 +0x250

goroutine 70 [semacquire]:
sync.(*Cond).Wait(0xc20823c3f0)
        /usr/local/go/src/sync/cond.go:62 +0x9e
io.(*pipe).read(0xc20823c3c0, 0xc2082e8000, 0x8000, 0x8000, 0x0, 0x0, 0x0)
        /usr/local/go/src/io/pipe.go:52 +0x303
io.(*PipeReader).Read(0xc208038630, 0xc2082e8000, 0x8000, 0x8000, 0x8000, 0x0, 0x0)
        /usr/local/go/src/io/pipe.go:134 +0x5b
io.Copy(0x7f022249fc98, 0xc208267bc0, 0x7f02224a0888, 0xc208038630, 0x0, 0x0, 0x0)
        /usr/local/go/src/io/io.go:362 +0x1f6
github.com/docker/docker/graph.func·011()
        /usr/src/docker/.gopath/src/github.com/docker/docker/graph/pull_v2.go:224 +0x77
created by github.com/docker/docker/graph.(*v2Puller).pullV2Tag
        /usr/src/docker/.gopath/src/github.com/docker/docker/graph/pull_v2.go:227 +0x5d9

goroutine 72 [chan receive]:
github.com/docker/docker/graph.(*v1Puller).pullRepository(0xc2081dda00, 0xc2081b8b37, 0x6, 0x0, 0x0)
        /usr/src/docker/.gopath/src/github.com/docker/docker/graph/pull_v1.go:204 +0x117d
github.com/docker/docker/graph.(*v1Puller).Pull(0xc2081dda00, 0xc2081b8b37, 0x6, 0x17, 0x0, 0x0)
        /usr/src/docker/.gopath/src/github.com/docker/docker/graph/pull_v1.go:58 +0x6bb
github.com/docker/docker/graph.(*TagStore).Pull(0xc2082213e0, 0xc2081b8b30, 0x6, 0xc2081b8b37, 0x6, 0xc208160c20, 0x0, 0x0)
        /usr/src/docker/.gopath/src/github.com/docker/docker/graph/pull.go:100 +0x719
github.com/docker/docker/api/server.(*Server).postImagesCreate(0xc20806b940, 0xc20806b747, 0x4, 0x7f022249fad0, 0xc2081ce820, 0xc208257c70, 0xc208243440, 0x0, 0x0)
        /usr/src/docker/.gopath/src/github.com/docker/docker/api/server/server.go:775 +0xc49
github.com/docker/docker/api/server.*Server.(github.com/docker/docker/api/server.postImagesCreate)·fm(0xc20806b747, 0x4, 0x7f022249fad0, 0xc2081ce820, 0xc208257c70, 0xc208243440, 0x0, 0x0)
        /usr/src/docker/.gopath/src/github.com/docker/docker/api/server/server.go:1658 +0x7b
github.com/docker/docker/api/server.func·008(0x7f022249fad0, 0xc2081ce820, 0xc208257c70)
        /usr/src/docker/.gopath/src/github.com/docker/docker/api/server/server.go:1614 +0xc8f
net/http.HandlerFunc.ServeHTTP(0xc2081be680, 0x7f022249fad0, 0xc2081ce820, 0xc208257c70)
        /usr/local/go/src/net/http/server.go:1265 +0x41
github.com/gorilla/mux.(*Router).ServeHTTP(0xc2080ccbe0, 0x7f022249fad0, 0xc2081ce820, 0xc208257c70)
        /usr/src/docker/vendor/src/github.com/gorilla/mux/mux.go:98 +0x297
net/http.serverHandler.ServeHTTP(0xc20806c5a0, 0x7f022249fad0, 0xc2081ce820, 0xc208257c70)
        /usr/local/go/src/net/http/server.go:1703 +0x19a
net/http.(*conn).serve(0xc2081ce6e0)
        /usr/local/go/src/net/http/server.go:1204 +0xb57
created by net/http.(*Server).Serve
        /usr/local/go/src/net/http/server.go:1751 +0x35e

goroutine 102 [chan send]:
github.com/docker/docker/graph.(*v2Puller).download(0xc2083f15e0, 0xc2081cc410)
        /usr/src/docker/.gopath/src/github.com/docker/docker/graph/pull_v2.go:148 +0x954
created by github.com/docker/docker/graph.(*v2Puller).pullV2Tag
        /usr/src/docker/.gopath/src/github.com/docker/docker/graph/pull_v2.go:269 +0x20c1

goroutine 103 [chan send]:
github.com/docker/docker/graph.(*v2Puller).download(0xc2083f15e0, 0xc2081cc3c0)
        /usr/src/docker/.gopath/src/github.com/docker/docker/graph/pull_v2.go:148 +0x954
created by github.com/docker/docker/graph.(*v2Puller).pullV2Tag
        /usr/src/docker/.gopath/src/github.com/docker/docker/graph/pull_v2.go:269 +0x20c1

goroutine 144 [chan receive]:
github.com/docker/docker/graph.func·010(0xc20817d240)
        /usr/src/docker/.gopath/src/github.com/docker/docker/graph/pull_v1.go:143 +0x411
created by github.com/docker/docker/graph.(*v1Puller).pullRepository
        /usr/src/docker/.gopath/src/github.com/docker/docker/graph/pull_v1.go:199 +0x10d1

=== END goroutine stack dump ===
elyulka commented 9 years ago

I have also experienced the hang on "Layer already being pulled by another client. Waiting." with docker 1.8.1, build d12ea79. Here is the stack trace.

time="2015-08-25T18:49:46.644301799Z" level=info msg="=== BEGIN goroutine stack dump ===\ngoroutine 11 [running]:\ngithub.com/docker/docker/pkg/signal.DumpStacks()\n\t/go/src/github.com/docker/docker/pkg/signal/trap.go:60 +0x7a\ngithub.com/docker/docker/daemon.func·025()\n\t/go/src/github.com/docker/docker/daemon/debugtrap_unix.go:18 +0x6d\ncreated by github.com/docker/docker/daemon.setupDumpStackTrap\n\t/go/src/github.com/docker/docker/daemon/debugtrap_unix.go:20 +0x18e\n\ngoroutine 1 [chan receive, 130 minutes]:\nmain.(*DaemonCli).CmdDaemon(0xc2080290c0, 0xc208000220, 0x0, 0x0, 0x0, 0x0)\n\t/go/src/github.com/docker/docker/docker/daemon.go:292 +0x1811\nmain.handleGlobalDaemonFlag()\n\t/go/src/github.com/docker/docker/docker/daemon.go:65 +0x22c\nmain.main()\n\t/go/src/github.com/docker/docker/docker/docker.go:59 +0x2e5\n\ngoroutine 5 [syscall]:\nos/signal.loop()\n\t/usr/local/go/src/os/signal/signal_unix.go:21 +0x1f\ncreated by os/signal.init·1\n\t/usr/local/go/src/os/signal/signal_unix.go:27 +0x35\n\ngoroutine 17 [syscall, 132 minutes, locked to thread]:\nruntime.goexit()\n\t/usr/local/go/src/runtime/asm_amd64.s:2232 +0x1\n\ngoroutine 10 [chan receive, 132 minutes]:\ngithub.com/docker/docker/api/server.(*Server).ServeApi(0xc20815ed80, 0xc2080aedc0, 0x2, 0x2, 0x0, 0x0)\n\t/go/src/github.com/docker/docker/api/server/server.go:111 +0x74f\nmain.func·007()\n\t/go/src/github.com/docker/docker/docker/daemon.go:242 +0x5b\ncreated by main.(*DaemonCli).CmdDaemon\n\t/go/src/github.com/docker/docker/docker/daemon.go:248 +0xd51\n\ngoroutine 12 [IO wait, 71 minutes]:\nnet.(*pollDesc).Wait(0xc2081fc060, 0x72, 0x0, 0x0)\n\t/usr/local/go/src/net/fd_poll_runtime.go:84 +0x47\nnet.(*pollDesc).WaitRead(0xc2081fc060, 0x0, 0x0)\n\t/usr/local/go/src/net/fd_poll_runtime.go:89 +0x43\nnet.(*netFD).accept(0xc2081fc000, 0x0, 0x7fae649f1340, 0xc208451a50)\n\t/usr/local/go/src/net/fd_unix.go:419 +0x40b\nnet.(*UnixListener).AcceptUnix(0xc20815c1a0, 0xc20844d860, 0x0, 0x0)\n\t/usr/local/go/src/net/unixsock_posix.go:282 +0x56\nnet.(*UnixListener).Accept(0xc20815c1a0, 0x0, 0x0, 0x0, 0x0)\n\t/usr/local/go/src/net/unixsock_posix.go:293 +0x4c\ngithub.com/docker/docker/pkg/listenbuffer.(*defaultListener).Accept(0xc20815c1c0, 0x0, 0x0, 0x0, 0x0)\n\t/go/src/github.com/docker/docker/pkg/listenbuffer/buffer.go:71 +0x67\nnet/http.(*Server).Serve(0xc20814e0c0, 0x7fae649f9d58, 0xc20815c1c0, 0x0, 0x0)\n\t/usr/local/go/src/net/http/server.go:1728 +0x92\ngithub.com/docker/docker/api/server.(*HttpServer).Serve(0xc20815c440, 0x0, 0x0)\n\t/go/src/github.com/docker/docker/api/server/server.go:126 +0x4d\ngithub.com/docker/docker/api/server.func·002(0x7fae649f9db8, 0xc20815c440)\n\t/go/src/github.com/docker/docker/api/server/server.go:102 +0x40\ncreated by github.com/docker/docker/api/server.(*Server).ServeApi\n\t/go/src/github.com/docker/docker/api/server/server.go:106 +0x6ac\n\ngoroutine 13 [IO wait, 4 minutes]:\nnet.(*pollDesc).Wait(0xc2081fc300, 0x72, 0x0, 0x0)\n\t/usr/local/go/src/net/fd_poll_runtime.go:84 +0x47\nnet.(*pollDesc).WaitRead(0xc2081fc300, 0x0, 0x0)\n\t/usr/local/go/src/net/fd_poll_runtime.go:89 +0x43\nnet.(*netFD).accept(0xc2081fc2a0, 0x0, 0x7fae649f1340, 0xc2086e8620)\n\t/usr/local/go/src/net/fd_unix.go:419 +0x40b\nnet.(*TCPListener).AcceptTCP(0xc208038060, 0xc2086e8610, 0x0, 0x0)\n\t/usr/local/go/src/net/tcpsock_posix.go:234 +0x4e\nnet.(*TCPListener).Accept(0xc208038060, 0x0, 0x0, 0x0, 0x0)\n\t/usr/local/go/src/net/tcpsock_posix.go:244 
+0x4c\ngithub.com/docker/docker/pkg/listenbuffer.(*defaultListener).Accept(0xc20815c520, 0x0, 0x0, 0x0, 0x0)\n\t/go/src/github.com/docker/docker/pkg/listenbuffer/buffer.go:71 +0x67\ncrypto/tls.(*listener).Accept(0xc20815c540, 0x0, 0x0, 0x0, 0x0)\n\t/usr/local/go/src/crypto/tls/tls.go:46 +0x6b\nnet/http.(*Server).Serve(0xc20814e300, 0x7fae649f9ec8, 0xc20815c540, 0x0, 0x0)\n\t/usr/local/go/src/net/http/server.go:1728 +0x92\ngithub.com/docker/docker/api/server.(*HttpServer).Serve(0xc20815c700, 0x0, 0x0)\n\t/go/src/github.com/docker/docker/api/server/server.go:126 +0x4d\ngithub.com/docker/docker/api/server.func·002(0x7fae649f9db8, 0xc20815c700)\n\t/go/src/github.com/docker/docker/api/server/server.go:102 +0x40\ncreated by github.com/docker/docker/api/server.(*Server).ServeApi\n\t/go/src/github.com/docker/docker/api/server/server.go:106 +0x6ac\n\ngoroutine 49 [chan receive, 132 minutes]:\ndatabase/sql.(*DB).connectionOpener(0xc2081c4e60)\n\t/usr/local/go/src/database/sql/sql.go:589 +0x4c\ncreated by database/sql.Open\n\t/usr/local/go/src/database/sql/sql.go:452 +0x31c\n\ngoroutine 50 [chan receive]:\ngithub.com/docker/docker/daemon.(*statsCollector).run(0xc20828c000)\n\t/go/src/github.com/docker/docker/daemon/stats_collector_unix.go:91 +0xb2\ncreated by github.com/docker/docker/daemon.newStatsCollector\n\t/go/src/github.com/docker/docker/daemon/stats_collector_unix.go:31 +0x116\n\ngoroutine 51 [chan receive, 2 minutes]:\ngithub.com/docker/docker/daemon.(*Daemon).execCommandGC(0xc20806e380)\n\t/go/src/github.com/docker/docker/daemon/exec.go:256 +0x8c\ncreated by github.com/docker/docker/daemon.NewDaemon\n\t/go/src/github.com/docker/docker/daemon/daemon.go:742 +0x2518\n\ngoroutine 589 [syscall, 78 minutes]:\nsyscall.Syscall(0x0, 0x17, 0xc20867c000, 0x8000, 0x0, 0x8000, 0xc200000001)\n\t/usr/local/go/src/syscall/asm_linux_amd64.s:21 +0x5\nsyscall.read(0x17, 0xc20867c000, 0x8000, 0x8000, 0x18b29f0, 0x0, 0x0)\n\t/usr/local/go/src/syscall/zsyscall_linux_amd64.go:867 +0x6e\nsyscall.Read(0x17, 0xc20867c000, 0x8000, 0x8000, 0xdd0d80, 0x0, 0x0)\n\t/usr/local/go/src/syscall/syscall_unix.go:136 +0x58\nos.(*File).read(0xc208648fc0, 0xc20867c000, 0x8000, 0x8000, 0xc20867c000, 0x0, 0x0)\n\t/usr/local/go/src/os/file_unix.go:191 +0x5e\nos.(*File).Read(0xc208648fc0, 0xc20867c000, 0x8000, 0x8000, 0x8000, 0x0, 0x0)\n\t/usr/local/go/src/os/file.go:95 +0x91\nio.Copy(0x7fae649fcce0, 0xc20809cd00, 0x7fae649f12f0, 0xc208648fc0, 0x0, 0x0, 0x0)\n\t/usr/local/go/src/io/io.go:362 +0x1f6\nos/exec.func·003(0x0, 0x0)\n\t/usr/local/go/src/os/exec/exec.go:221 +0x7d\nos/exec.func·004(0xc208316300)\n\t/usr/local/go/src/os/exec/exec.go:328 +0x2d\ncreated by os/exec.(*Cmd).Start\n\t/usr/local/go/src/os/exec/exec.go:329 +0xb6d\n\ngoroutine 584 [semacquire, 78 minutes]:\nsync.(*Cond).Wait(0xc208bf47b0)\n\t/usr/local/go/src/sync/cond.go:62 +0x9e\nio.(*pipe).read(0xc208bf4780, 0xc20838e000, 0x400, 0x400, 0x0, 0x0, 0x0)\n\t/usr/local/go/src/io/pipe.go:52 +0x303\nio.(*PipeReader).Read(0xc208648f20, 0xc20838e000, 0x400, 0x400, 0x0, 0x0, 0x0)\n\t/usr/local/go/src/io/pipe.go:134 +0x5b\ngithub.com/docker/docker/pkg/ioutils.(*bufReader).drain(0xc208bf4900)\n\t/go/src/github.com/docker/docker/pkg/ioutils/readers.go:116 +0x10e\ncreated by github.com/docker/docker/pkg/ioutils.NewBufReader\n\t/go/src/github.com/docker/docker/pkg/ioutils/readers.go:86 +0x2f3\n\ngoroutine 585 [semacquire, 71 minutes]:\nsync.(*Cond).Wait(0xc208bf46f0)\n\t/usr/local/go/src/sync/cond.go:62 +0x9e\ngithub.com/docker/docker/pkg/ioutils.(*bufReader).Read(0xc208bf46c0, 
0xc2086a9000, 0x1000, 0x1000, 0x0, 0x7fae649ed080, 0xc2080280d0)\n\t/go/src/github.com/docker/docker/pkg/ioutils/readers.go:210 +0x158\nbufio.(*Reader).fill(0xc208192840)\n\t/usr/local/go/src/bufio/bufio.go:97 +0x1ce\nbufio.(*Reader).ReadSlice(0xc208192840, 0x63f30a, 0x0, 0x0, 0x0, 0x0, 0x0)\n\t/usr/local/go/src/bufio/bufio.go:295 +0x257\nbufio.(*Reader).ReadBytes(0xc208192840, 0xc2086f700a, 0x0, 0x0, 0x0, 0x0, 0x0)\n\t/usr/local/go/src/bufio/bufio.go:374 +0xd2\ngithub.com/docker/docker/daemon/logger.(*Copier).copySrc(0xc20823fa80, 0x10d2300, 0x6, 0x7fae649fccb8, 0xc208bf46c0)\n\t/go/src/github.com/docker/docker/daemon/logger/copier.go:47 +0x96\ncreated by github.com/docker/docker/daemon/logger.(*Copier).Run\n\t/go/src/github.com/docker/docker/daemon/logger/copier.go:38 +0x11c\n\ngoroutine 586 [semacquire, 78 minutes]:\nsync.(*Cond).Wait(0xc208bf4930)\n\t/usr/local/go/src/sync/cond.go:62 +0x9e\ngithub.com/docker/docker/pkg/ioutils.(*bufReader).Read(0xc208bf4900, 0xc20866a000, 0x1000, 0x1000, 0x0, 0x7fae649ed080, 0xc2080280d0)\n\t/go/src/github.com/docker/docker/pkg/ioutils/readers.go:210 +0x158\nbufio.(*Reader).fill(0xc208192900)\n\t/usr/local/go/src/bufio/bufio.go:97 +0x1ce\nbufio.(*Reader).ReadSlice(0xc208192900, 0x41da0a, 0x0, 0x0, 0x0, 0x0, 0x0)\n\t/usr/local/go/src/bufio/bufio.go:295 +0x257\nbufio.(*Reader).ReadBytes(0xc208192900, 0xc208bf490a, 0x0, 0x0, 0x0, 0x0, 0x0)\n\t/usr/local/go/src/bufio/bufio.go:374 +0xd2\ngithub.com/docker/docker/daemon/logger.(*Copier).copySrc(0xc20823fa80, 0x10d22c0, 0x6, 0x7fae649fccb8, 0xc208bf4900)\n\t/go/src/github.com/docker/docker/daemon/logger/copier.go:47 +0x96\ncreated by github.com/docker/docker/daemon/logger.(*Copier).Run\n\t/go/src/github.com/docker/docker/daemon/logger/copier.go:38 +0x11c\n\ngoroutine 588 [syscall, 71 minutes]:\nsyscall.Syscall(0x0, 0x14, 0xc208570000, 0x8000, 0x63f30b, 0xc20809cdf0, 0xffffffff)\n\t/usr/local/go/src/syscall/asm_linux_amd64.s:21 +0x5\nsyscall.read(0x14, 0xc208570000, 0x8000, 0x8000, 0xc208570000, 0x0, 0x0)\n\t/usr/local/go/src/syscall/zsyscall_linux_amd64.go:867 +0x6e\nsyscall.Read(0x14, 0xc208570000, 0x8000, 0x8000, 0x0, 0x0, 0x0)\n\t/usr/local/go/src/syscall/syscall_unix.go:136 +0x58\nos.(*File).read(0xc208648fa0, 0xc208570000, 0x8000, 0x8000, 0xc20853cfc0, 0x0, 0x0)\n\t/usr/local/go/src/os/file_unix.go:191 +0x5e\nos.(*File).Read(0xc208648fa0, 0xc208570000, 0x8000, 0x8000, 0x460, 0x0, 0x0)\n\t/usr/local/go/src/os/file.go:95 +0x91\nio.Copy(0x7fae649fcce0, 0xc20809cdf0, 0x7fae649f12f0, 0xc208648fa0, 0x1484, 0x0, 0x0)\n\t/usr/local/go/src/io/io.go:362 +0x1f6\nos/exec.func·003(0x0, 0x0)\n\t/usr/local/go/src/os/exec/exec.go:221 +0x7d\nos/exec.func·004(0xc2083162a0)\n\t/usr/local/go/src/os/exec/exec.go:328 +0x2d\ncreated by os/exec.(*Cmd).Start\n\t/usr/local/go/src/os/exec/exec.go:329 +0xb6d\n\ngoroutine 590 [syscall, 78 minutes]:\nsyscall.Syscall(0x0, 0x2e, 0xc20819f750, 0x8, 0x0, 0x0, 0x0)\n\t/usr/local/go/src/syscall/asm_linux_amd64.s:21 +0x5\nsyscall.read(0x2e, 0xc20819f750, 0x8, 0x8, 0x7fae649f76d8, 0x0, 0x0)\n\t/usr/local/go/src/syscall/zsyscall_linux_amd64.go:867 +0x6e\nsyscall.Read(0x2e, 0xc20819f750, 0x8, 0x8, 0xc20814c5d0, 0x0, 0x0)\n\t/usr/local/go/src/syscall/syscall_unix.go:136 +0x58\nos.(*File).read(0xc2086492e0, 0xc20819f750, 0x8, 0x8, 0xc208066000, 0x0, 0x0)\n\t/usr/local/go/src/os/file_unix.go:191 +0x5e\nos.(*File).Read(0xc2086492e0, 0xc20819f750, 0x8, 0x8, 0xc2083aaa90, 0x0, 0x0)\n\t/usr/local/go/src/os/file.go:95 
+0x91\ngithub.com/opencontainers/runc/libcontainer.func·009()\n\t/go/src/github.com/docker/docker/vendor/src/github.com/opencontainers/runc/libcontainer/notify_linux.go:51 +0x18c\ncreated by github.com/opencontainers/runc/libcontainer.notifyOnOOM\n\t/go/src/github.com/docker/docker/vendor/src/github.com/opencontainers/runc/libcontainer/notify_linux.go:61 +0x887\n\ngoroutine 582 [syscall, 78 minutes]:\nsyscall.Syscall6(0x3d, 0x7de, 0xc2086cf4ec, 0x0, 0xc2086734d0, 0x0, 0x0, 0x44322c, 0x4436b2, 0xf1f6e0)\n\t/usr/local/go/src/syscall/asm_linux_amd64.s:46 +0x5\nsyscall.wait4(0x7de, 0xc2086cf4ec, 0x0, 0xc2086734d0, 0x90, 0x0, 0x0)\n\t/usr/local/go/src/syscall/zsyscall_linux_amd64.go:124 +0x79\nsyscall.Wait4(0x7de, 0xc2086cf534, 0x0, 0xc2086734d0, 0x41becb, 0x0, 0x0)\n\t/usr/local/go/src/syscall/syscall_linux.go:224 +0x60\nos.(*Process).wait(0xc208316460, 0xc208192e60, 0x0, 0x0)\n\t/usr/local/go/src/os/exec_unix.go:22 +0x103\nos.(*Process).Wait(0xc208316460, 0xc2086cf5f8, 0x0, 0x0)\n\t/usr/local/go/src/os/doc.go:45 +0x3a\nos/exec.(*Cmd).Wait(0xc2086f4dc0, 0x0, 0x0)\n\t/usr/local/go/src/os/exec/exec.go:364 +0x23c\ngithub.com/opencontainers/runc/libcontainer.(*initProcess).wait(0xc20845a640, 0x1b6, 0x0, 0x0)\n\t/go/src/github.com/docker/docker/vendor/src/github.com/opencontainers/runc/libcontainer/process_linux.go:224 +0x3d\ngithub.com/opencontainers/runc/libcontainer.Process.Wait(0xc2082247b0, 0x3, 0x3, 0xc20825e240, 0x24, 0x24, 0x18c1a00, 0x0, 0x18c1a00, 0x0, ...)\n\t/go/src/github.com/docker/docker/vendor/src/github.com/opencontainers/runc/libcontainer/process.go:60 +0x11d\ngithub.com/opencontainers/runc/libcontainer.Process.Wait·fm(0xc2086cfac0, 0x0, 0x0)\n\t/go/src/github.com/docker/docker/daemon/execdriver/native/driver.go:173 +0x58\ngithub.com/docker/docker/daemon/execdriver/native.(*driver).Run(0xc208271720, 0xc208344e00, 0xc208224660, 0xc2083aaaa0, 0x0, 0x41c000, 0x0, 0x0)\n\t/go/src/github.com/docker/docker/daemon/execdriver/native/driver.go:179 +0xa0a\ngithub.com/docker/docker/daemon.(*Daemon).Run(0xc20806e380, 0xc2084f81e0, 0xc208224660, 0xc2083aaaa0, 0x0, 0x0, 0x0, 0x0)\n\t/go/src/github.com/docker/docker/daemon/daemon.go:821 +0x95\ngithub.com/docker/docker/daemon.(*containerMonitor).Start(0xc2081ba3f0, 0x0, 0x0)\n\t/go/src/github.com/docker/docker/daemon/monitor.go:141 +0x457\ngithub.com/docker/docker/daemon.*containerMonitor.Start·fm(0x0, 0x0)\n\t/go/src/github.com/docker/docker/daemon/container.go:779 +0x39\ngithub.com/docker/docker/pkg/promise.func·001()\n\t/go/src/github.com/docker/docker/pkg/promise/promise.go:8 +0x2f\ncreated by github.com/docker/docker/pkg/promise.Go\n\t/go/src/github.com/docker/docker/pkg/promise/promise.go:9 +0xfb\n\ngoroutine 65 [select]:\ngithub.com/docker/libnetwork/sandbox.removeUnusedPaths()\n\t/go/src/github.com/docker/docker/vendor/src/github.com/docker/libnetwork/sandbox/namespace_linux.go:71 +0x48b\ncreated by github.com/docker/libnetwork/sandbox.createBasePath\n\t/go/src/github.com/docker/docker/vendor/src/github.com/docker/libnetwork/sandbox/namespace_linux.go:56 +0xc9\n\ngoroutine 583 [semacquire, 71 minutes]:\nsync.(*Cond).Wait(0xc208bf4570)\n\t/usr/local/go/src/sync/cond.go:62 +0x9e\nio.(*pipe).read(0xc208bf4540, 0xc208251c00, 0x400, 0x400, 0x0, 0x0, 0x0)\n\t/usr/local/go/src/io/pipe.go:52 +0x303\nio.(*PipeReader).Read(0xc208648f10, 0xc208251c00, 0x400, 0x400, 0x60, 0x0, 0x0)\n\t/usr/local/go/src/io/pipe.go:134 
+0x5b\ngithub.com/docker/docker/pkg/ioutils.(*bufReader).drain(0xc208bf46c0)\n\t/go/src/github.com/docker/docker/pkg/ioutils/readers.go:116 +0x10e\ncreated by github.com/docker/docker/pkg/ioutils.NewBufReader\n\t/go/src/github.com/docker/docker/pkg/ioutils/readers.go:86 +0x2f3\n\ngoroutine 75 [chan receive, 130 minutes]:\ngithub.com/docker/docker/pkg/signal.func·002()\n\t/go/src/github.com/docker/docker/pkg/signal/trap.go:29 +0x8f\ncreated by github.com/docker/docker/pkg/signal.Trap\n\t/go/src/github.com/docker/docker/pkg/signal/trap.go:55 +0x250\n\ngoroutine 573 [chan receive, 63 minutes]:\ngithub.com/docker/docker/graph.(*v1Puller).pullRepository(0xc208255c80, 0xc2083cf41e, 0x28, 0x0, 0x0)\n\t/go/src/github.com/docker/docker/graph/pull_v1.go:204 +0x117d\ngithub.com/docker/docker/graph.(*v1Puller).Pull(0xc208255c80, 0xc2083cf41e, 0x28, 0x17, 0x0, 0x0)\n\t/go/src/github.com/docker/docker/graph/pull_v1.go:58 +0x6bb\ngithub.com/docker/docker/graph.(*TagStore).Pull(0xc2081495c0, 0xc208456f20, 0x15, 0xc2083cf41e, 0x28, 0xc208456f40, 0x0, 0x0)\n\t/go/src/github.com/docker/docker/graph/pull.go:100 +0x719\ngithub.com/docker/docker/api/server.(*Server).postImagesCreate(0xc20815ed80, 0xc2083cf407, 0x4, 0x7fae64a01ef0, 0xc2086c4960, 0xc2088e96c0, 0xc208479e90, 0x0, 0x0)\n\t/go/src/github.com/docker/docker/api/server/server.go:775 +0xc49\ngithub.com/docker/docker/api/server.*Server.(github.com/docker/docker/api/server.postImagesCreate)·fm(0xc2083cf407, 0x4, 0x7fae64a01ef0, 0xc2086c4960, 0xc2088e96c0, 0xc208479e90, 0x0, 0x0)\n\t/go/src/github.com/docker/docker/api/server/server.go:1658 +0x7b\ngithub.com/docker/docker/api/server.func·008(0x7fae64a01ef0, 0xc2086c4960, 0xc2088e96c0)\n\t/go/src/github.com/docker/docker/api/server/server.go:1614 +0xc8f\nnet/http.HandlerFunc.ServeHTTP(0xc208035480, 0x7fae64a01ef0, 0xc2086c4960, 0xc2088e96c0)\n\t/usr/local/go/src/net/http/server.go:1265 +0x41\ngithub.com/gorilla/mux.(*Router).ServeHTTP(0xc20815b0e0, 0x7fae64a01ef0, 0xc2086c4960, 0xc2088e96c0)\n\t/go/src/github.com/docker/docker/vendor/src/github.com/gorilla/mux/mux.go:98 +0x297\nnet/http.serverHandler.ServeHTTP(0xc20814e300, 0x7fae64a01ef0, 0xc2086c4960, 0xc2088e96c0)\n\t/usr/local/go/src/net/http/server.go:1703 +0x19a\nnet/http.(*conn).serve(0xc2084fbc20)\n\t/usr/local/go/src/net/http/server.go:1204 +0xb57\ncreated by net/http.(*Server).Serve\n\t/usr/local/go/src/net/http/server.go:1751 +0x35e\n\ngoroutine 102 [semacquire, 117 minutes]:\nsync.(*Cond).Wait(0xc20800bbf0)\n\t/usr/local/go/src/sync/cond.go:62 +0x9e\nio.(*pipe).read(0xc20800bbc0, 0xc208556000, 0x8000, 0x8000, 0x0, 0x0, 0x0)\n\t/usr/local/go/src/io/pipe.go:52 +0x303\nio.(*PipeReader).Read(0xc208038c98, 0xc208556000, 0x8000, 0x8000, 0x46, 0x0, 0x0)\n\t/usr/local/go/src/io/pipe.go:134 +0x5b\nio.Copy(0x7fae64a023a0, 0xc20824c0f0, 0x7fae649fcc60, 0xc208038c98, 0x18319, 0x0, 0x0)\n\t/usr/local/go/src/io/io.go:362 +0x1f6\ngithub.com/docker/docker/graph.func·011()\n\t/go/\n=== END goroutine stack dump ===" 
gengmao commented 9 years ago

@euank Could you please point out where the stack dump is? I tried sudo kill -SIGUSR1 $(pidof docker) but couldn't find the stack dump location.

itzg commented 9 years ago

@gengmao, I couldn't find it at first either :). On upstart systems (Ubuntu, pre-15.04), the docker.log is under /var/log/upstart.

gengmao commented 9 years ago

@itzg Now I see it in /var/log/upstart/docker.log. Thanks! My stack dump is similar to yours.

Starefossen commented 9 years ago

I am experiencing this exact same issue with Docker 1.8.1 on Ubuntu 14.04.

tobkle commented 9 years ago

Same problem while pulling Ubuntu 14.04; it never completes.

xnox commented 9 years ago

me too on clearlinux.org

itzg commented 9 years ago

Since it's a slow day at work and the corporate firewall (SSL cert swapping) seems to be a contributing factor, I think I'll poke around in the Docker code today ;)

itzg commented 9 years ago

With debug enabled on the daemon, I can now see that it is failing with a cert-mismatch type of error, which repeats every few seconds:

DEBU[0049] Error statting layer: Head https://dseasb33srnrn.cloudfront.net/registry-v2/docker/registry/v2/blobs/sha256/a3/a3ed95caeb02ffe68cdd9fd84406680ae93d633cb16422d00e8a7c22955b46d4/data?Expires=1441032943&Signature=...snip...&Key-Pair-Id=...snip...: x509: certificate signed by unknown authority 

even with /etc/ssl/certs/ca-certificates.crt updated with my company's proxy certs. It's acting as if the OS's intermediate CAs are not being used as they were before.
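
For anyone wanting to see the same output, enabling daemon debug logging on Ubuntu 14.04 looked roughly like this for me (a sketch; it assumes the upstart job reads the stock /etc/default/docker):

echo 'DOCKER_OPTS="$DOCKER_OPTS -D"' | sudo tee -a /etc/default/docker   # add the daemon debug flag
sudo service docker restart
sudo tail -f /var/log/upstart/docker.log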

itzg commented 9 years ago

Something might have changed with Ubuntu 14.04.3 itself. After discovering that wget wasn't handling https either, I noticed that update-ca-certificates mentions the CA bundle is compiled from /usr/local/share/ca-certificates.

So I put my PEM files in that directory, renamed their suffix to ".crt", and re-ran update-ca-certificates. Now I am able to run/pull images from the public registry.
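
In concrete terms, the fix was roughly the following (a sketch; corporate-proxy-ca.pem stands in for whatever certificate your proxy presents):

sudo cp corporate-proxy-ca.pem /usr/local/share/ca-certificates/corporate-proxy-ca.crt   # must use the .crt suffix
sudo update-ca-certificates
sudo service docker restart   # restart so the daemon picks up the rebuilt bundle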

anthonygreen commented 9 years ago

I am experiencing this exact same issue with Docker 1.8.1 on Ubuntu 14.04.

ghost commented 9 years ago

Seeing this with docker 1.8.1 and jasonwyatt/nginx-loadbalancer:latest

andrefreitas commented 9 years ago

Having the same problem with docker 1.8.1 pulling rancher/server or ubuntu on OS X with docker machine.

cpuguy83 commented 9 years ago

1.8.x has a known issue which is fixed in the 1.8.2-rc, which you can download from here: https://github.com/docker/docker/releases/tag/v1.8.2-rc1

It is likely that most issues related to this are now resolved with the changes in 1.8.2, but specifically 1.8.0-1.8.1 have a known regression with this.

whiteadam commented 9 years ago

Landed here to give a +1 only 2 mins after @cpuguy83 clarified the issue.

I got it on tomcat:7 just now (c0ed0d490813) with docker v1.8.1

FredericHeem commented 9 years ago

Yet another random dude giving a +1: v1.8.1 on Ubuntu 14.04 with a private docker registry.

thaJeztah commented 9 years ago

For those having this issue: it would help to test whether 1.8.2-rc1 resolves this for you; see https://github.com/docker/docker/issues/12823#issuecomment-137729547. We're aware of this issue and it should be resolved, so no need to +1 here.

jbaumgartner-ndi commented 9 years ago

I was having this issue repeatedly. I just tested with 1.8.2-rc1 and the issue has not presented itself. Thanks a lot. I hope this release makes it to get.docker.com quickly.

FredericHeem commented 9 years ago

Unfortunately, same problem with 1.8.2-rc

Kindrat commented 9 years ago

Yep, same in 1.8.2-rc:

<none>              <none>              f1b10cd84249        4 months ago        0 B

cpuguy83 commented 9 years ago

@FredericHeem @Kindrat Did you upgrade both the daemon and the client?

Kindrat commented 9 years ago

Yes. We spin up an AWS node with the docker daemon, connect to it using the docker-java client, and pull an image from a private repository. All components are on the latest versions. The image is pretty big, and it causes a read timeout on the Java client. When we check the AWS node, it has just one "empty" image (0 B). The pull never ends.

rflynn commented 9 years ago

We've run into this issue twice in the past week on 1.8.1

cpuguy83 commented 9 years ago

@rflynn 1.8.2 is now released, which should fix the issue in most cases.

rfc1459 commented 9 years ago

I'm seeing this issue with docker 1.8.2 and the ZFS graph driver when trying to pull the postgres:9.4 image.

The interesting thing is the last layer:

506c40f60539: Download complete 
506c40f60539: Layer already being pulled by another client. Waiting.

Diag info:

# docker info
Containers: 0
Images: 35
Storage Driver: zfs
 Zpool: storage
 Zpool Health: ONLINE
 Parent Dataset: storage/docker
 Space Used By Parent: 2058641408
 Space Available: 262142959616
 Parent Quota: no
 Compression: lz4
Execution Driver: native-0.2
Logging Driver: json-file
Kernel Version: 3.19.0-28-generic
Operating System: Ubuntu 14.04.3 LTS
CPUs: 4
Total Memory: 7.494 GiB
Name: <redacted>
ID: ALCA:HDVI:R66Q:T5QC:NZXQ:UEHY:BSRN:AENM:QJEH:V7MP:7OVB:LECG
# docker version
Client:
 Version:      1.8.2
 API version:  1.20
 Go version:   go1.4.2
 Git commit:   0a8c2e3
 Built:        Thu Sep 10 19:19:00 UTC 2015
 OS/Arch:      linux/amd64

Server:
 Version:      1.8.2
 API version:  1.20
 Go version:   go1.4.2
 Git commit:   0a8c2e3
 Built:        Thu Sep 10 19:19:00 UTC 2015
 OS/Arch:      linux/amd64
Kindrat commented 9 years ago

We've fixed this issue by pulling images sequentially instead of with parallel requests. So it's worth checking the thread-management logic.
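
As an illustration, the workaround amounts to waiting for each pull to finish before starting the next, e.g. (a sketch; the image list is just an example taken from this thread):

for img in ubuntu:trusty postgres:9.4 rancher/server; do
    docker pull "$img" || break   # each pull completes before the next one starts
done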

rfc1459 commented 9 years ago

Running docker in debug mode shows that the root cause of my pull issue is actually #16258.

Still, instead of aborting the pull altogether, the daemon tries to fall back to v1 and subsequently deadlocks because the v2 layer fetch has not yet been aborted:

DEBU[0011] Error trying v2 registry: error creating zfs mount of storage/docker/d3a1f33e8a5a513092f01bb7eb1c2abf4d711e5105390a3fe1ae2248cfde1391 to /var/lib/docker/zfs/graph/d3a1f33e8a5a513092f01bb7eb1c2abf4d711e5105390a3fe1ae2248cfde1391: invalid argument 
DEBU[0011] Trying to pull ubuntu from https://index.docker.io v1