moby / moby

The Moby Project - a collaborative project for the container ecosystem to assemble container-based systems
https://mobyproject.org/
Apache License 2.0
68.7k stars 18.66k forks source link

Docker run command is taking around 10 min to return container id only on RHEL9.1 #46087

Open rahulilla opened 1 year ago

rahulilla commented 1 year ago

Description

Here are our system info Platform RHEL9.1 Docker version:

Client: Docker Engine - Community
 Version:           23.0.1
 API version:       1.42
 Go version:        go1.19.5
 Git commit:        a5ee5b1
 Built:             Thu Feb  9 19:49:35 2023
 OS/Arch:           linux/amd64
 Context:           default

This issue is happening Intermittantly

Whenever we try to run docker run commands for images which are already downloaded/pulled the run command just hangs and returns containerid only after 8-10 min.

when i turned on the debug logging for docker daemon it seems there is a 7 min difference between these two log lines

Jul 26 17:20:11  dockerd[2213991]: time="2023-07-26T17:20:11.278168573Z" level=debug msg="form data: {\"AttachStderr\":false
Jul 26 17:27:02 dockerd[2213991]: time="2023-07-26T17:27:02.286866485Z" level=debug msg="container mounted via layerStore: &{/home/docker/overlay2/***/merged 0x55620155de60 0x55620155de60}" container=**

Reproduce

docker run -d

Expected behavior

docker run should return container id immediately

docker version

[root@atm-10-168-1-110-gcp ~]# docker version
Client: Docker Engine - Community
 Version:           23.0.1
 API version:       1.42
 Go version:        go1.19.5
 Git commit:        a5ee5b1
 Built:             Thu Feb  9 19:49:35 2023
 OS/Arch:           linux/amd64
 Context:           default

Server: Docker Engine - Community
 Engine:
  Version:          23.0.1
  API version:      1.42 (minimum version 1.12)
  Go version:       go1.19.5
  Git commit:       bc3805a
  Built:            Thu Feb  9 19:46:32 2023
  OS/Arch:          linux/amd64
  Experimental:     false
 containerd:
  Version:          1.6.18
  GitCommit:        2456e983eb9e37e47538f59ea18f2043c9a73640
 runc:
  Version:          1.1.4
  GitCommit:        v1.1.4-0-g5fd4c4d
 docker-init:
  Version:          0.19.0

docker info

[root@atm-10-168-1-110-gcp ~]# docker info
Client:
 Context:    default
 Debug Mode: false

Server:
 Containers: 3
  Running: 2
  Paused: 0
  Stopped: 1
 Images: 30
 Server Version: 23.0.1
 Storage Driver: overlay2
  Backing Filesystem: extfs
  Supports d_type: true
  Using metacopy: false
  Native Overlay Diff: true
  userxattr: false
 Logging Driver: json-file
 Cgroup Driver: systemd
 Cgroup Version: 2
 Plugins:
  Volume: local
  Network: bridge host ipvlan macvlan null overlay
  Log: awslogs fluentd gcplogs gelf journald json-file local logentries splunk syslog
 Swarm: inactive
 Runtimes: io.containerd.runc.v2 runc
 Default Runtime: runc
 Init Binary: docker-init
 containerd version: 2456e983eb9e37e47538f59ea18f2043c9a73640
 runc version: v1.1.4-0-g5fd4c4d
 init version: de40ad0
 Security Options:
  seccomp
   Profile: builtin
  cgroupns
 Kernel Version: 5.14.0-162.6.1.el9_1.x86_64
 Operating System: Red Hat Enterprise Linux 9.1 (Plow)
 OSType: linux
 Architecture: x86_64
 CPUs: 16
 Total Memory: 125.3GiB
 Name: *****
 ID: fbe5c1ff-0218-4d7c-9a3a-64aa9c2d7ec7
 Docker Root Dir: /home/docker
 Debug Mode: true
  File Descriptors: 36
  Goroutines: 43
  System Time: 2023-07-27T10:53:09.93405536Z
  EventsListeners: 1
 HTTP Proxy: *
 HTTPS Proxy: *
 No Proxy: sfdc.net,salesforce.com,gcr.io,googleapis.com,localhost
 Registry: https://index.docker.io/v1/
 Experimental: false
 Insecure Registries:
  127.0.0.0/8
 Live Restore Enabled: false
 Default Address Pools:
   Base: 172.16.0.0/22, Size: 26

Additional Info

Docker run command hangs and returns container id only after 10 min

neersighted commented 1 year ago

You're on a pretty old containerd version; can you try with 1.6.21?

rahulilla commented 1 year ago

@neersighted Thanks for response i tried with containerd version 1.6.21 and docker ce version - the behaviour is still same - taking 7 min to return the container id - The delay is during the mounting of container to layer store - Its a big image of oracle (9gb) can that cause this delay ? Client: Docker Engine - Community Version: 24.0.2 API version: 1.43 Go version: go1.20.4 Git commit: cb74dfc Built: Thu May 25 21:53:24 2023 OS/Arch: linux/amd64 Context: default

Server: Docker Engine - Community Engine: Version: 24.0.2 API version: 1.43 (minimum version 1.12) Go version: go1.20.4 Git commit: 659604f Built: Thu May 25 21:51:50 2023 OS/Arch: linux/amd64 Experimental: false containerd: Version: 1.6.21 GitCommit: 3dce8eb055cbb6872793272b4f20ed16117344f8

rahulilla commented 1 year ago

Observation: when we tested the same image on centos7 - we are not seeing this time delay for docker run command Its happening only on RHEL9.1

bsousaa commented 1 year ago

Can you dump a stack trace? More details at https://docs.docker.com/config/daemon/logs/#force-a-stack-trace-to-be-logged

rahulilla commented 1 year ago

goroutine-stacks-2023-07-28T161356Z.log.zip @bsousaa

thaJeztah commented 1 year ago

Last chunk of that dump shows something around unmounting;

goroutine 19534 [syscall, 3 minutes]:
syscall.Syscall(0x62?, 0xc000d8b400?, 0xc000c96070?, 0x63?)
    /usr/local/go/src/syscall/syscall_linux.go:69 +0x27
github.com/docker/docker/vendor/golang.org/x/sys/unix.Unmount({0xc000c96070?, 0xc000c96070?}, 0x45?)
    /root/rpmbuild/BUILD/src/engine/.gopath/src/github.com/docker/docker/vendor/golang.org/x/sys/unix/zsyscall_linux.go:1701 +0x7d
github.com/docker/docker/daemon/graphdriver/overlay2.(*Driver).Put(0xc0009a8380, {0xc0013fd5e0, 0x45})
    /root/rpmbuild/BUILD/src/engine/.gopath/src/github.com/docker/docker/daemon/graphdriver/overlay2/overlay.go:633 +0x2eb
github.com/docker/docker/layer.(*layerStore).initMount(0xc0000c6280, {0xc0010975c0?, 0xc0013fd590?}, {0xc00023dcc0, 0x40}, {0x0, 0x0}, 0xc000cc9440, 0x0)
    /root/rpmbuild/BUILD/src/engine/.gopath/src/github.com/docker/docker/layer/layer_store.go:686 +0x198
github.com/docker/docker/layer.(*layerStore).CreateRWLayer(0xc0000c6280, {0xc001097180, 0x40}, {0xc0013fd590, 0x47}, 0x556669b9ca20?)
    /root/rpmbuild/BUILD/src/engine/.gopath/src/github.com/docker/docker/layer/layer_store.go:536 +0x4a9
github.com/docker/docker/daemon/images.(*ImageService).CreateLayer(0xc000c9c2c0, 0xc00061a500, 0xc000cc9440)
    /root/rpmbuild/BUILD/src/engine/.gopath/src/github.com/docker/docker/daemon/images/service.go:135 +0x13e
github.com/docker/docker/daemon.(*Daemon).create(0xc0006286c0, {0x55666a887d10, 0xc000ad2bd0}, {{{0xc001096263, 0x9}, 0xc000618780, 0xc000dccd80, 0xc000014238, 0x0, 0x0}, ...})
    /root/rpmbuild/BUILD/src/engine/.gopath/src/github.com/docker/docker/daemon/create.go:191 +0x5f0
github.com/docker/docker/daemon.(*Daemon).containerCreate(0xc0006286c0, {0x55666a887d10, 0xc000ad2bd0}, {{{0xc001096263, 0x9}, 0xc000618780, 0xc000dccd80, 0xc000014238, 0x0, 0x0}, ...})
    /root/rpmbuild/BUILD/src/engine/.gopath/src/github.com/docker/docker/daemon/create.go:102 +0x685
github.com/docker/docker/daemon.(*Daemon).ContainerCreate(0xc0012f2460?, {0x55666a887d10?, 0xc000ad2bd0?}, {{0xc001096263, 0x9}, 0xc000618780, 0xc000dccd80, 0xc000014238, 0x0, 0x0})
    /root/rpmbuild/BUILD/src/engine/.gopath/src/github.com/docker/docker/daemon/create.go:45 +0xe5
github.com/docker/docker/api/server/router/container.(*containerRouter).postContainersCreate(0xc0010ad180, {0x55666a887d10, 0xc000ad2bd0}, {0x55666a8864d0, 0xc0006362a0}, 0xc000136e00, 0xc0005e0580?)
    /root/rpmbuild/BUILD/src/engine/.gopath/src/github.com/docker/docker/api/server/router/container/container_routes.go:591 +0x8d6
github.com/docker/docker/api/server/middleware.ExperimentalMiddleware.WrapHandler.func1({0x55666a887d10, 0xc000ad2bd0}, {0x55666a8864d0?, 0xc0006362a0?}, 0x55666a370840?, 0xc0012f34b0?)
    /root/rpmbuild/BUILD/src/engine/.gopath/src/github.com/docker/docker/api/server/middleware/experimental.go:26 +0x15b
github.com/docker/docker/api/server/middleware.VersionMiddleware.WrapHandler.func1({0x55666a887d10, 0xc000ad2570}, {0x55666a8864d0, 0xc0006362a0}, 0xc000128040?, 0xc00113e208?)
    /root/rpmbuild/BUILD/src/engine/.gopath/src/github.com/docker/docker/api/server/middleware/version.go:62 +0x4d7
github.com/docker/docker/pkg/authorization.(*Middleware).WrapHandler.func1({0x55666a887d10, 0xc000ad2570}, {0x55666a8864d0?, 0xc0006362a0?}, 0xc000136e00, 0x1?)
    /root/rpmbuild/BUILD/src/engine/.gopath/src/github.com/docker/docker/pkg/authorization/middleware.go:59 +0x649
github.com/docker/docker/api/server/middleware.DebugRequestMiddleware.func1({0x55666a887d10, 0xc000ad2570}, {0x55666a8864d0, 0xc0006362a0}, 0xc000136e00, 0xc0012f2270?)
    /root/rpmbuild/BUILD/src/engine/.gopath/src/github.com/docker/docker/api/server/middleware/debug.go:53 +0x5c5
github.com/docker/docker/api/server.(*Server).makeHTTPHandler.func1({0x55666a8864d0, 0xc0006362a0}, 0xc000136d00)
    /root/rpmbuild/BUILD/src/engine/.gopath/src/github.com/docker/docker/api/server/server.go:53 +0x1ce
net/http.HandlerFunc.ServeHTTP(0xc000136b00?, {0x55666a8864d0?, 0xc0006362a0?}, 0xc000d979e8?)
    /usr/local/go/src/net/http/server.go:2122 +0x2f
github.com/docker/docker/vendor/github.com/gorilla/mux.(*Router).ServeHTTP(0xc000001980, {0x55666a8864d0, 0xc0006362a0}, 0xc000136a00)
    /root/rpmbuild/BUILD/src/engine/.gopath/src/github.com/docker/docker/vendor/github.com/gorilla/mux/mux.go:210 +0x1cf
net/http.serverHandler.ServeHTTP({0x55666a876ba0?}, {0x55666a8864d0, 0xc0006362a0}, 0xc000136a00)
    /usr/local/go/src/net/http/server.go:2936 +0x316
net/http.(*conn).serve(0xc000d00120, {0x55666a887d10, 0xc000a2d3b0})
    /usr/local/go/src/net/http/server.go:1995 +0x612
created by net/http.(*Server).Serve
    /usr/local/go/src/net/http/server.go:3089 +0x5ed

Following the last few bits of that (note: dump above is in reverse order);

https://github.com/moby/moby/blob/659604f9ee60f147020bdd444b26e4b5c636dc28/daemon/create.go#L45 https://github.com/moby/moby/blob/659604f9ee60f147020bdd444b26e4b5c636dc28/daemon/create.go#L102 https://github.com/moby/moby/blob/659604f9ee60f147020bdd444b26e4b5c636dc28/daemon/create.go#L191 https://github.com/moby/moby/blob/659604f9ee60f147020bdd444b26e4b5c636dc28/daemon/images/service.go#L135 https://github.com/moby/moby/blob/659604f9ee60f147020bdd444b26e4b5c636dc28/layer/layer_store.go#L536 https://github.com/moby/moby/blob/659604f9ee60f147020bdd444b26e4b5c636dc28/layer/layer_store.go#L686 https://github.com/moby/moby/blob/659604f9ee60f147020bdd444b26e4b5c636dc28/daemon/graphdriver/overlay2/overlay.go#L633 https://github.com/moby/moby/blob/659604f9ee60f147020bdd444b26e4b5c636dc28/vendor/golang.org/x/sys/unix/zsyscall_linux.go#L1701 https://github.com/moby/moby/blob/659604f9ee60f147020bdd444b26e4b5c636dc28/vendor/golang.org/x/sys/unix/syscall_linux.go#L69

That last line is part of this function: https://github.com/moby/moby/blob/659604f9ee60f147020bdd444b26e4b5c636dc28/vendor/golang.org/x/sys/unix/syscall_linux.go#L65-L75

rahulilla commented 1 year ago

@thaJeztah what is it trying to unmount ? is it unable to mount the container ?

rahulilla commented 1 year ago

hi @thaJeztah could you please clarify ?