docker / for-linux

Docker Engine for Linux
https://docs.docker.com/engine/installation/

Spawning large amount of docker instances fails with pthread_create failed: Resource temporarily unavailable #343

Open gz opened 6 years ago

gz commented 6 years ago

We're trying to spawn a large number of small Docker containers on our server machines (each has a memory footprint of around 10 MiB). However, Docker crashes after spawning approximately 400 instances with the following error message:

runtime/cgo: pthread_create failed: Resource temporarily unavailable

We were unable to pinpoint the exact resource limit responsible. The system still has more than enough memory available (~100 GiB), and the typical limits we checked are set high enough, i.e.:

A different machine with a more recent kernel (version 4.15) allowed us to spawn ~4k containers, but it failed with the same error long before running out of memory.
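Since memory is not the bottleneck, it can help to compare the number of threads currently alive on the host against the kernel's thread limits at the moment the error appears. The following is only a rough sketch, assuming a Linux `/proc` filesystem; `count_threads` is a hypothetical helper name, not part of Docker or any library.

```python
from pathlib import Path


def count_threads(proc_root="/proc"):
    """Sum the `Threads:` field of every /proc/<pid>/status file."""
    total = 0
    for entry in Path(proc_root).iterdir():
        if not entry.name.isdigit():
            continue  # skip non-process entries such as /proc/sys
        try:
            for line in (entry / "status").read_text().splitlines():
                if line.startswith("Threads:"):
                    total += int(line.split()[1])
                    break
        except OSError:
            continue  # the process exited while we were scanning
    return total


if __name__ == "__main__":
    print("threads currently alive:", count_threads())
```

Running this while the containers are being spawned gives a number to compare against values such as `kernel.threads-max` and `kernel.pid_max`.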

The same issue is discussed here, but the proposed fixes did not resolve the problem for us: https://github.com/moby/moby/issues/9868

I realize this may not be a Docker problem as such, but is it possible to tell which other limit I'm hitting that prevents me from spawning more containers?
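For anyone chasing the same error: `pthread_create` fails with EAGAIN ("Resource temporarily unavailable") when a limit on the number of threads is hit, and the pthread_create(3) man page names `RLIMIT_NPROC`, `kernel.threads-max`, and `kernel.pid_max` as candidates. Below is a minimal sketch that prints those values in one go. Including `vm.max_map_count` is an assumption on my part (each thread stack needs its own memory mapping), and `read_limits` is a hypothetical helper name.

```python
import resource
from pathlib import Path

# Kernel-wide settings that can bound thread creation on Linux.
PROC_LIMITS = {
    "kernel.threads-max": "/proc/sys/kernel/threads-max",
    "kernel.pid_max": "/proc/sys/kernel/pid_max",
    "vm.max_map_count": "/proc/sys/vm/max_map_count",  # assumption, see above
}


def read_limits():
    """Return the current values of limits that may cause EAGAIN."""
    limits = {}
    for name, path in PROC_LIMITS.items():
        try:
            limits[name] = int(Path(path).read_text().strip())
        except (OSError, ValueError):
            limits[name] = None  # not available on this system
    # Per-user limit on processes/threads; -1 means unlimited.
    limits["RLIMIT_NPROC (soft, hard)"] = resource.getrlimit(resource.RLIMIT_NPROC)
    return limits


if __name__ == "__main__":
    for name, value in read_limits().items():
        print(f"{name}: {value}")
```

Note that `RLIMIT_NPROC` is read for the process running the script, so it should be run as the same user (and ideally from the same service context) as the process that fails. Cgroup-level limits such as `pids.max` are not covered here.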

Output of docker info:

Containers: 2554
Running: 355
Paused: 0
Stopped: 2199
Images: 9
Server Version: 18.03.1-ce
Storage Driver: overlay2
Backing Filesystem: extfs
Supports d_type: true
Native Overlay Diff: true
Logging Driver: json-file
Cgroup Driver: cgroupfs
Plugins:
Volume: local
Network: bridge host macvlan null overlay
Log: awslogs fluentd gcplogs gelf journald json-file logentries splunk syslog
Swarm: inactive
Runtimes: runc
Default Runtime: runc
Init Binary: docker-init
containerd version: 773c489c9c1b21a6d78b5c538cd395416ec50f88
runc version: 4fc53a81fb7c994640722ac585fa9ca548971871
init version: 949e6fa
Security Options:
apparmor
seccomp
 Profile: default
Kernel Version: 4.4.98
Operating System: Ubuntu 16.04.4 LTS
OSType: linux
Architecture: x86_64
CPUs: 56
Total Memory: 187.6GiB
Name: sc2-hs2-b1628
ID: HIPZ:2ZIW:SQCM:X2BD:2Q4S:532B:IZOX:T6Z5:TA7S:YHC5:IU7G:7GWA
Docker Root Dir: /var/lib/docker
Debug Mode (client): false
Debug Mode (server): false
Registry: https://index.docker.io/v1/
Labels:
Experimental: false
Insecure Registries:
127.0.0.0/8
Live Restore Enabled: false

WARNING: No swap limit support

Output of docker version:

Client:
Version:      18.03.1-ce
API version:  1.37
Go version:   go1.9.5
Git commit:   9ee9f40
Built:        Thu Apr 26 07:17:20 2018
OS/Arch:      linux/amd64
Experimental: false
Orchestrator: swarm

Server:
Engine:
 Version:      18.03.1-ce
 API version:  1.37 (minimum version 1.12)
 Go version:   go1.9.5
 Git commit:   9ee9f40
 Built:        Thu Apr 26 07:15:30 2018
 OS/Arch:      linux/amd64
 Experimental: false

Additional environment details (AWS, VirtualBox, physical, etc.)

runtime/cgo: pthread_create failed: Resource temporarily unavailable
runtime/cgo: pthread_create failed: Resource temporarily unavailable
SIGABRT: abort
PC=0x7f065d934428 m=0 sigcode=18446744073709551610
goroutine 0 [idle]:
goroutine 1 [running, locked to thread]:
runtime.systemstack_switch()
    /usr/local/go/src/runtime/asm_amd64.s:298 fp=0xc4205c6f98 sp=0xc4205c6f90 pc=0x45a260
runtime.gcStart(0x0, 0x1, 0x0, 0x0)
    /usr/local/go/src/runtime/mgc.go:1319 +0x2c1 fp=0xc4205c6fb8 sp=0xc4205c6f98 pc=0x41c4c1
runtime.mallocgc(0x240, 0x1724480, 0x4c0201, 0x198)
    /usr/local/go/src/runtime/malloc.go:804 +0x3f5 fp=0xc4205c7060 sp=0xc4205c6fb8 pc=0x414815
runtime.newobject(0x1724480, 0x4f197651)
    /usr/local/go/src/runtime/malloc.go:840 +0x38 fp=0xc4205c7090 sp=0xc4205c7060 pc=0x414e58
runtime.mapassign(0x1610200, 0xc420599140, 0xc4205733a0, 0x1567ca0)
    /usr/local/go/src/runtime/hashmap.go:617 +0x443 fp=0xc4205c7128 sp=0xc4205c7090 pc=0x40c863
reflect.mapassign(0x1610200, 0xc420599140, 0xc4205733a0, 0xc420253b00)
    /usr/local/go/src/runtime/hashmap.go:1228 +0x3f fp=0xc4205c7158 sp=0xc4205c7128 pc=0x40eb6f
reflect.Value.SetMapIndex(0x1610200, 0xc4201db3a0, 0x195, 0x15999e0, 0xc4205733a0, 0x98, 0x1724480, 0xc420253b00, 0x199)
    /usr/local/go/src/reflect/value.go:1499 +0x1f6 fp=0xc4205c71c8 sp=0xc4205c7158 pc=0x4bffe6
encoding/json.(*decodeState).object(0xc42059be60, 0x1610200, 0xc4201db3a0, 0x195)
    /usr/local/go/src/encoding/json/decode.go:773 +0xa94 fp=0xc4205c7418 sp=0xc4205c71c8 pc=0x7199f4
encoding/json.(*decodeState).value(0xc42059be60, 0x1610200, 0xc4201db3a0, 0x195)
    /usr/local/go/src/encoding/json/decode.go:405 +0x2e4 fp=0xc4205c7498 sp=0xc4205c7418 pc=0x717c34
encoding/json.(*decodeState).object(0xc42059be60, 0x1534600, 0xc4201db208, 0x16)
    /usr/local/go/src/encoding/json/decode.go:736 +0x1284 fp=0xc4205c76e8 sp=0xc4205c7498 pc=0x71a1e4
encoding/json.(*decodeState).value(0xc42059be60, 0x1534600, 0xc4201db208, 0x16)
    /usr/local/go/src/encoding/json/decode.go:405 +0x2e4 fp=0xc4205c7768 sp=0xc4205c76e8 pc=0x717c34
encoding/json.(*decodeState).unmarshal(0xc42059be60, 0x1534600, 0xc4201db208, 0x0, 0x0)
    /usr/local/go/src/encoding/json/decode.go:187 +0x20e fp=0xc4205c77e0 sp=0xc4205c7768 pc=0x7170be
encoding/json.Unmarshal(0xc42059c000, 0x1116, 0x1e00, 0x1534600, 0xc4201db208, 0xc4205c7878, 0x727ddd)
    /usr/local/go/src/encoding/json/decode.go:107 +0x148 fp=0xc4205c7828 sp=0xc4205c77e0 pc=0x716a88
github.com/docker/cli/vendor/github.com/go-openapi/spec.(*Schema).UnmarshalJSON(0xc4201dad80, 0xc42059c000, 0x1116, 0x1e00, 0x404c00, 0x7f065e276160)
    /go/src/github.com/docker/cli/vendor/github.com/go-openapi/spec/schema.go:586 +0x89 fp=0xc4205c7920 sp=0xc4205c7828 pc=0xd540f9
encoding/json.(*decodeState).object(0xc42059bd40, 0x17d9f00, 0xc4201dad80, 0x16)
    /usr/local/go/src/encoding/json/decode.go:601 +0x1b77 fp=0xc4205c7b70 sp=0xc4205c7920 pc=0x71aad7
encoding/json.(*decodeState).value(0xc42059bd40, 0x17d9f00, 0xc4201dad80, 0x16)
    /usr/local/go/src/encoding/json/decode.go:405 +0x2e4 fp=0xc4205c7bf0 sp=0xc4205c7b70 pc=0x717c34
encoding/json.(*decodeState).unmarshal(0xc42059bd40, 0x17d9f00, 0xc4201dad80, 0x0, 0x0)
    /usr/local/go/src/encoding/json/decode.go:187 +0x20e fp=0xc4205c7c68 sp=0xc4205c7bf0 pc=0x7170be
encoding/json.Unmarshal(0xc42059c000, 0x1117, 0x1e00, 0x17d9f00, 0xc4201dad80, 0x0, 0x0)
    /usr/local/go/src/encoding/json/decode.go:107 +0x148 fp=0xc4205c7cb0 sp=0xc4205c7c68 pc=0x716a88
github.com/docker/cli/vendor/github.com/go-openapi/spec.JSONSchemaDraft04(0xc4205723b0, 0x1, 0x1)
    /go/src/github.com/docker/cli/vendor/github.com/go-openapi/spec/spec.go:51 +0xb8 fp=0xc4205c7d18 sp=0xc4205c7cb0 pc=0xd54898
github.com/docker/cli/vendor/github.com/go-openapi/spec.MustLoadJSONSchemaDraft04(0x18015f0)
    /go/src/github.com/docker/cli/vendor/github.com/go-openapi/spec/spec.go:36 +0x22 fp=0xc4205c7d40 sp=0xc4205c7d18 pc=0xd54792
github.com/docker/cli/vendor/github.com/go-openapi/spec.init()
    /go/src/github.com/docker/cli/vendor/github.com/go-openapi/spec/spec.go:30 +0x518 fp=0xc4205c7db0 sp=0xc4205c7d40 pc=0xd56448
github.com/docker/cli/vendor/k8s.io/apimachinery/pkg/api/resource.init()
    <autogenerated>:1 +0x7d fp=0xc4205c7df8 sp=0xc4205c7db0 pc=0xd9561d
github.com/docker/cli/vendor/k8s.io/apimachinery/pkg/apis/meta/v1.init()
    <autogenerated>:1 +0x76 fp=0xc4205c7f18 sp=0xc4205c7df8 pc=0xe27206
github.com/docker/cli/kubernetes/compose/v1beta1.init()
    <autogenerated>:1 +0x4b fp=0xc4205c7f40 sp=0xc4205c7f18 pc=0xe3a28b
github.com/docker/cli/cli/command/stack/kubernetes.init()
    <autogenerated>:1 +0x4d fp=0xc4205c7f50 sp=0xc4205c7f40 pc=0x143a21d
github.com/docker/cli/cli/command/stack.init()
    <autogenerated>:1 +0x53 fp=0xc4205c7f60 sp=0xc4205c7f50 pc=0x14512c3
github.com/docker/cli/cli/command/commands.init()
    <autogenerated>:1 +0x89 fp=0xc4205c7f70 sp=0xc4205c7f60 pc=0x14813d9
main.init()
    <autogenerated>:1 +0x66 fp=0xc4205c7f80 sp=0xc4205c7f70 pc=0x14852f6
runtime.main()
    /usr/local/go/src/runtime/proc.go:183 +0x1de fp=0xc4205c7fe0 sp=0xc4205c7f80 pc=0x42fbee
runtime.goexit()
    /usr/local/go/src/runtime/asm_amd64.s:2337 +0x1 fp=0xc4205c7fe8 sp=0xc4205c7fe0 pc=0x45ce91
goroutine 5 [syscall]:
os/signal.signal_recv(0x0)
    /usr/local/go/src/runtime/sigqueue.go:131 +0xa6
os/signal.loop()
    /usr/local/go/src/os/signal/signal_unix.go:22 +0x22
created by os/signal.init.0
    /usr/local/go/src/os/signal/signal_unix.go:28 +0x41
rax    0x0
rbx    0x7f065dcc4700
rcx    0x7f065d934428
rdx    0x6
rdi    0xedda
rsi    0xedda
rbp    0x19dc11e
rsp    0x7ffe17c55a48
r8     0x7f065dcc5770
r9     0x7f065e306700
r10    0x8
r11    0x202
r12    0x2fbb050
r13    0xf1
r14    0x11
r15    0x0
rip    0x7f065d934428
rflags 0x202
cs     0x33
fs     0x0
gs     0x0
runtime/cgo: pthread_create failed: Resource temporarily unavailable
runtime/cgo: pthread_create failed: Resource temporarily unavailable
SIGABRT: abort
PC=0x7f56237fd428 m=4 sigcode=18446744073709551610
goroutine 0 [idle]:
goroutine 1 [GC assist wait, locked to thread]:
reflect.mapassign(0x1636ea0, 0xc420430630, 0xc42053d830, 0xc420442480)
    /usr/local/go/src/runtime/hashmap.go:1228 +0x3f
reflect.Value.SetMapIndex(0x1636ea0, 0xc4203a2d08, 0x195, 0x15999e0, 0xc42053d830, 0x98, 0x1724480, 0xc420442480, 0x199)
    /usr/local/go/src/reflect/value.go:1499 +0x1f6
encoding/json.(*decodeState).object(0xc420383e60, 0x1636ea0, 0xc4203a2d08, 0x195)
    /usr/local/go/src/encoding/json/decode.go:773 +0xa94
encoding/json.(*decodeState).value(0xc420383e60, 0x1636ea0, 0xc4203a2d08, 0x195)
    /usr/local/go/src/encoding/json/decode.go:405 +0x2e4
encoding/json.(*decodeState).object(0xc420383e60, 0x1534600, 0xc4203a2b48, 0x16)
    /usr/local/go/src/encoding/json/decode.go:736 +0x1284
encoding/json.(*decodeState).value(0xc420383e60, 0x1534600, 0xc4203a2b48, 0x16)
    /usr/local/go/src/encoding/json/decode.go:405 +0x2e4
encoding/json.(*decodeState).unmarshal(0xc420383e60, 0x1534600, 0xc4203a2b48, 0x0, 0x0)
    /usr/local/go/src/encoding/json/decode.go:187 +0x20e
encoding/json.Unmarshal(0xc42040a000, 0x9c54, 0xfe00, 0x1534600, 0xc4203a2b48, 0xc4201dd820, 0x727ddd)
    /usr/local/go/src/encoding/json/decode.go:107 +0x148
github.com/docker/cli/vendor/github.com/go-openapi/spec.(*Schema).UnmarshalJSON(0xc4203a2900, 0xc42040a000, 0x9c54, 0xfe00, 0x404c00, 0x7f562413f160)
    /go/src/github.com/docker/cli/vendor/github.com/go-openapi/spec/schema.go:586 +0x89
encoding/json.(*decodeState).object(0xc420383d40, 0x17d9f00, 0xc4203a2900, 0x16)
    /usr/local/go/src/encoding/json/decode.go:601 +0x1b77
encoding/json.(*decodeState).value(0xc420383d40, 0x17d9f00, 0xc4203a2900, 0x16)
    /usr/local/go/src/encoding/json/decode.go:405 +0x2e4
encoding/json.(*decodeState).unmarshal(0xc420383d40, 0x17d9f00, 0xc4203a2900, 0x0, 0x0)
    /usr/local/go/src/encoding/json/decode.go:187 +0x20e
encoding/json.Unmarshal(0xc42040a000, 0x9c54, 0xfe00, 0x17d9f00, 0xc4203a2900, 0x0, 0x0)
    /usr/local/go/src/encoding/json/decode.go:107 +0x148
github.com/docker/cli/vendor/github.com/go-openapi/spec.Swagger20Schema(0xc4201ddd50, 0x2ea6c043, 0x63e713280b395a96)
    /go/src/github.com/docker/cli/vendor/github.com/go-openapi/spec/spec.go:75 +0xb8
github.com/docker/cli/vendor/github.com/go-openapi/spec.MustLoadSwagger20Schema(0x63e713280b395a96)
    /go/src/github.com/docker/cli/vendor/github.com/go-openapi/spec/spec.go:59 +0x22
github.com/docker/cli/vendor/github.com/go-openapi/spec.initResolutionCache(0x160e220, 0xc4203b54d0)
    /go/src/github.com/docker/cli/vendor/github.com/go-openapi/spec/expander.go:44 +0x26
github.com/docker/cli/vendor/github.com/go-openapi/spec.init()
    /go/src/github.com/docker/cli/vendor/github.com/go-openapi/spec/expander.go:40 +0x411
github.com/docker/cli/vendor/k8s.io/apimachinery/pkg/api/resource.init()
    <autogenerated>:1 +0x7d
github.com/docker/cli/vendor/k8s.io/apimachinery/pkg/apis/meta/v1.init()
    <autogenerated>:1 +0x76
github.com/docker/cli/kubernetes/compose/v1beta1.init()
    <autogenerated>:1 +0x4b
github.com/docker/cli/cli/command/stack/kubernetes.init()
    <autogenerated>:1 +0x4d
github.com/docker/cli/cli/command/stack.init()
    <autogenerated>:1 +0x53
github.com/docker/cli/cli/command/commands.init()
    <autogenerated>:1 +0x89
main.init()
    <autogenerated>:1 +0x66
goroutine 5 [syscall]:
os/signal.signal_recv(0x0)
    /usr/local/go/src/runtime/sigqueue.go:131 +0xa6
os/signal.loop()
    /usr/local/go/src/os/signal/signal_unix.go:22 +0x22
created by os/signal.init.0
    /usr/local/go/src/os/signal/signal_unix.go:28 +0x41
rax    0x0
rbx    0x7f5623b8d700
rcx    0x7f56237fd428
rdx    0x6
rdi    0xee64
rsi    0xee73
rbp    0x19dc11e
rsp    0x7f56225c0988
r8     0x7f5623b8e770
r9     0x7f56225c1700
r10    0x8
r11    0x202
r12    0x7f56140008c0
r13    0xf1
r14    0x11
r15    0x0
rip    0x7f56237fd428
rflags 0x202
cs     0x33
fs     0x0
gs     0x0
SIGABRT: abort
PC=0x7fa6e6048428 m=7 sigcode=18446744073709551610
goroutine 0 [idle]:
goroutine 1 [GC assist wait, locked to thread]:
github.com/docker/cli/vendor/github.com/go-openapi/swag.(*NameProvider).GetJSONNames(0xc420381350, 0x17d9f00, 0xc420634240, 0x7, 0xc42054dc60, 0x0)
    /go/src/github.com/docker/cli/vendor/github.com/go-openapi/swag/json.go:227 +0x1ef
github.com/docker/cli/vendor/github.com/go-openapi/spec.(*Schema).UnmarshalJSON(0xc420634240, 0xc420412e81, 0x3a, 0x6f7f, 0x0, 0x7fa6e698a160)
    /go/src/github.com/docker/cli/vendor/github.com/go-openapi/spec/schema.go:606 +0x2a6
encoding/json.(*decodeState).object(0xc420630480, 0x1724480, 0xc420634240, 0x199)
    /usr/local/go/src/encoding/json/decode.go:601 +0x1b77
encoding/json.(*decodeState).value(0xc420630480, 0x1724480, 0xc420634240, 0x199)
    /usr/local/go/src/encoding/json/decode.go:405 +0x2e4
encoding/json.(*decodeState).object(0xc420630480, 0x1610200, 0xc4206341a0, 0x195)
    /usr/local/go/src/encoding/json/decode.go:736 +0x1284
encoding/json.(*decodeState).value(0xc420630480, 0x1610200, 0xc4206341a0, 0x195)
    /usr/local/go/src/encoding/json/decode.go:405 +0x2e4
encoding/json.(*decodeState).objectSIGABRT: abort(
0xc420630480PC=, 0x7fa1644b74280x1534600 m=, 00xc420634008 sigcode=, 184467440737095516100x16
)
    goroutine /usr/local/go/src/encoding/json/decode.go0: [736idle +]:
0x1284
encoding/json.(*decodeState).value
(goroutine 0xc4206304801,  [0x1534600running, , locked to thread0xc420634008]:
, 0x16)
    runtime.systemstack_switch/usr/local/go/src/encoding/json/decode.go(:)
405  +/usr/local/go/src/runtime/asm_amd64.s0x2e4:
298encoding/json.(*decodeState).unmarshal fp=(0xc4205041580xc420630480 sp=, 0xc4205041500x1534600 pc=, 0x45a2600xc420634008
, 0x0, runtime.gcStart0x0()
0x0 , /usr/local/go/src/encoding/json/decode.go0x1:, 1870x0 +, 0x20e0x0
)
encoding/json.Unmarshal (/usr/local/go/src/runtime/mgc.go0xc420412cce:, 13190x363 +, 0x2c1...

edolce62 commented 6 years ago

This issue is caused by the newer systemd behavior (the TasksMax setting); see this article:

https://success.docker.com/article/how-to-reserve-resource-temporarily-unavailable-errors-due-to-tasksmax-setting

gz commented 6 years ago

We were aware of the TasksMax issue at the time and set it to infinity. However, it didn't resolve the issue.
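To double-check that a TasksMax override is actually in effect for the running service, the value can be queried with `systemctl show`. This is a sketch only: the unit name `docker.service` is an assumption about the setup, the helper names are hypothetical, and treating `18446744073709551615` as "unlimited" assumes that older systemd versions print the maximum 64-bit value instead of `infinity`.

```python
import subprocess

UINT64_MAX = 2**64 - 1  # assumed to be how older systemd prints "infinity"


def parse_tasks_max(show_output):
    """Parse `TasksMax=<value>` from `systemctl show` output.

    Returns None when no limit is set, otherwise the limit as an int.
    """
    for line in show_output.splitlines():
        key, _, value = line.partition("=")
        if key == "TasksMax":
            if value == "infinity" or value == str(UINT64_MAX):
                return None
            return int(value)
    raise ValueError("TasksMax not found in output")


def unit_tasks_max(unit="docker.service"):
    """Query the effective TasksMax of a systemd unit."""
    result = subprocess.run(
        ["systemctl", "show", unit, "--property=TasksMax"],
        capture_output=True, text=True, check=True,
    )
    return parse_tasks_max(result.stdout)


if __name__ == "__main__":
    limit = unit_tasks_max()
    print("TasksMax:", "unlimited" if limit is None else limit)
```

If this reports a finite number after the override was added, the drop-in was probably not picked up (for example, a missing `systemctl daemon-reload` and service restart).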

chrismaes87 commented 5 years ago

This might be linked to the following issue: https://github.com/docker/for-linux/issues/654

sirlatrom commented 3 years ago

Updated link: https://success.mirantis.com/article/how-to-reserve-resource-temporarily-unavailable-errors-due-to-tasksmax-setting