docker-archive / classicswarm

Swarm Classic: a container clustering system. Not to be confused with Docker Swarm which is at https://github.com/docker/swarmkit
Apache License 2.0

swarm runtime: program exceeds 10000-thread limit causes very frequent leader re-election #1859

Closed shenshouer closed 8 years ago

shenshouer commented 8 years ago

When I run docker-compose scale -t 30 web=1000, swarm reports runtime: program exceeds 10000-thread limit very frequently.

version of swarm: swarm version 1.1.2 (f947993)

OS:

NAME=CoreOS
ID=coreos
VERSION=835.13.0
VERSION_ID=835.13.0
BUILD_ID=
PRETTY_NAME="CoreOS 835.13.0"
ANSI_COLOR="1;32"
HOME_URL="https://coreos.com/"
BUG_REPORT_URL="https://github.com/coreos/bugs/issues"

docker

Client:
 Version:      1.8.3
 API version:  1.20
 Go version:   go1.4.2
 Git commit:   cedd534-dirty
 Built:        Thu Feb 18 16:25:16 UTC 2016
 OS/Arch:      linux/amd64

Server:
 Version:      1.8.3
 API version:  1.20
 Go version:   go1.4.2
 Git commit:   cedd534-dirty
 Built:        Thu Feb 18 16:25:16 UTC 2016
 OS/Arch:      linux/amd64
Feb 23 15:24:08 worker-xxx-120 swarm[17548]: time="2016-02-23T15:24:08+08:00" level=info msg="Leader Election: Cluster leadership acquired"
Feb 23 15:25:26 worker-xxx-120 swarm[17548]: runtime: program exceeds 10000-thread limit
Feb 23 15:25:26 worker-xxx-120 swarm[17548]: fatal error: thread exhaustion
Feb 23 15:25:26 worker-xxx-120 swarm[17548]: runtime stack:
Feb 23 15:25:26 worker-xxx-120 swarm[17548]: runtime.throw(0xfb32a0, 0x11)
Feb 23 15:25:26 worker-xxx-120 swarm[17548]: /usr/local/go/src/runtime/panic.go:527 +0x90
Feb 23 15:25:26 worker-xxx-120 swarm[17548]: runtime.checkmcount()
Feb 23 15:25:26 worker-xxx-120 swarm[17548]: /usr/local/go/src/runtime/proc1.go:93 +0x94
Feb 23 15:25:26 worker-xxx-120 swarm[17548]: runtime.mcommoninit(0xc847fc4900)
Feb 23 15:25:26 worker-xxx-120 swarm[17548]: /usr/local/go/src/runtime/proc1.go:113 +0xf2
Feb 23 15:25:26 worker-xxx-120 swarm[17548]: runtime.allocm(0xc820020a00, 0x10ce7a0, 0x40dc74)
Feb 23 15:25:26 worker-xxx-120 swarm[17548]: /usr/local/go/src/runtime/proc1.go:868 +0x9c
Feb 23 15:25:26 worker-xxx-120 swarm[17548]: runtime.newm(0x10ce7a0, 0xc820020a00)
Feb 23 15:25:26 worker-xxx-120 swarm[17548]: /usr/local/go/src/runtime/proc1.go:1099 +0x2f
Feb 23 15:25:26 worker-xxx-120 swarm[17548]: runtime.startm(0xc820020a00, 0x100000001)
Feb 23 15:25:26 worker-xxx-120 swarm[17548]: /usr/local/go/src/runtime/proc1.go:1186 +0xef
Feb 23 15:25:26 worker-xxx-120 swarm[17548]: runtime.wakep()
Feb 23 15:25:26 worker-xxx-120 swarm[17548]: /usr/local/go/src/runtime/proc1.go:1258 +0x59
Feb 23 15:25:26 worker-xxx-120 swarm[17548]: runtime.resetspinning()
Feb 23 15:25:26 worker-xxx-120 swarm[17548]: /usr/local/go/src/runtime/proc1.go:1559 +0xb0
Feb 23 15:25:26 worker-xxx-120 swarm[17548]: runtime.schedule()
Feb 23 15:25:26 worker-xxx-120 swarm[17548]: /usr/local/go/src/runtime/proc1.go:1648 +0x27e
Feb 23 15:25:26 worker-xxx-120 swarm[17548]: runtime.goexit0(0xc847aee180)
Feb 23 15:25:26 worker-xxx-120 swarm[17548]: /usr/local/go/src/runtime/proc1.go:1773 +0x1a2
Feb 23 15:25:26 worker-xxx-120 swarm[17548]: runtime.mcall(0xc823423340)
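
For context, the 10000-thread limit is the Go runtime's default cap on OS threads, so hitting it means something keeps OS threads blocked (typically in syscalls). A quick way to watch the thread count of the swarm process while the load ramps up (a sketch; assumes pgrep is available and a single swarm process on the host):

# Count OS threads of the running swarm process; the Go runtime aborts
# with "fatal error: thread exhaustion" once this approaches 10000.
while true; do
  grep Threads /proc/$(pgrep -o swarm)/status
  sleep 5
done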
nishanttotla commented 8 years ago

This might be related to #1838.

Ping @dongluochen

vieux commented 8 years ago

@shenshouer can you try with master? dockerswarm/swarm:master

dongluochen commented 8 years ago

@shenshouer Thanks for reporting the issue! Do you see a lot of threads stuck at syscall? PR #1843 might address the issue. If you have the swarm repo you can build and test it, or it'll be released soon.

goroutine 32255 [syscall, 26 minutes]:
syscall.Syscall(0x0, 0x10e6, 0xc83a34c000, 0x1000, 0x42993d, 0xc822102a80, 0x6424e8)
    /usr/local/go/src/syscall/asm_linux_amd64.s:18 +0x5
syscall.read(0x10e6, 0xc83a34c000, 0x1000, 0x1000, 0x72, 0x0, 0x0)
    /usr/local/go/src/syscall/zsyscall_linux_amd64.go:783 +0x5f
syscall.Read(0x10e6, 0xc83a34c000, 0x1000, 0x1000, 0x88ecd2dfc509e201, 0x0, 0x0)
    /usr/local/go/src/syscall/syscall_unix.go:160 +0x4d
net.(*netFD).Read(0xc83968f960, 0xc83a34c000, 0x1000, 0x1000, 0x0, 0x0, 0x0)
    /usr/local/go/src/net/fd_unix.go:228 +0x18b
net.(*conn).Read(0xc82fdac528, 0xc83a34c000, 0x1000, 0x1000, 0x0, 0x0, 0x0)
    /usr/local/go/src/net/net.go:172 +0xe4
net/http.noteEOFReader.Read(0x7f282a4ffc88, 0xc82fdac528, 0xc839559bd8, 0xc83a34c000, 0x1000, 0x1000, 0x0, 0x0, 0x0)
    /usr/local/go/src/net/http/transport.go:1370 +0x67
net/http.(*noteEOFReader).Read(0xc8224cf060, 0xc83a34c000, 0x1000, 0x1000, 0x0, 0x0, 0x0)
    <autogenerated>:126 +0xd0
bufio.(*Reader).fill(0xc83a2dac00)
    /usr/local/go/src/bufio/bufio.go:97 +0x1e9
bufio.(*Reader).Peek(0xc83a2dac00, 0x1, 0x0, 0x0, 0x0, 0x0, 0x0)
    /usr/local/go/src/bufio/bufio.go:132 +0xcc
net/http.(*persistConn).readLoop(0xc839559b80)
    /usr/local/go/src/net/http/transport.go:876 +0xf7
created by net/http.(*Transport).dialConn
    /usr/local/go/src/net/http/transport.go:685 +0xc78
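
A quick way to check whether most goroutines are stuck like the one above is to summarize the states in a captured dump (a sketch; dump.txt is a hypothetical file holding the raw stack dump, e.g. obtained by sending SIGQUIT to the process, which makes a Go program print all goroutine stacks and exit):

# Count goroutines by state; a large "syscall" bucket means the process is
# accumulating OS threads and heading toward the 10000-thread limit.
grep '^goroutine ' dump.txt | cut -d'[' -f2 | cut -d',' -f1 | cut -d']' -f1 | sort | uniq -c | sort -rn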
shenshouer commented 8 years ago

@vieux @dongluochen The swarm binary I used was built from the master branch of the source code.

Everything is OK when I use the docker command to run containers, for example:

for i in {1..9000};do docker -H :4000 run -d --name=web_$i nginx;done

So is the problem docker-compose?

dongluochen commented 8 years ago

@shenshouer Would you mind sharing your compose file?

shenshouer commented 8 years ago

@dongluochen The version of compose I used:

docker-compose version: 1.5.1
docker-py version: 1.5.0
CPython version: 2.7.9
OpenSSL version: OpenSSL 1.0.1e 11 Feb 2013
dongluochen commented 8 years ago

@shenshouer We released swarm 1.1.3-rc1 (https://github.com/docker/swarm/releases/tag/v1.1.3-rc1) today. It'd be great if you could run a test to see if it fixes this issue.

shenshouer commented 8 years ago

OK, I will test it in a moment.

shenshouer commented 8 years ago

My testing environment:

OS:

worker-xxx-120 yooadmin # cat /etc/os-release
NAME=CoreOS
ID=coreos
VERSION=962.0.0
VERSION_ID=962.0.0
BUILD_ID=2016-02-18-0540
PRETTY_NAME="CoreOS 962.0.0 (Coeur Rouge)"
ANSI_COLOR="1;32"
HOME_URL="https://coreos.com/"
BUG_REPORT_URL="https://github.com/coreos/bugs/issues"

docker: Docker version 1.10.1, build 88b8b3a
docker-compose: docker-compose version 1.5.1
swarm: swarm version 1.1.3 (da37fb5)

swarm-manager.service:

[Unit]
Description=swarm
After=docker.service consul-server.service
Requires=docker.service consul-server.service
[Service]
Environment=GOMAXPROCS=40
TimeoutStartSec=0
Restart=always
LimitNOFILE=1048576
LimitNPROC=1048576
# ExecStartPre=-/usr/bin/docker rm swarm
ExecStart=/opt/bin/swarm --experimental --log-level debug manage \
  -H :4000 \
  --replication \
  --advertise xxx.xxx:4000 \
  --replication-ttl 20s \
  --engine-refresh-min-interval 2s \
  --engine-refresh-max-interval 10s \
  --heartbeat 10s \
  consul://127.0.0.1:8500
# ExecStop=/usr/bin/docker kill swarm
[Install]
WantedBy=multi-user.target

swarm-node.service:

[Unit]
Description=swarm
After=docker.service consul-server.service
Requires=docker.service consul-server.service
[Service]
Environment=GOMAXPROCS=40
TimeoutStartSec=0
Restart=always
LimitNOFILE=1048576
LimitNPROC=1048576
# ExecStartPre=-/usr/bin/docker rm swarm
ExecStart=/opt/bin/swarm --experimental --log-level debug join \
  --advertise xxx.xxx:2375 \
  --ttl 20s \
  --heartbeat 10s \
  consul://127.0.0.1:8500
# ExecStop=/usr/bin/docker kill swarm
[Install]
WantedBy=multi-user.target

docker-compose.yml:

web:
  image: nginx
  restart: always
  labels:
    com.docker.swarm.reschedule-policies: "[\"on-node-failure\"]"

Run the containers with docker-compose:

DOCKER_HOST=127.0.0.1:4000 /opt/bin/docker-compose up --force-recreate -d

Scale the container count to 3000:

DOCKER_HOST=127.0.0.1:4000 /opt/bin/docker-compose scale web=3000

and I got these errors:

ERROR: for 54  Container created but refresh didn't report it back
ERROR: for 58  Container created but refresh didn't report it back
ERROR: for 152  ('Connection aborted.', error(24, 'Too many open files'))
ERROR: Couldn't connect to Docker daemon at http://127.0.0.1:4000 - is it running?

If it's at a non-standard location, specify the URL with the DOCKER_HOST environment variable.
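
The error(24, 'Too many open files') on the client side means docker-compose itself hit the file-descriptor limit of the shell it runs in while opening thousands of connections at once. A quick check before re-running the test (a sketch; 65536 is an arbitrary value, adjust to your environment):

# Check the current soft limit on open files in the shell running compose.
ulimit -n

# Raise it for this shell before scaling (requires a high enough hard limit).
ulimit -n 65536
DOCKER_HOST=127.0.0.1:4000 /opt/bin/docker-compose scale web=3000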

# get the running containers after error.
worker-xxx-120 yooadmin # DOCKER_HOST=127.0.0.1:4000 /opt/bin/docker-compose ps |wc -l
1390
worker-xxx-120 yooadmin # docker -H :4000 ps |wc -l
1285

In order to get 3000 running containers, I need to keep executing this command many times. Then I stop the running containers:

DOCKER_HOST=127.0.0.1:4000 /opt/bin/docker-compose stop

ERROR: for yooadmin_web_408  ('Connection aborted.', error(24, 'Too many open files'))
ERROR: Couldn't connect to Docker daemon at http://127.0.0.1:4000 - is it running?

If it's at a non-standard location, specify the URL with the DOCKER_HOST environment variable.

Luckily, the following command executes without error:

DOCKER_HOST=127.0.0.1:4000 /opt/bin/docker-compose rm -f
shenshouer commented 8 years ago

The same happens with docker-compose version 1.6.2, build 4d72027.

anarcher commented 8 years ago

I have the same issue.

runtime: program exceeds 10000-thread limit
fatal error: thread exhaustion
runtime stack:
runtime.throw(0xfb32a0, 0x11)
        /usr/local/go/src/runtime/panic.go:527 +0x90
runtime.checkmcount()
        /usr/local/go/src/runtime/proc1.go:93 +0x94
runtime.mcommoninit(0xc842684000)
        /usr/local/go/src/runtime/proc1.go:113 +0xf2
runtime.allocm(0xc820016000, 0x0, 0x40dc74)
        /usr/local/go/src/runtime/proc1.go:868 +0x9c
runtime.newm(0x0, 0xc820016000)
        /usr/local/go/src/runtime/proc1.go:1099 +0x2f
runtime.startm(0xc820016000, 0xc820000400)
        /usr/local/go/src/runtime/proc1.go:1186 +0xef
runtime.handoffp(0xc820016000)
        /usr/local/go/src/runtime/proc1.go:1209 +0x2c7
runtime.retake(0x1d84d93ba3f77, 0x0)
        /usr/local/go/src/runtime/proc1.go:3126 +0x1e0
runtime.sysmon()
        /usr/local/go/src/runtime/proc1.go:3053 +0x20f
runtime.mstart1()
        /usr/local/go/src/runtime/proc1.go:715 +0xe8
runtime.mstart()
        /usr/local/go/src/runtime/proc1.go:685 +0x72
goroutine 1 [chan receive, 1550 minutes]:
github.com/docker/swarm/api.(*Server).ListenAndServe(0xc82011a1b0, 0x0, 0x0)
        /go/src/github.com/docker/swarm/api/server.go:117 +0x383
github.com/docker/swarm/cli.manage(0xc8200f87e0)
        /go/src/github.com/docker/swarm/cli/manage.go:332 +0x1c3a
github.com/codegangsta/cli.Command.Run(0xefe130, 0x6, 0xefe0d0, 0x1, 0x0, 0x0, 0x0, 0xf96100, 0x17, 0x0, ...)
        /go/src/github.com/docker/swarm/Godeps/_workspace/src/github.com/codegangsta/cli/command.go:137 +0x107e
github.com/codegangsta/cli.(*App).Run(0xc8200f85a0, 0xc820060000, 0x5, 0x5, 0x0, 0x0)
        /go/src/github.com/docker/swarm/Godeps/_workspace/src/github.com/codegangsta/cli/app.go:176 +0xfff
github.com/docker/swarm/cli.Run()
        /go/src/github.com/docker/swarm/cli/cli.go:65 +0x562
main.main()
        /go/src/github.com/docker/swarm/main.go:13 +0x14
goroutine 21 [chan receive]:
github.com/golang/glog.(*loggingT).flushDaemon(0x13c8ba0)
        /go/src/github.com/docker/swarm/Godeps/_workspace/src/github.com/golang/glog/glog.go:882 +0x67
created by github.com/golang/glog.init.1
        /go/src/github.com/docker/swarm/Godeps/_workspace/src/github.com/golang/glog/glog.go:410 +0x297
dongluochen commented 8 years ago

@shenshouer Thanks for providing the related info!

dongluochen commented 8 years ago

@anarcher What are the top 2 most frequent thread states in your dump? Can you describe your setup and what workload leads to the crash?

dongluochen commented 8 years ago

@shenshouer I do not get a swarm crash from the 10000-thread limit by running your application. I did find an issue where docker-compose may flood swarm. (There are several minor differences from your setup that I may look into. I'm not running compose from the same host, but from an external one so that I can monitor components separately.) And docker-compose may interrupt its own operation because of this.

When docker-compose tries to scale the web service to 3000 instances, it generates 3000 network connections to DOCKER_HOST simultaneously. It doesn't kill swarm, but the cluster (node and/or swarm manager) is not able to handle that many simultaneous requests, and some requests fail. I think that's the major reason you cannot scale to 3000. If you send the requests to the swarm manager sequentially, it shouldn't be a problem (see the batching sketch after the netstat output below).

core@ip-172-19-231-118 ~/crash $ ../docker-compose scale web=3000         
Creating and starting 3000 ... 
Creating and starting 3001 ... 
...

core@ip-172-19-231-118 ~ $ for i in {1..100}; do netstat -an | grep -i 3375 | wc -l; sleep 2; done
0
1
2
2
2
2
2
552
1306
1904
2420
2441
2442
2442
2442
2442
2441
2441
2441
2441
2441
2447
2474
2477
2483
2520
2843
2899
2899
2899
2899
2899
2899
2899
2899
2848
2637
2092
1870
1544
1469
1422
1335
1296
1264
1224
1209
1190
1170
1143
1088
1043
996
939
854
524
336
336
336
336
125
100
99
99
99
98
98
98
98
79
20
19
20
1
1
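
A minimal sketch of the sequential/batched approach mentioned above (the batch size of 100 is an arbitrary example, not a recommendation from swarm):

# Scale up in increments so the manager never sees thousands of
# simultaneous connection attempts.
for n in $(seq 100 100 3000); do
  DOCKER_HOST=127.0.0.1:4000 docker-compose scale web=$n
done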
dongluochen commented 8 years ago

@shenshouer @anarcher So far I think the problem is a request spike causing resource starvation on the swarm manager and docker nodes. In my test with Swarm 1.1.3-rc1 I did not hit the 10000-thread limit because docker-compose stopped first. I think docker-compose should have flow control so that it does not flood swarm, and swarm should implement self-protection to handle high load and protect the docker engine when it is close to resource starvation.

Here are swarm logs showing the issue. Swarm manager suffers from too many open concurrent connections.

ERRO[2234] HTTP error: Post http://dong-wily-2:2375/v1.15/containers/create?name=crash_web_5761: dial tcp 172.19.209.201:2375: socket: too many open files. Are you trying to connect to a TLS-enabled daemon without TLS?  status=500
2016/02/27 07:55:15 http: Accept error: accept tcp [::]:3375: accept4: too many open files; retrying in 20ms

Docker daemons experience memory or file-handle starvation.

ERRO[2236] HTTP error: 500 Internal Server Error: Cannot start container e587188081c1be349e1ead35a4dc5315bcfbdd8c67d3ef52f07102a7084a410c: [9] System error: fork/exec /proc/self/exe: cannot allocate memory
  status=500
...
ERRO[2238] HTTP error: 500 Internal Server Error: devmapper: Error saving transaction metadata: devmapper: Error creating metadata file: open /var/lib/docker/devicemapper/metadata/.tmp676640282: too many open files
  status=500
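
A quick way to correlate these errors with the manager process is to watch its open file-descriptor count against its limit while the test runs (a sketch; assumes pgrep is available and a single swarm process on the host):

# Number of file descriptors currently open by the swarm manager, and its limit.
ls /proc/$(pgrep -o swarm)/fd | wc -l
grep 'Max open files' /proc/$(pgrep -o swarm)/limits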
anarcher commented 8 years ago

Hi, @dongluochen

I'm sorry, I have no logs about it now. I was using Interlock (v0.3.3) with interlock's stat plugin. I suspect the plugin makes many requests, which is why this issue appeared.

After turning off the interlock container, the problem no longer appears.

Thanks, anarcher

shenshouer commented 8 years ago

@dongluochen There are no errors when using the docker run command, but there are some errors with docker-compose. I will test again next week and will post the logs.

dongluochen commented 8 years ago

@shenshouer Feel free to re-open if you think there is still a problem. Your test results are welcome.