upserve / docker-api

A lightweight Ruby client for the Docker Remote API
MIT License
1.09k stars 288 forks source link

Running rspec docker image tests are failing #396

Closed mblaschke closed 7 years ago

mblaschke commented 8 years ago

I'm not sure since when or why our Dockerfile tests are failing (eg. webdevops/apache. It happens since updating to Docker 1.11.0 but also going back to Docker 1.10.3 doesn't help.

When I'm running tests in serial mode they are finishing without issues. In parallel mode (with 1 sec delay) the rspec tests are failing. Each testsuite for each container is running as separate process so it's not a threading issue.

When I'm delaying creating of images eg. for 20 or 30 seconds (10 are not enough) the tests are running without any issues.

Any ideas?

docker ps:

CONTAINER ID        IMAGE               COMMAND                  CREATED             STATUS              PORTS               NAMES
9f8c73f21555        e38d08beb382        "/opt/docker/bin/entr"   36 seconds ago      Up 35 seconds       80/tcp, 443/tcp     sick_kalam
90ae84dcbb26        5b8a7943c422        "/opt/docker/bin/entr"   44 seconds ago      Up 42 seconds       80/tcp, 443/tcp     nostalgic_jennings
5f8836694d3f        caebb6295807        "/opt/docker/bin/entr"   47 seconds ago      Up 45 seconds       80/tcp, 443/tcp     reverent_swartz
1b6b79015790        494dcc146b8b        "/opt/docker/bin/entr"   50 seconds ago      Up 49 seconds       80/tcp, 443/tcp     stupefied_blackwell
342f7b1e1aa1        aa1e2ebc5960        "/opt/docker/bin/entr"   53 seconds ago      Up 52 seconds       80/tcp, 443/tcp     drunk_bardeen
b8275c161c35        3e9c1b047c1d        "/opt/docker/bin/entr"   56 seconds ago      Up 55 seconds       80/tcp, 443/tcp     drunk_shirley
b2aadabd41e7        1dd5f3d478ce        "/opt/docker/bin/entr"   58 seconds ago      Up 57 seconds       80/tcp, 443/tcp     stupefied_jang

Example test output

  314) Dockerfile Command "apachectl -t -D DUMP_MODULES" stdout
       Failure/Error: its(:stdout) { should contain('negotiation_module') }
       Docker::Error::ServerError:
         Container 9f8c73f21555743c7b8e25f9c343343af0f7911bf7f97275bdeb91bcde8cac28 is not running

       Shared Example Group: "apache::modules" called from ./spec/collection/apache.rb:8
       Shared Example Group: "collection::apache" called from ./spec/docker/apache_spec.rb:13
       # ./vendor/ruby/2.3.0/gems/docker-api-1.28.0/lib/docker/connection.rb:50:in `rescue in request'
       # ./vendor/ruby/2.3.0/gems/docker-api-1.28.0/lib/docker/connection.rb:38:in `request'
       # ./vendor/ruby/2.3.0/gems/docker-api-1.28.0/lib/docker/connection.rb:65:in `block (2 levels) in <class:Connection>'
       # ./vendor/ruby/2.3.0/gems/docker-api-1.28.0/lib/docker/exec.rb:22:in `create'
       # ./vendor/ruby/2.3.0/gems/docker-api-1.28.0/lib/docker/container.rb:64:in `exec'
       # ./vendor/ruby/2.3.0/gems/specinfra-2.56.1/lib/specinfra/backend/docker.rb:94:in `docker_run!'
       # ./vendor/ruby/2.3.0/gems/specinfra-2.56.1/lib/specinfra/backend/docker.rb:31:in `run_command'
       # ./vendor/ruby/2.3.0/gems/specinfra-2.56.1/lib/specinfra/runner.rb:11:in `method_missing'
       # ./vendor/ruby/2.3.0/gems/serverspec-2.31.1/lib/serverspec/type/command.rb:17:in `command_result'
       # ./vendor/ruby/2.3.0/gems/serverspec-2.31.1/lib/serverspec/type/command.rb:4:in `stdout'
       # ./spec/shared/apache/modules.rb:21:in `block (3 levels) in <top (required)>'
       # ------------------
       # --- Caused by: ---
       # Excon::Errors::InternalServerError:
       #   Expected([200, 201, 202, 203, 204, 304]) <=> Actual(500 InternalServerError)
       #   ./vendor/ruby/2.3.0/gems/excon-0.49.0/lib/excon/middlewares/expects.rb:6:in `response_call'

with 10 sec delay:

  627) Dockerfile File "/opt/docker/etc/httpd/ssl"
       Failure/Error: @image = Docker::Image.build_from_dir('.', { 'dockerfile' => ENV['DOCKERFILE'] })
       Excon::Errors::SocketError:
         Broken pipe (Errno::EPIPE)

       Shared Example Group: "apache::layout" called from ./spec/collection/apache.rb:2
       Shared Example Group: "collection::apache" called from ./spec/docker/apache_spec.rb:13
       # <internal:prelude>:134:in `__write_nonblock'
       # <internal:prelude>:134:in `write_nonblock'
       # ./vendor/ruby/2.3.0/gems/excon-0.49.0/lib/excon/socket.rb:217:in `block in write_nonblock'
       # ./vendor/ruby/2.3.0/gems/excon-0.49.0/lib/excon/socket.rb:212:in `loop'
       # ./vendor/ruby/2.3.0/gems/excon-0.49.0/lib/excon/socket.rb:212:in `write_nonblock'
       # ./vendor/ruby/2.3.0/gems/excon-0.49.0/lib/excon/socket.rb:70:in `write'
       # ./vendor/ruby/2.3.0/gems/excon-0.49.0/lib/excon/connection.rb:152:in `request_call'
       # ./vendor/ruby/2.3.0/gems/excon-0.49.0/lib/excon/middlewares/mock.rb:47:in `request_call'
       # ./vendor/ruby/2.3.0/gems/excon-0.49.0/lib/excon/middlewares/instrumentor.rb:25:in `request_call'
       # ./vendor/ruby/2.3.0/gems/excon-0.49.0/lib/excon/middlewares/base.rb:15:in `request_call'
       # ./vendor/ruby/2.3.0/gems/excon-0.49.0/lib/excon/middlewares/base.rb:15:in `request_call'
       # ./vendor/ruby/2.3.0/gems/excon-0.49.0/lib/excon/middlewares/base.rb:15:in `request_call'
       # ./vendor/ruby/2.3.0/gems/excon-0.49.0/lib/excon/middlewares/base.rb:15:in `request_call'
       # ./vendor/ruby/2.3.0/gems/excon-0.49.0/lib/excon/connection.rb:250:in `request'
       # ./vendor/ruby/2.3.0/gems/docker-api-1.28.0/lib/docker/connection.rb:40:in `request'
       # ./vendor/ruby/2.3.0/gems/docker-api-1.28.0/lib/docker/connection.rb:65:in `block (2 levels) in <class:Connection>'
       # ./vendor/ruby/2.3.0/gems/docker-api-1.28.0/lib/docker/image.rb:253:in `build_from_tar'
       # ./vendor/ruby/2.3.0/gems/docker-api-1.28.0/lib/docker/image.rb:272:in `build_from_dir'
       # ./spec/docker/apache_spec.rb:7:in `block (2 levels) in <top (required)>'
       # ------------------
       # --- Caused by: ---
       # Errno::EPIPE:
       #   Broken pipe
       #   <internal:prelude>:134:in `__write_nonblock'

daemon log:

Apr 16 15:37:39 slave docker[9980]: time="2016-04-16T15:37:39.611097317+02:00" level=error msg="Error setting up exec command in container 0f9bedf740f7f638d59c49cf79cecc41f5aeb09ee22c1e15f0c3152841fc8437: Container 0f9bedf740f7f638d59c49cf79cecc41f5aeb09ee22c1e15f0c3152841fc8437 is not running"
Apr 16 15:37:39 slave docker[9980]: time="2016-04-16T15:37:39.611222645+02:00" level=error msg="Handler for POST /v1.16/containers/0f9bedf740f7f638d59c49cf79cecc41f5aeb09ee22c1e15f0c3152841fc8437/exec returned error: Container 0f9bedf740f7f638d59c49cf79cecc41f5aeb09ee22c1e15f0c3152841fc8437 is not running"
Apr 16 15:37:39 slave docker[9980]: time="2016-04-16T15:37:39.617722794+02:00" level=error msg="Error setting up exec command in container 0f9bedf740f7f638d59c49cf79cecc41f5aeb09ee22c1e15f0c3152841fc8437: Container 0f9bedf740f7f638d59c49cf79cecc41f5aeb09ee22c1e15f0c3152841fc8437 is not running"
Apr 16 15:37:39 slave docker[9980]: time="2016-04-16T15:37:39.617787219+02:00" level=error msg="Handler for POST /v1.16/containers/0f9bedf740f7f638d59c49cf79cecc41f5aeb09ee22c1e15f0c3152841fc8437/exec returned error: Container 0f9bedf740f7f638d59c49cf79cecc41f5aeb09ee22c1e15f0c3152841fc8437 is not running"
Apr 16 15:37:39 slave docker[9980]: time="2016-04-16T15:37:39.629750289+02:00" level=error msg="Error setting up exec command in container 0f9bedf740f7f638d59c49cf79cecc41f5aeb09ee22c1e15f0c3152841fc8437: Container 0f9bedf740f7f638d59c49cf79cecc41f5aeb09ee22c1e15f0c3152841fc8437 is not running"
Apr 16 15:37:39 slave docker[9980]: time="2016-04-16T15:37:39.629805666+02:00" level=error msg="Handler for POST /v1.16/containers/0f9bedf740f7f638d59c49cf79cecc41f5aeb09ee22c1e15f0c3152841fc8437/exec returned error: Container 0f9bedf740f7f638d59c49cf79cecc41f5aeb09ee22c1e15f0c3152841fc8437 is not running"
Apr 16 15:37:39 slave docker[9980]: time="2016-04-16T15:37:39.635010745+02:00" level=error msg="Error setting up exec command in container 0f9bedf740f7f638d59c49cf79cecc41f5aeb09ee22c1e15f0c3152841fc8437: Container 0f9bedf740f7f638d59c49cf79cecc41f5aeb09ee22c1e15f0c3152841fc8437 is not running"

.....................

Apr 16 15:52:22 slave docker[9980]: time="2016-04-16T15:52:22.101128639+02:00" level=error msg="Error running exec in container: Cannot run exec command 14fae347208be27e50a7362a357b405974f4d26004a9f47f15772dd53cf0501f in container 37a5d53167b7fc1f0cd8910798885259c437d5fc5f0b711abec943dc2682fa71: [9] System error: exit status 1\n"
Apr 16 15:52:22 slave docker[9980]: time="2016-04-16T15:52:22.320246649+02:00" level=warning msg="exit status 2"
Apr 16 15:52:22 slave docker[9980]: time="2016-04-16T15:52:22.320328439+02:00" level=error msg="Error running command in existing container bcad3f765f1b7e4513c14e65899f1b2b04228384fee6dc42b60971f6efe10660: [9] System error: write parent: broken pipe"
Apr 16 15:52:22 slave docker[9980]: time="2016-04-16T15:52:22.320542601+02:00" level=error msg="Error running exec in container: Cannot run exec command 5b1c8c0726d122589a9ae3c22f57a856864716b8a410f6506631f235cc41c356 in container bcad3f765f1b7e4513c14e65899f1b2b04228384fee6dc42b60971f6efe10660: [9] System error: write parent: broken pipe\n"
Apr 16 15:52:22 slave docker[9980]: time="2016-04-16T15:52:22.335980462+02:00" level=error msg="Error running command in existing container bcad3f765f1b7e4513c14e65899f1b2b04228384fee6dc42b60971f6efe10660: [9] System error: fork/exec /proc/self/exe: resource temporarily unavailable"
Apr 16 15:52:22 slave docker[9980]: time="2016-04-16T15:52:22.336214548+02:00" level=error msg="Error running exec in container: Cannot run exec command 6cbdadce4fd5e177499d82a18eab637a0a0679d928d2e5091caa94d6a7df1668 in container bcad3f765f1b7e4513c14e65899f1b2b04228384fee6dc42b60971f6efe10660: [9] System error: fork/exec /proc/self/exe: resource temporarily unavailable\n"
Apr 16 15:52:22 slave docker[9980]: time="2016-04-16T15:52:22.578612132+02:00" level=warning msg="exit status 2"
Apr 16 15:52:22 slave docker[9980]: time="2016-04-16T15:52:22.578689989+02:00" level=error msg="Error running command in existing container bcad3f765f1b7e4513c14e65899f1b2b04228384fee6dc42b60971f6efe10660: [9] System error: read parent: connection reset by peer"
Apr 16 15:52:22 slave docker[9980]: time="2016-04-16T15:52:22.583088012+02:00" level=error msg="Error running exec in container: Cannot run exec command d3e110a3534d710b461fb62f3a3a57550af50065fb052f520cc1826e4e966b78 in container bcad3f765f1b7e4513c14e65899f1b2b04228384fee6dc42b60971f6efe10660: [9] System error: read parent: connection reset by peer\n"
Apr 16 15:52:23 slave docker[9980]: time="2016-04-16T15:52:23.419344957+02:00" level=warning msg="exit status 2"
Apr 16 15:52:23 slave docker[9980]: time="2016-04-16T15:52:23.419446217+02:00" level=error msg="Error running command in existing container 1e17b899e6bab476eb14bd29b78527765338ca2b77e31f101c81188251d8b744: [9] System error: read parent: connection reset by peer"
Apr 16 15:52:23 slave docker[9980]: time="2016-04-16T15:52:23.419668432+02:00" level=error msg="Error running exec in container: Cannot run exec command b458ff9c4f3e5116796b33e6020d585554e7d5111223be637fa16b50664d556d in container 1e17b899e6bab476eb14bd29b78527765338ca2b77e31f101c81188251d8b744: [9] System error: read parent: connection reset by peer\n"
Apr 16 15:52:23 slave docker[9980]: time="2016-04-16T15:52:23.441046633+02:00" level=warning msg="exit status 2"
Apr 16 15:52:23 slave docker[9980]: time="2016-04-16T15:52:23.441141575+02:00" level=error msg="Error running command in existing container 1e17b899e6bab476eb14bd29b78527765338ca2b77e31f101c81188251d8b744: [9] System error: read parent: connection reset by peer"
Apr 16 15:52:23 slave docker[9980]: time="2016-04-16T15:52:23.441344901+02:00" level=error msg="Error running exec in container: Cannot run exec command 28305262aff3ea67854ee79eef1e84496b4675bc67c6602a206cf2564d53c941 in container 1e17b899e6bab476eb14bd29b78527765338ca2b77e31f101c81188251d8b744: [9] System error: read parent: connection reset by peer\n"
Apr 16 15:52:24 slave docker[9980]: time="2016-04-16T15:52:24.793693003+02:00" level=warning msg="exit status 2"
Apr 16 15:52:24 slave docker[9980]: time="2016-04-16T15:52:24.793768787+02:00" level=error msg="Error running command in existing container 37a5d53167b7fc1f0cd8910798885259c437d5fc5f0b711abec943dc2682fa71: [9] System error: read parent: connection reset by peer"
mblaschke commented 8 years ago

Tests are still failing randomly on Docker 1.11.0 but i think it's a Docker issue. On 1.10.3 it's now working.

The issue because of fork/exec /proc/self/exe: resource temporarily unavailable was the process limit of systemd which can be fixed (temporary) with eg systemctl set-property docker.service TasksMax=8192.

hekaldama commented 8 years ago

@mblaschke are you still getting failures? I was working on 1.12 support and found it to be failing as soon as I ran rake. Was going to look into fixing failures but wondering if you made any headway on any of this. Thanks.

@tlunter are their known failures in the test suite right now?

tlunter commented 8 years ago

@hekaldama all tests should run correctly under the latest version of docker.

@mblaschke curious why you're running the tests in a parallel manner? Many of them check the value before and after a method is run and would definitely not work when run in parallel.

hekaldama commented 8 years ago

@tlunter I am currently fixing one now that does not run on docker 1.12. https://github.com/swipely/docker-api/blob/master/spec/docker/image_spec.rb#L263-L265 now returns Docker::Error::ClientError. I am also working on the 1.12 interface with tests. Will send PR soon.

mblaschke commented 7 years ago

@tlunter Testing ~180 in serial mode takes too long so we're running severspec parallel in seperated processes.

However this seems to be a docker daemon issue

mblaschke commented 7 years ago

Seems to be fixed in docker (upstream)

jadametz commented 7 years ago

I'm running into this issue now with the latest version of Docker for Mac and 17.06.0-ce on Ubuntu. Did this end up being a problem with Docker, docker-api, or some lower level library?

Any guidance would be a huge help as I'm trying to build out a pipeline of Docker image builds and this is a huge block in not being able to implement what we thought we'd be able to.

jdickey commented 5 years ago

Two and a half years after this issue was closed, I'm still running into it with Docker version 19.03.1, build 74b1e89 on macOS 10.14.6. Given the immediate response to @jadametz's comment above, I'm beginning to think that there's really no interest in supporting a Docker API client in Ruby that supports building images.

[16] pry(main)> build_output = ''
=> ""
[17] pry(main)> proc = Proc.new { |chunk| build_output << chunk; }
=> #<Proc:0x00007fba80a2dbf8@(pry):27>
[18] pry(main)> img = Docker::Image.build_from_dir('.', options) { |chunk| proc.call(chunk) }
Excon::Error::Socket: Broken pipe (Errno::EPIPE)
from <internal:prelude>:131:in `__write_nonblock'
Caused by Errno::EPIPE: Broken pipe
from <internal:prelude>:131:in `__write_nonblock'
[19] pry(main)> 
tlunter commented 5 years ago

@jdickey tone in your comment seems harsh. The original issue was closed with this being an issue with the daemon. We use and other companies use this client to build images without issue all the time. Please let me know if you have constructive information to add to the gem that can fix this issue.

jdickey commented 5 years ago

@tlunter Apologies for the tone; that day was a long and frustrating fortnight. Will gather more information and comment again another time.

Thanks for the reply.