test-kitchen / kitchen-dokken

Test Kitchen driver/provisioner for lightning-fast Chef Infra cookbook testing with Docker

remote docker connections (tcp) currently not working #89

Closed rmoriz closed 6 years ago

rmoriz commented 7 years ago

Just to make it clear that the current kitchen-dokken version does not work if one decides to use a remote dockerd, e.g. using tcp instead of a unix socket.

Also related: https://github.com/someara/kitchen-dokken/issues/79

It looks like there are local volume mounts which by design don't work remotely.

someara commented 7 years ago

bizarre... it's explicitly tested against tcp here

https://github.com/someara/kitchen-dokken/blob/master/test/cookbooks/dokken_test/recipes/default.rb#L64

When using tcp, it uses the old 1.x style "data container" with ssh/rsync https://github.com/someara/kitchen-dokken/blob/master/lib/kitchen/driver/dokken.rb#L69-L72

and

https://github.com/someara/kitchen-dokken/blob/master/lib/kitchen/helpers.rb#L41

Can you help me recreate your problem?

-s

rmoriz commented 7 years ago

https://github.com/someara/kitchen-dokken/issues/44 has some details and https://github.com/someara/kitchen-dokken/issues/79#issuecomment-287539395

(/Users/rmoriz was the OSX host, dockerd however was running on another machine via tcp. ENV-setting was done via eval $(docker-machine env docker-node))

I have files on OSX in /Users/rmoriz/.dokken/verifier_sandbox/... and unresolved volume mounts in the container pointing to those(!) directories.

I assume that in your tests the connection to dockerd is made by tcp but dockerd still has access to locally mount the directories mentioned above. So technically it's still a local instance of docker.
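The distinction drawn here (tcp as a transport versus a daemon that is actually on another machine) can be sketched in Ruby. This is a minimal, hypothetical helper and not kitchen-dokken's code: the name `docker_host_locality` and the loopback heuristic are assumptions, and a tcp address on a non-loopback interface of the same machine would still be classified as remote.

```ruby
require 'uri'

# Hypothetical helper (not part of kitchen-dokken): guess whether the daemon
# behind a DOCKER_HOST value can see the client's filesystem.
LOOPBACK_HOSTS = %w[localhost 127.0.0.1 ::1].freeze

def docker_host_locality(docker_host)
  # An unset DOCKER_HOST means the default local unix socket.
  return :local if docker_host.nil? || docker_host.empty?

  uri = URI.parse(docker_host)
  case uri.scheme
  when 'unix'
    :local
  when 'tcp'
    # tcp to a loopback address is still a local daemon; bind mounts work.
    LOOPBACK_HOSTS.include?(uri.hostname) ? :local : :remote
  else
    :unknown
  end
end
```

With the environment shown later in this thread, `docker_host_locality(ENV['DOCKER_HOST'])` would see `tcp://10.0.1.49:2376` and treat the daemon as remote, which is the case where bind-mounting a client-side sandbox cannot work.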

You can test my issue simply by using docker-machine and a cloud provider of your choice:

e.g. https://docs.docker.com/machine/drivers/digital-ocean/ https://docs.docker.com/machine/drivers/aws/


$ docker-machine create --driver digitalocean ...  server-name
$ eval $(docker-machine env server-name)

# now DOCKER_HOST should point to tcp://<ip of the cloud instance>
# to unset env: eval $(docker-machine env -u)

rmoriz commented 7 years ago

Maybe the existence of a local sandbox, left over from a prior local docker run, misleads dokken into assuming a local volume mount is possible.

I use docker for mac ("local" docker) for small things and a powerful dual xeon box for heavy "remote" docker work. I'll clean the '~/.dokken' and try again next week.

rmoriz commented 7 years ago

nope. Didn't help. It's broken:

➜  rabbitmq-chef-cookbook git:(rmoriz/inspec) rm -rf ~/.dokken
➜  rabbitmq-chef-cookbook git:(rmoriz/inspec) rm -rf ~/.kitchen/*
➜  rabbitmq-chef-cookbook git:(rmoriz/inspec) eval $(docker-machine env rambo)
➜  rabbitmq-chef-cookbook git:(rmoriz/inspec) env | grep -i docker
DOCKER_TLS_VERIFY=1
DOCKER_HOST=tcp://10.0.1.49:2376
DOCKER_CERT_PATH=/Users/rmoriz/.docker/machine/machines/rambo
DOCKER_MACHINE_NAME=rambo
➜  rabbitmq-chef-cookbook git:(rmoriz/inspec) kitchen setup lwrps-centos-67
-----> Starting Kitchen (v1.16.0)
WARN: Unresolved specs during Gem::Specification.reset:
      artifactory (>= 0)
      winrm (~> 2.0)
      logging (< 3.0, >= 1.6.1)
      rainbow (~> 2)
      diff-lcs (< 2.0, >= 1.2.0)
      unf_ext (>= 0)
WARN: Clearing out unresolved specs.
Please report a bug if this causes problems.
-----> Creating <lwrps-centos-67>...
       Creating kitchen sandbox at /Users/rmoriz/.dokken/kitchen_sandbox/a2abe14b87-lwrps-centos-67
       Creating verifier sandbox at /Users/rmoriz/.dokken/verifier_sandbox/a2abe14b87-lwrps-centos-67
       Building work image..
       Finished creating <lwrps-centos-67> (0m48.15s).
-----> Converging <lwrps-centos-67>...
       Creating kitchen sandbox in /Users/rmoriz/.dokken/kitchen_sandbox/a2abe14b87-lwrps-centos-67
       Preparing dna.json
       Resolving cookbook dependencies with Berkshelf 5.6.4...
       Removing non-cookbook files before transfer
       Preparing validation.pem
       Preparing client.rb
[2017-04-24T21:10:02+00:00] WARN: *****************************************
[2017-04-24T21:10:02+00:00] WARN: Did not find config file: /opt/kitchen/client.rb, using command line options.
[2017-04-24T21:10:02+00:00] WARN: *****************************************
[2017-04-24T21:10:02+00:00] WARN: No cookbooks directory found at or above current directory.  Assuming /.
[2017-04-24T21:10:02+00:00] FATAL: Cannot load configuration from /opt/kitchen/dna.json
>>>>>> ------Exception-------
>>>>>> Class: Kitchen::ActionFailed
>>>>>> Message: 1 actions failed.
>>>>>>     Converge failed on instance <lwrps-centos-67>.  Please see .kitchen/logs/lwrps-centos-67.log for more details
>>>>>> ----------------------
>>>>>> Please see .kitchen/logs/kitchen.log for more details
>>>>>> Also try running `kitchen diagnose --all` for configuration

kitchen setup lwrps-centos-67  4,01s user 1,64s system 10% cpu 55,907 total
➜  rabbitmq-chef-cookbook git:(rmoriz/inspec) docker ps 
CONTAINER ID        IMAGE                               COMMAND                  CREATED              STATUS              PORTS                   NAMES
3fee546cdb24        a2abe14b87-lwrps-centos-67:latest   "/sbin/init"             About a minute ago   Up About a minute                           a2abe14b87-lwrps-centos-67
8491563dd99b        dokken/kitchen-cache:latest         "/usr/sbin/sshd -D..."   2 minutes ago        Up 2 minutes        0.0.0.0:10000->22/tcp   a2abe14b87-lwrps-centos-67-data

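Here is the error: the bind-mount Source is a path on my local machine. This does not work, because a bind mount is resolved on the machine running dockerd, not on the client. You cannot mount a client-side directory over the network with docker.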

➜  rabbitmq-chef-cookbook git:(rmoriz/inspec) docker inspect a2abe14b87-lwrps-centos-67 | jq '.[] | .Mounts'    
[
  {
    "Name": "3c7585f61567977478bd54eca17fd152aa3bd118a6444258f9003e116fed8015",
    "Source": "/srv/docker/docker/volumes/3c7585f61567977478bd54eca17fd152aa3bd118a6444258f9003e116fed8015/_data",
    "Destination": "/opt/chef",
    "Driver": "local",
    "Mode": "",
    "RW": true,
    "Propagation": ""
  },
  {
    "Type": "bind",
    "Source": "/Users/rmoriz/.dokken/kitchen_sandbox/a2abe14b87-lwrps-centos-67",
    "Destination": "/opt/kitchen",
    "Mode": "",
    "RW": true,
    "Propagation": ""
  },
  {
    "Type": "bind",
    "Source": "/Users/rmoriz/.dokken/verifier_sandbox/a2abe14b87-lwrps-centos-67",
    "Destination": "/opt/verifier",
    "Mode": "",
    "RW": true,
    "Propagation": ""
  },
  {
    "Type": "bind",
    "Source": "/sys/fs/cgroup",
    "Destination": "/sys/fs/cgroup",
    "Mode": "",
    "RW": true,
    "Propagation": ""
  }
]
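The check done by eye above can be scripted. The following is a minimal sketch, assuming the input is the JSON printed by `docker inspect <container>`; the helper name `client_side_binds` and the approach of matching on a client path prefix are illustrative assumptions, not part of kitchen-dokken.

```ruby
require 'json'

# Hypothetical helper: list bind mounts whose Source lies under a path that
# only exists on the client, e.g. the local home directory.
# Returns an array of [source, destination] pairs.
def client_side_binds(inspect_json, client_prefix)
  # docker inspect prints an array of containers, each with a "Mounts" list.
  mounts = JSON.parse(inspect_json).flat_map { |c| c['Mounts'] || [] }
  mounts
    .select { |m| m['Type'] == 'bind' && m['Source'].to_s.start_with?(client_prefix) }
    .map { |m| [m['Source'], m['Destination']] }
end
```

Given the Mounts output above and the prefix `/Users/rmoriz`, this would report the `/opt/kitchen` and `/opt/verifier` binds and skip both the `/opt/chef` volume and the `/sys/fs/cgroup` bind.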

Complete inspect:

```json
[
  {
    "Id": "3fee546cdb24f968b38aeeb25d904e31ea600759856a8c2c24ce926bcfc1c31e",
    "Created": "2017-04-24T21:10:00.053652424Z",
    "Path": "/sbin/init",
    "Args": [],
    "State": {
      "Status": "running", "Running": true, "Paused": false, "Restarting": false,
      "OOMKilled": false, "Dead": false, "Pid": 37882, "ExitCode": 0, "Error": "",
      "StartedAt": "2017-04-24T21:10:00.517261764Z",
      "FinishedAt": "0001-01-01T00:00:00Z"
    },
    "Image": "sha256:e7a0b2980929af7716dc8a0ca210b00e12b92c4ebd922e03dc1013808ce90ab9",
    "ResolvConfPath": "/srv/docker/docker/containers/3fee546cdb24f968b38aeeb25d904e31ea600759856a8c2c24ce926bcfc1c31e/resolv.conf",
    "HostnamePath": "/srv/docker/docker/containers/3fee546cdb24f968b38aeeb25d904e31ea600759856a8c2c24ce926bcfc1c31e/hostname",
    "HostsPath": "/srv/docker/docker/containers/3fee546cdb24f968b38aeeb25d904e31ea600759856a8c2c24ce926bcfc1c31e/hosts",
    "LogPath": "/srv/docker/docker/containers/3fee546cdb24f968b38aeeb25d904e31ea600759856a8c2c24ce926bcfc1c31e/3fee546cdb24f968b38aeeb25d904e31ea600759856a8c2c24ce926bcfc1c31e-json.log",
    "Name": "/a2abe14b87-lwrps-centos-67",
    "RestartCount": 0,
    "Driver": "aufs",
    "MountLabel": "",
    "ProcessLabel": "",
    "AppArmorProfile": "unconfined",
    "ExecIDs": null,
    "HostConfig": {
      "Binds": [
        "/Users/rmoriz/.dokken/kitchen_sandbox/a2abe14b87-lwrps-centos-67:/opt/kitchen",
        "/Users/rmoriz/.dokken/verifier_sandbox/a2abe14b87-lwrps-centos-67:/opt/verifier",
        "/sys/fs/cgroup:/sys/fs/cgroup"
      ],
      "ContainerIDFile": "",
      "LogConfig": { "Type": "json-file", "Config": {} },
      "NetworkMode": "bridge",
      "PortBindings": {},
      "RestartPolicy": { "Name": "", "MaximumRetryCount": 0 },
      "AutoRemove": false,
      "VolumeDriver": "",
      "VolumesFrom": [ "chef-current", "a2abe14b87-lwrps-centos-67-data" ],
      "CapAdd": [], "CapDrop": [], "Dns": null, "DnsOptions": null, "DnsSearch": null,
      "ExtraHosts": null, "GroupAdd": null, "IpcMode": "", "Cgroup": "", "Links": null,
      "OomScoreAdj": 0, "PidMode": "", "Privileged": true, "PublishAllPorts": false,
      "ReadonlyRootfs": false,
      "SecurityOpt": [ "label=disable" ],
      "UTSMode": "", "UsernsMode": "", "ShmSize": 67108864, "Runtime": "runc",
      "ConsoleSize": [ 0, 0 ],
      "Isolation": "", "CpuShares": 0, "Memory": 0, "NanoCpus": 0, "CgroupParent": "",
      "BlkioWeight": 0, "BlkioWeightDevice": null, "BlkioDeviceReadBps": null,
      "BlkioDeviceWriteBps": null, "BlkioDeviceReadIOps": null, "BlkioDeviceWriteIOps": null,
      "CpuPeriod": 0, "CpuQuota": 0, "CpuRealtimePeriod": 0, "CpuRealtimeRuntime": 0,
      "CpusetCpus": "", "CpusetMems": "", "Devices": null, "DeviceCgroupRules": null,
      "DiskQuota": 0, "KernelMemory": 0, "MemoryReservation": 0, "MemorySwap": 0,
      "MemorySwappiness": -1, "OomKillDisable": false, "PidsLimit": 0, "Ulimits": null,
      "CpuCount": 0, "CpuPercent": 0, "IOMaximumIOps": 0, "IOMaximumBandwidth": 0
    },
    "GraphDriver": { "Data": null, "Name": "aufs" },
    "Mounts": [
      {
        "Name": "3c7585f61567977478bd54eca17fd152aa3bd118a6444258f9003e116fed8015",
        "Source": "/srv/docker/docker/volumes/3c7585f61567977478bd54eca17fd152aa3bd118a6444258f9003e116fed8015/_data",
        "Destination": "/opt/chef",
        "Driver": "local", "Mode": "", "RW": true, "Propagation": ""
      },
      {
        "Type": "bind",
        "Source": "/Users/rmoriz/.dokken/kitchen_sandbox/a2abe14b87-lwrps-centos-67",
        "Destination": "/opt/kitchen",
        "Mode": "", "RW": true, "Propagation": ""
      },
      {
        "Type": "bind",
        "Source": "/Users/rmoriz/.dokken/verifier_sandbox/a2abe14b87-lwrps-centos-67",
        "Destination": "/opt/verifier",
        "Mode": "", "RW": true, "Propagation": ""
      },
      {
        "Type": "bind",
        "Source": "/sys/fs/cgroup",
        "Destination": "/sys/fs/cgroup",
        "Mode": "", "RW": true, "Propagation": ""
      }
    ],
    "Config": {
      "Hostname": "localhost", "Domainname": "", "User": "",
      "AttachStdin": false, "AttachStdout": false, "AttachStderr": false,
      "Tty": false, "OpenStdin": false, "StdinOnce": false,
      "Env": [ "PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin" ],
      "Cmd": [ "/sbin/init" ],
      "Image": "a2abe14b87-lwrps-centos-67:latest",
      "Volumes": null, "WorkingDir": "", "Entrypoint": null, "OnBuild": null,
      "Labels": { "build-date": "20170406", "license": "GPLv2", "name": "CentOS Base Image", "vendor": "CentOS" }
    },
    "NetworkSettings": {
      "Bridge": "",
      "SandboxID": "a8393dd511ea3a539f6973863f22e81bc1b56d85050d627860339e501d3eb822",
      "HairpinMode": false, "LinkLocalIPv6Address": "", "LinkLocalIPv6PrefixLen": 0,
      "Ports": {},
      "SandboxKey": "/var/run/docker/netns/a8393dd511ea",
      "SecondaryIPAddresses": null, "SecondaryIPv6Addresses": null,
      "EndpointID": "701b2c13efc0d1e253998bf36212df4ac4d7949e34622acb24b78e345b877fac",
      "Gateway": "172.17.0.1", "GlobalIPv6Address": "", "GlobalIPv6PrefixLen": 0,
      "IPAddress": "172.17.0.4", "IPPrefixLen": 16, "IPv6Gateway": "",
      "MacAddress": "02:42:ac:11:00:04",
      "Networks": {
        "bridge": {
          "IPAMConfig": null, "Links": null, "Aliases": null,
          "NetworkID": "be22d1fdbc61892ac75d7736cebf69016415e70037f83fd787908762b0ae1b51",
          "EndpointID": "701b2c13efc0d1e253998bf36212df4ac4d7949e34622acb24b78e345b877fac",
          "Gateway": "172.17.0.1", "IPAddress": "172.17.0.4", "IPPrefixLen": 16,
          "IPv6Gateway": "", "GlobalIPv6Address": "", "GlobalIPv6PrefixLen": 0,
          "MacAddress": "02:42:ac:11:00:04"
        }
      }
    }
  }
]
```

rmoriz commented 7 years ago

This code path never gets executed…

https://github.com/someara/kitchen-dokken/blob/master/lib/kitchen/transport/dokken.rb#L78-L131

Sadly, Travis does not allow running VirtualBox; otherwise we could easily use docker-machine to create a remote dockerd and expose the problem :(

someara commented 7 years ago

I'll try and think of a way to set up this condition in tests.

rmoriz commented 7 years ago

I tried to set up VirtualBox/Vagrant in Travis, but this didn't work because the VM does not provide hardware virtualization. :(

rmoriz commented 7 years ago

I've added a Docker-in-Docker instance which should expose the issue: https://travis-ci.org/someara/kitchen-dokken/builds/228435669

This nested setup is equivalent to a remote dockerd where no local volume mounts are possible: the main Travis-CI VM instance can't locally mount a volume into or from a container that runs inside the DIND container.

someara commented 7 years ago

The tests already run docker-in-docker.

https://github.com/someara/kitchen-dokken/blob/7fcb710e89fcc7a6b1643da4fe8e0d4506fa35e8/test/cookbooks/dokken_test/recipes/default.rb#L1-L4

... which works because we mount /var/lib/docker as a volume so it shows up as a block device

(kitchen login default, mount | grep docker) /dev/vda1 on /var/lib/docker type ext4 (rw,relatime,data=ordered)

https://github.com/someara/kitchen-dokken/blob/7fcb710e89fcc7a6b1643da4fe8e0d4506fa35e8/.kitchen.yml#L6

The code is then called explicitly with a tcp DOCKER_HOST here

https://github.com/someara/kitchen-dokken/blob/7fcb710e89fcc7a6b1643da4fe8e0d4506fa35e8/test/cookbooks/dokken_test/recipes/default.rb#L57-L65

rmoriz commented 7 years ago

But your setup is still a local docker instance, despite the use of tcp instead of a unix socket. See https://travis-ci.org/someara/kitchen-dokken/builds/228435669

someara commented 7 years ago

@rmoriz can you check if this is fixed up for you in the latest version? -s

rmoriz commented 7 years ago

@someara rebased master in the test PR #101 - looks like it's still failing

jjasghar commented 7 years ago

Running Docker in Docker, then running dokken inside that container to run kitchen (cough jenkins cough), seems to recreate this issue. If y'all need another tester/verifier, I'm here to help.

rmoriz commented 7 years ago

@jjasghar actually that is what PR #101 does

someara commented 7 years ago

I think I finally fixed this in 2.4.1. Does it work for y'all?

rmoriz commented 7 years ago

@someara can you merge PR #101, which should add test coverage for the remote scenario?

tas50 commented 6 years ago

Closing this out since we have the fix, but #101 will add a bit of testing, which we certainly want