test-kitchen / kitchen-docker

A Test Kitchen Driver for Docker
Apache License 2.0

SSH connection failed #40

Open dgivens opened 10 years ago

dgivens commented 10 years ago

I'm running into an issue where it appears test-kitchen isn't waiting long enough before connecting over SSH when I use /sbin/init as my run_command. If I wait a couple of seconds, I can then ssh into the instance. Is this something that can be addressed by kitchen-docker, or are there any workarounds I might employ?

I'm trying to solve the problem of runit not starting when the package has been installed. On Debian systems, which I'm testing on, it starts via the inittab and the runit cookbook handles this by issuing a telinit once the package is installed.

$ sudo kitchen converge
-----> Starting Kitchen (v1.2.1)
-----> Creating <default-debian-74>...
       Step 0 : FROM dgivens/wheezy
        ---> f8c92d987c7a
       Step 1 : ENV DEBIAN_FRONTEND noninteractive
        ---> Using cache
        ---> 47cd72727fd3
       Step 2 : RUN dpkg-divert --local --rename --add /sbin/initctl
        ---> Using cache
        ---> 68a0adca627d
       Step 3 : RUN ln -sf /bin/true /sbin/initctl
        ---> Using cache
        ---> 22aa8539b9a5
       Step 4 : RUN apt-get update
        ---> Using cache
        ---> e7a978183de9
       Step 5 : RUN apt-get install -y sudo openssh-server curl lsb-release
        ---> Using cache
        ---> c3c4b40c5f5f
       Step 6 : RUN mkdir -p /var/run/sshd
        ---> Using cache
        ---> 44186f2bdfc1
       Step 7 : RUN useradd -d /home/kitchen -m -s /bin/bash kitchen
        ---> Using cache
        ---> 7bd45f37b4a4
       Step 8 : RUN echo kitchen:kitchen | chpasswd
        ---> Using cache
        ---> b20a2a304c97
       Step 9 : RUN echo 'kitchen ALL=(ALL) NOPASSWD:ALL' >> /etc/sudoers
        ---> Using cache
        ---> bec122d32801
       Successfully built bec122d32801
       e79dce6f1a2b245a4314ece432944f0e93f33783b6e75d76783cbed778b2b652
       [{
           "ID": "e79dce6f1a2b245a4314ece432944f0e93f33783b6e75d76783cbed778b2b652",
           "Created": "2014-03-07T13:59:25.340911399Z",
           "Path": "/sbin/init",
           "Args": [],
           "Config": {
        "Hostname": "e79dce6f1a2b",
        "Domainname": "",
        "User": "",
        "Memory": 0,
        "MemorySwap": 0,
        "CpuShares": 0,
        "AttachStdin": false,
        "AttachStdout": false,
        "AttachStderr": false,
        "PortSpecs": null,
        "ExposedPorts": {
            "22/tcp": {}
        },
        "Tty": false,
        "OpenStdin": false,
        "StdinOnce": false,
        "Env": [
            "HOME=/",
            "PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin",
            "DEBIAN_FRONTEND=noninteractive"
        ],
        "Cmd": [
            "/sbin/init"
        ],
        "Dns": null,
        "Image": "bec122d32801",
        "Volumes": null,
        "VolumesFrom": "",
        "WorkingDir": "",
        "Entrypoint": null,
        "NetworkDisabled": false,
        "OnBuild": null
           },
           "State": {
        "Running": true,
        "Pid": 12513,
        "ExitCode": 0,
        "StartedAt": "2014-03-07T13:59:25.491002705Z",
        "FinishedAt": "0001-01-01T00:00:00Z",
        "Ghost": false
           },
           "Image": "bec122d328011c78d9fe642c0b8d858894c9a5271da51b465831a6e718c935a2",
           "NetworkSettings": {
        "IPAddress": "172.17.0.2",
        "IPPrefixLen": 16,
        "Gateway": "172.17.42.1",
        "Bridge": "docker0",
        "PortMapping": null,
        "Ports": {
            "22/tcp": [
                {
                    "HostIp": "0.0.0.0",
                    "HostPort": "49195"
                }
            ]
        }
           },
           "ResolvConfPath": "/etc/resolv.conf",
           "HostnamePath": "/var/lib/docker/containers/e79dce6f1a2b245a4314ece432944f0e93f33783b6e75d76783cbed778b2b652/hostname",
           "HostsPath": "/var/lib/docker/containers/e79dce6f1a2b245a4314ece432944f0e93f33783b6e75d76783cbed778b2b652/hosts",
           "Name": "/dreamy_davinci5",
           "Driver": "aufs",
           "Volumes": {},
           "VolumesRW": {},
           "HostConfig": {
        "Binds": null,
        "ContainerIDFile": "",
        "LxcConf": [],
        "Privileged": false,
        "PortBindings": {
            "22/tcp": [
                {
                    "HostIp": "0.0.0.0",
                    "HostPort": "49195"
                }
            ]
        },
        "Links": null,
        "PublishAllPorts": false
           }
       }]
       Finished creating <default-debian-74> (0m0.55s).
-----> Converging <default-debian-74>...
       Preparing files for transfer
       Resolving cookbook dependencies with Berkshelf 2.0.14...
       Removing non-cookbook files before transfer
       Preparing data bags
       Preparing environments
       Preparing encrypted data bag secret
       [SSH] connection failed, retrying (#<Net::SSH::Disconnect: connection closed by remote host>)
       [SSH] connection failed, retrying (#<Net::SSH::Disconnect: connection closed by remote host>)
$$$$$$ [SSH] connection failed, terminating (#<Net::SSH::Disconnect: connection closed by remote host>)
>>>>>> Converge failed on instance <default-debian-74>.
>>>>>> Please see .kitchen/logs/default-debian-74.log for more details
>>>>>> ------Exception-------
>>>>>> Class: Kitchen::ActionFailed
>>>>>> Message: connection closed by remote host
>>>>>> ----------------------
daniel.givens@jenkins-n02:~/fusion$ ssh kitchen@localhost -p 49195
The authenticity of host '[localhost]:49195 ([127.0.0.1]:49195)' can't be established.
ECDSA key fingerprint is 84:80:c6:6b:86:bd:47:ed:35:53:0c:e2:99:07:bd:99.
Are you sure you want to continue connecting (yes/no)? yes
Warning: Permanently added '[localhost]:49195' (ECDSA) to the list of known hosts.
kitchen@localhost's password: 
Linux e79dce6f1a2b 3.11.0-13-generic #20-Ubuntu SMP Wed Oct 23 07:38:26 UTC 2013 x86_64

The programs included with the Debian GNU/Linux system are free software;
the exact distribution terms for each program are described in the
individual files in /usr/share/doc/*/copyright.

Debian GNU/Linux comes with ABSOLUTELY NO WARRANTY, to the extent
permitted by applicable law.
kitchen@e79dce6f1a2b:~$ 
dgivens commented 10 years ago

I'm finding that test-kitchen with vagrant-lxc is a better fit.

portertech commented 10 years ago

@fnichol can we increase the number of SSH attempts for asset scp, or make it configurable?

portertech commented 10 years ago

Hmm, wait_for_ssh() should be avoiding this issue.
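
For readers following along: the log above shows the port accepting connections that are then closed by the remote host. That would be consistent with something answering on the mapped port before sshd is ready, though this is an assumption and not something the thread confirms. A stricter readiness check can be sketched as below: it waits for the server's SSH identification string instead of a bare TCP connect. The helper names `ssh_banner?` and `wait_for_ssh_banner` are hypothetical and are not part of Test Kitchen or kitchen-docker.

```ruby
require "socket"
require "timeout"

# Hypothetical helper, not Test Kitchen's API: returns true only if the
# server sends a line starting with "SSH-" (e.g. "SSH-2.0-OpenSSH_6.0").
# A port that accepts the connection and then closes it yields false.
def ssh_banner?(host, port, timeout: 3)
  Timeout.timeout(timeout) do
    Socket.tcp(host, port, connect_timeout: timeout) do |sock|
      line = sock.gets # nil when the peer closes without sending anything
      !line.nil? && line.start_with?("SSH-")
    end
  end
rescue Timeout::Error, SystemCallError, SocketError, IOError
  false
end

# Poll until the banner shows up, pausing between attempts.
# Returns true on success, false once all attempts are used up.
def wait_for_ssh_banner(host, port, attempts: 20, pause: 1.0)
  attempts.times do
    return true if ssh_banner?(host, port)
    sleep pause
  end
  false
end
```

With the container from the log above, a call such as `wait_for_ssh_banner("localhost", 49195)` would block until sshd answers or the attempts run out.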

damm commented 10 years ago

Isn't there some bit about Ubuntu restarting sshd at startup? That could be why the connection is closed by the remote host.

dustinmm80 commented 10 years ago

I'm also running into this issue, running on Jenkins:

       Finished converging <building-tagger-ubuntu-1204> (4m30.00s).
-----> Setting up <building-tagger-ubuntu-1204>...
       [SSH] connection failed, retrying (#<Net::SSH::Disconnect: connection closed by remote host>)
       [SSH] connection failed, retrying (#<Net::SSH::Disconnect: connection closed by remote host>)
$$$$$$ [SSH] connection failed, terminating (#<Net::SSH::Disconnect: connection closed by remote host>)
fnichol commented 10 years ago

After looking into this, it looks like the code that establishes an SSH connection in Test Kitchen retries but doesn't pause between attempts (unlike the #wait_for_ssh logic). As a result @portertech and I have been testing out https://github.com/test-kitchen/test-kitchen/pull/399 today with reasonable success. I'd like to add a bit more configuration for Driver authors and then merge it into Test Kitchen core.
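
The difference between retrying immediately and retrying with a pause can be sketched as follows. This is only an illustration of the idea, not the code from the pull request; `with_retries` and its `attempts:`, `pause:` and `on:` parameters are hypothetical names.

```ruby
# Hypothetical helper, not Test Kitchen's API: run the block, and on a listed
# error sleep for `pause` seconds before trying again, up to `attempts` tries.
def with_retries(attempts: 3, pause: 1.0, on: [StandardError])
  tries = 0
  begin
    tries += 1
    yield tries
  rescue *on
    raise if tries >= attempts # out of attempts: re-raise the last error
    sleep pause                # give the remote service time to come up
    retry
  end
end

# Usage: a stand-in for a connection that fails twice, then succeeds.
result = with_retries(attempts: 5, pause: 0.01, on: [IOError]) do |n|
  raise IOError, "connection closed by remote host" if n < 3
  :connected
end
```

Without the `sleep`, all attempts can be spent within a few milliseconds, which matches the behaviour described above where a manual ssh a couple of seconds later succeeds.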

bplunkert commented 10 years ago

I too am experiencing this issue, and I believe test-kitchen/test-kitchen/pull/454 may offer a workaround and/or solution if it is merged.

vitalis commented 10 years ago

I'm facing the same issue. Did someone find a solution?

Yserz commented 9 years ago

Is it possible that this issue is not with SSH itself but with SCP? Can you try copying something into the container with SCP, or installing openssh-server and openssh-client on the container (SCP is in the client package)?

xacaxulu commented 9 years ago

+1

azazi-sa commented 7 years ago

+1

-----> Starting Kitchen (v1.15.0)
-----> Creating <default-centos-73>...
       Sending build context to Docker daemon 24.58 MB
       Step 1/7 : FROM centos:7
        ---> 67591570dd29
       Step 2/7 : MAINTAINER "msameera" <sameer@mail.com>
        ---> Using cache
        ---> dfeeb2440e5a
       Step 3/7 : ENV container docker
        ---> Using cache
        ---> 3cdba08c07a6
       Step 4/7 : EXPOSE 32773
        ---> Running in 583109b9c836
        ---> 779a403e4c47
       Removing intermediate container 583109b9c836
       Step 5/7 : RUN (cd /lib/systemd/system/sysinit.target.wants/; for i in *; do [ $i == systemd-tmpfiles-setup.service ] || rm -f $i; done); rm -f /lib/systemd/system/multi-user.target.wants/*;rm -f /etc/systemd/system/*.wants/*;rm -f /lib/systemd/system/local-fs.target.wants/*; rm -f /lib/systemd/system/sockets.target.wants/*udev*; rm -f /lib/systemd/system/sockets.target.wants/*initctl*; rm -f /lib/systemd/system/basic.target.wants/*;rm -f /lib/systemd/system/anaconda.target.wants/*;
        ---> Running in 2451fba617e7
        ---> 2bcb0ff84da1
       Removing intermediate container 2451fba617e7
       Step 6/7 : VOLUME /sys/fs/cgroup
        ---> Running in 0166c6ef6be3
        ---> 509d6f7a7309
       Removing intermediate container 0166c6ef6be3
       Step 7/7 : CMD /usr/sbin/init
        ---> Running in fd5db50d5441
        ---> 2e0bda625ddc
       Removing intermediate container fd5db50d5441
       Successfully built 2e0bda625ddc
       f65e491a3cd55b21cdf56f0bbc308776ae82f0ad9fa7256fcf3d74d2a53e276e
       0.0.0.0:32774
       Waiting for SSH service on localhost:32774, retrying in 3 seconds
       Waiting for SSH service on localhost:32774, retrying in 3 seconds
       Waiting for SSH service on localhost:32774, retrying in 3 seconds
       Waiting for SSH service on localhost:32774, retrying in 3 seconds
       Waiting for SSH service on localhost:32774, retrying in 3 seconds
       Waiting for SSH service on localhost:32774, retrying in 3 seconds
       Waiting for SSH service on localhost:32774, retrying in 3 seconds
       Waiting for SSH service on localhost:32774, retrying in 3 seconds
       Waiting for SSH service on localhost:32774, retrying in 3 seconds
       Waiting for SSH service on localhost:32774, retrying in 3 seconds
       Waiting for SSH service on localhost:32774, retrying in 3 seconds
       Waiting for SSH service on localhost:32774, retrying in 3 seconds
       Waiting for SSH service on localhost:32774, retrying in 3 seconds
       Waiting for SSH service on localhost:32774, retrying in 3 seconds
       Waiting for SSH service on localhost:32774, retrying in 3 seconds
       Waiting for SSH service on localhost:32774, retrying in 3 seconds
       Waiting for SSH service on localhost:32774, retrying in 3 seconds
       Waiting for SSH service on localhost:32774, retrying in 3 seconds
       Waiting for SSH service on localhost:32774, retrying in 3 seconds
       Waiting for SSH service on localhost:32774, retrying in 3 seconds
       Waiting for SSH service on localhost:32774, retrying in 3 seconds
       Waiting for SSH service on localhost:32774, retrying in 3 seconds
       Waiting for SSH service on localhost:32774, retrying in 3 seconds
       Waiting for SSH service on localhost:32774, retrying in 3 seconds
       Waiting for SSH service on localhost:32774, retrying in 3 seconds
       Waiting for SSH service on localhost:32774, retrying in 3 seconds
       Waiting for SSH service on localhost:32774, retrying in 3 seconds
       Waiting for SSH service on localhost:32774, retrying in 3 seconds
       Waiting for SSH service on localhost:32774, retrying in 3 seconds
       Waiting for SSH service on localhost:32774, retrying in 3 seconds
       Waiting for SSH service on localhost:32774, retrying in 3 seconds
       Waiting for SSH service on localhost:32774, retrying in 3 seconds
       Waiting for SSH service on localhost:32774, retrying in 3 seconds
       Waiting for SSH service on localhost:32774, retrying in 3 seconds
       Waiting for SSH service on localhost:32774, retrying in 3 seconds
       Waiting for SSH service on localhost:32774, retrying in 3 seconds
       Waiting for SSH service on localhost:32774, retrying in 3 seconds
       Waiting for SSH service on localhost:32774, retrying in 3 seconds
       Waiting for SSH service on localhost:32774, retrying in 3 seconds
       Waiting for SSH service on localhost:32774, retrying in 3 seconds
       Waiting for SSH service on localhost:32774, retrying in 3 seconds
       Waiting for SSH service on localhost:32774, retrying in 3 seconds
       Waiting for SSH service on localhost:32774, retrying in 3 seconds
       Waiting for SSH service on localhost:32774, retrying in 3 seconds
jcalonso commented 7 years ago

I'm having the same problem and I found a workaround:

rgarrigue commented 5 years ago

I have the same:

       Waiting for SSH service on localhost:32776, retrying in 3 seconds
       Waiting for SSH service on localhost:32776, retrying in 3 seconds
       Waiting for SSH service on localhost:32776, retrying in 3 seconds

I got rid of it by commenting out run_command: /lib/systemd/systemd.

The follow-up problem is, of course, that no services are running, so I can't test my stuff properly :-(

EugenMayer commented 5 years ago

I have the same issue when starting systemd. It seems the default startup command, which run_command: /bin/systemd replaces, is what handles the SSH server setup, so that setup is missing otherwise.

Since I require upstart / systemd, Docker is a roadblock for me in https://github.com/EugenMayer/chef-tinc-cookbook/blob/master/.kitchen.docker.yml

rgarrigue commented 5 years ago

Here's our kitchen.yml, which includes our workaround for this issue: the first provision command. The second one fixes a testinfra-related issue and may be useful for other tools relying on /sbin/init.

---
driver:
  name: docker
  use_sudo: false
  provision_command:
    - rm /lib/systemd/system/ssh.service
    - '[ ! -f /sbin/init ] && ln -s /lib/systemd/systemd /sbin/init || true'
  run_command: /bin/systemd
  privileged: true
  volume:
    - "/sys/fs/cgroup:/sys/fs/cgroup:ro"
  dns:
    - 1.1.1.1
    - 9.9.9.9

transport:
  name: sftp

platforms:
  - name: stretch
    driver_config:
      image: jrei/systemd-debian:9
      platform: debian
  - name: buster
    driver_config:
      image: jrei/systemd-debian:10
      platform: debian

suites:
  - name: nitrogen
    provisioner:
      salt_bootstrap_options: -X -p git -x python2.7 stable 2017.7
  - name: fluorine
    provisioner:
      salt_bootstrap_options: -X -p git -x python2.7 stable 2019.2

provisioner:
  name: salt_solo
  salt_install: bootstrap
  is_file_root: true
  require_chef: true
  salt_copy_filter:
    - .git
  dependencies:
  - name: common
    repo: git
    source: https://gitlab+deploy-token-6:REDACTED@gitlab.REDACTED/salt/common-formula
    branch: dev
  state_top:
    base:
      "*":
        - bender
  pillars_from_files:
    pillar.sls: test/pillar.sls
  pillars:
    top.sls:
      base:
        "*":
          - pillar

verifier:
  name: shell
  remote_exec: false
  command: pytest --junitxml=test/${KITCHEN_INSTANCE}_test_report.xml --html=test/${KITCHEN_INSTANCE}_test_report.html --self-contained-html --color=yes --host="docker://root@${KITCHEN_CONTAINER_ID}" "test/integration/"
EugenMayer commented 5 years ago

I fixed it just now by doing:

  run_command: /bin/systemd
  provision_command:
    - apt-get install systemd -y
  disable_upstart: false