digitalocean / droplet_kit

DropletKit is the official DigitalOcean API client for Ruby.
MIT License

ssh_keys not working on droplet create #80

Open shortdudey123 opened 8 years ago

shortdudey123 commented 8 years ago

I am trying to create a droplet, however, the ssh_keys array does not appear to get passed along during creation

[20] pry(main)> client.ssh_keys.find(id: 1234567)
=> <DropletKit::SSHKey {:@id=>1234567, :@fingerprint=>"XX:XX:XX:XX:XX:XX:XX:XX:XX:XX:XX:XX:XX:XX:XX:XX", :@public_key=>"ssh-rsa XXXXXXXXXXXX", :@name=>"ssh key name"}>
[21] pry(main)> droplet
=> <DropletKit::Droplet {:@name=>"new-droplet", :@size=>"2gb", :@image=>"centos-7-0-x64", :@region=>"nyc1", :@ssh_keys=>["1234567"], :@private_networking=>true, :@backups=>nil, :@ipv6=>nil, :@id=>nil, :@memory=>nil, :@vcpus=>nil, :@disk=>nil, :@locked=>nil, :@created_at=>nil, :@status=>nil, :@backup_ids=>nil, :@snapshot_ids=>nil, :@action_ids=>nil, :@features=>nil, :@networks=>nil, :@kernel=>nil, :@size_slug=>nil, :@user_data=>nil}>
[22] pry(main)> server = client.droplets.create(droplet)
=> <DropletKit::Droplet {:@name=>"new-droplet", :@size=>nil, :@image=><DropletKit::Image {:@id=>14782842, :@name=>"7.1 x64", :@distribution=>"CentOS", :@slug=>"centos-7-0-x64", :@public=>true, :@regions=>["nyc1", "sfo1", "nyc2", "ams2", "sgp1", "lon1", "nyc3", "ams3", "fra1", "tor1"], :@type=>"snapshot"}>, :@region=><DropletKit::Region {:@slug=>"nyc1", :@name=>"New York 1", :@sizes=>["512mb", "8gb", "16gb", "32gb", "48gb", "64gb", "1gb", "2gb", "4gb"], :@available=>true, :@features=>["private_networking", "backups", "ipv6", "metadata"]}>, :@ssh_keys=>nil, :@private_networking=>nil, :@backups=>nil, :@ipv6=>nil, :@id=>11162726, :@memory=>2048, :@vcpus=>2, :@disk=>40, :@locked=>true, :@created_at=>"2016-02-16T20:01:36Z", :@status=>"new", :@backup_ids=>[], :@snapshot_ids=>[], :@action_ids=>nil, :@features=>["virtio"], :@networks=>#<struct DropletKit::NetworkHash v4=[], v6=[]>, :@kernel=><DropletKit::Kernel {:@id=>6028, :@name=>"CentOS 7 x64 vmlinuz-3.10.0-229.20.1.el7.x86_64", :@version=>"3.10.0-229.20.1.el7.x86_64"}>, :@size_slug=>"2gb", :@user_data=>nil}>
[23] pry(main)> server.ssh_keys
=> nil
[24] pry(main)> 

phillbaker commented 8 years ago

Ah, @shortdudey123 thanks for opening an issue. We don't include the ssh keys that a droplet was created with in the response, which is why it returns nil. Since ssh keys can be modified on the host after a droplet is created, the keys that a droplet was created with may quickly (even immediately) be out of date.

However, the keys should have been used to actually create the droplet - with valid ids/keys does that call work?

shortdudey123 commented 8 years ago

does that call work?

No, the ssh_keys array does not get passed on to the droplet for creation. I am unable to ssh to the new droplet. If I create one through the UI using the same ssh key, it works.

CloudCowboyCo commented 8 years ago

Hey @shortdudey123,

This is what I use to pull my premade keys from DO. https://gist.github.com/CloudCowboyCo/5eb34bec6ecc9a68f249 I couldn't tell if you were trying to hand off the key on creation.

If you are trying to add a key that you haven't added to DO previously you could accomplish this inside the userdata section of the droplet kit.

edit: put the wrong username :P

shortdudey123 commented 8 years ago

@CloudCowboyCo the ssh key I am trying to pass in the array already exists on DO. That is verified by the first line of the output in my original post :)

CloudCowboyCo commented 8 years ago

Sorry, I saw the value as nil when returned. I should have read more closely. This is how I accomplish spinning up a droplet with my keys already on the system https://github.com/CloudCowboyCo/do-cocaine/blob/master/droplet_deploy.rb Let me know if this helps you out.

shortdudey123 commented 8 years ago

Thanks, I am using the knife-digital_ocean gem to create the droplets (I already ruled that out, since I know the ssh key is being passed on to the droplet_kit gem).

phillbaker commented 8 years ago

@shortdudey123 our logs show that the API request was received with a ssh key and that the droplet was created with it.

It might help to open a ticket with some of the details (https://cloud.digitalocean.com/support/tickets/new) and we can help you debug further.

shortdudey123 commented 8 years ago

Done, thanks

jwadolowski commented 8 years ago

Hey @shortdudey123 - did you find out what was the problem? Out of the blue I started to see exactly the same issues with kitchen-digitalocean (it uses droplet_kit under the hood).

shortdudey123 commented 8 years ago

@jwadolowski nope, by the time DigitalOcean support got back to me I was unable to reproduce the issue

phillbaker commented 8 years ago

@jwadolowski what are the symptoms you're seeing? Don't believe this is a droplet kit specific issue: we've verified that DropletKit is passing the correct params and Droplets are being created with ssh keys.

Note that after a droplet (VM) is created, it may take some time for the OS and sshd to boot. If you wait for a bit and retry a connection to a droplet does that ever work?
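
To make the "wait for a bit and retry" advice concrete, here is a minimal Ruby sketch of polling a droplet's SSH port before attempting a login. The helper name `wait_for_port` and its defaults are hypothetical (they are not part of droplet_kit); the 3-second delay only mirrors the retry interval seen in the Test Kitchen output later in this thread.

```ruby
require 'socket'

# Hypothetical helper (not part of droplet_kit): returns true once a TCP
# connection to host:port succeeds, or false after `attempts` tries.
def wait_for_port(host, port, attempts: 20, delay: 3, connect_timeout: 5)
  attempts.times do |n|
    begin
      # Socket.tcp closes the socket when the block returns.
      Socket.tcp(host, port, connect_timeout: connect_timeout) { |_sock| }
      return true
    rescue SystemCallError, SocketError, IOError
      # Refused, timed out, unreachable, or unresolved: wait and try again.
      sleep(delay) if n < attempts - 1
    end
  end
  false
end
```

Usage would look like `wait_for_port('203.0.113.10', 22)` (a placeholder address). An open port only shows that sshd is listening; it says nothing about whether the key was installed.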

shortdudey123 commented 8 years ago

One thing I had thought about doing, but never did, was to do a root password reset, then use that to log in through the web shell interface and see if the pub key is in the ~/.ssh/authorized_keys file
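
Once the contents of ~/.ssh/authorized_keys are in hand, the check itself can be scripted. The following is a rough sketch, assuming the usual one-key-per-line layout; `key_authorized?` is a hypothetical helper, and it compares only the key type and base64 blob, ignoring the trailing comment.

```ruby
# Hypothetical helper: true when the key material of `public_key` appears
# on some line of an authorized_keys file's contents.
def key_authorized?(authorized_keys_text, public_key)
  wanted = public_key.split[0, 2] # [key type, base64 blob]; comment ignored
  return false if wanted.length < 2

  authorized_keys_text.each_line.any? do |line|
    next false if line.strip.empty? || line.lstrip.start_with?('#')

    # Options may precede the key type, so look for the pair anywhere.
    line.split.each_cons(2).any? { |pair| pair == wanted }
  end
end
```

For an empty file, like the one reported later in this thread, the helper returns false.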

jwadolowski commented 8 years ago

@phillbaker actually symptoms are the same as originally reported by @shortdudey123 - SSH key is not passed through to my droplet.

It's definitely not related to boot time or sshd startup: retry is already implemented in test-kitchen itself, and it has worked fine for months.

To narrow it down a little bit I wrote and executed this (it is essentially what happens in kitchen-digitalocean):

require 'droplet_kit'

client = DropletKit::Client.new(access_token: ENV['DIGITALOCEAN_ACCESS_TOKEN'])
client.ssh_keys.find(id: ENV['DIGITALOCEAN_SSH_KEY_IDS'])
droplet = DropletKit::Droplet.new(
    name: 'droplet-kit-test',
    image: 'centos-7-0-x64',
    size: '1gb',
    region: 'fra1',
    ssh_key_ids: ENV['DIGITALOCEAN_SSH_KEY_IDS'],
    private_networking: true, 
    ipv6: false
)
d = client.droplets.create(droplet)
d.ssh_keys

Last line returns nil.

Env variables:

DIGITALOCEAN_ACCESS_TOKEN=xxxxxx
DIGITALOCEAN_SSH_KEY_IDS=yyyyyy

jwadolowski commented 8 years ago

Good point @shortdudey123. I've just logged in and root's ~/.ssh/authorized_keys file is empty.

$ ssh root@46.101.X.X
root@46.101.X.X's password: 
You are required to change your password immediately (root enforced)
Last login: Mon Mar  7 19:38:59 2016 from X.Y.Z
Changing password for root.
(current) UNIX password: 
New password: 
Retype new password: 
[root@droplet-kit-test ~]# cat .ssh/authorized_keys 

[root@droplet-kit-test ~]#

phillbaker commented 8 years ago

Thanks for checking the authorized keys file @jwadolowski, that's helpful. I can escalate that issue with our internal team.

Note that we don't include the ssh keys that a droplet was created with in the response, which is why .ssh_keys returns nil. Since ssh keys can be modified on the host after a droplet is created, the keys that a droplet was created with may quickly (even immediately) be out of date.

jwadolowski commented 8 years ago

Thanks @phillbaker! Looking forward to an update.

shortdudey123 commented 8 years ago

Reopening due to @jwadolowski being able to replicate this issue

phillbaker commented 8 years ago

@jwadolowski @shortdudey123 some follow up that would help us debug:

jwadolowski commented 8 years ago

Are there any indications in the cloud-init logs on the droplets that something there might be amiss?

Unfortunately not. Here's the cloud-init.log from a brand new droplet I've just created: https://gist.github.com/jwadolowski/6819d85d0bf8fd725d76

Does this happen on any distro other than centos?

Yes, just used debian-7-0-x64 - same effect. However I've noticed it's quite hard to reproduce on centos-6-5-x64. Haven't seen any errors for this distro yet.

Are there any reports of it happening outside of a client library - does this happen if you curl these requests?

Didn't check that, but will do shortly.

Can you give an example of how you're using kitchen-digitalocean or knife-digital_ocean? A reproduction case would help us debug.

When it comes to kitchen-digitalocean this is the shortest way to reproduce this issue:

---
<% chef_versions = %w( 12 ) %>
<% platforms = %w( centos-6-5-x64 centos-7-0-x64 debian-7-0-x64 ) %>

driver:
  name: digitalocean

provisioner:
  name: chef_zero

platforms:
<% platforms.each do |p| %>
<%   chef_versions.each do |chef_version| %>
  - name: <%= p %>-chef-<%= chef_version %>
    driver:
      image: <%= p %>
    driver_config:
      region: fra1
      size: 1gb
      require_chef_omnibus: <%= chef_version %>
<%   end %>
<% end %>

suites:
  - name: default
    run_list:
      - recipe[do_test::default]

Right after that please execute kitchen converge default-centos-7-0-x64-chef-12 - it will create a brand new droplet.

Output I see at the moment:

$ kitchen converge default-centos-7-0-x64-chef-12
-----> Starting Kitchen (v1.4.2)
-----> Creating <default-centos-7-0-x64-chef-12>...
       Digital Ocean instance <11814927> created.
       Waiting for SSH service on 46.101.X.Y:22, retrying in 3 seconds
       Waiting for SSH service on 46.101.X.Y:22, retrying in 3 seconds
       Waiting for SSH service on 46.101.X.Y:22, retrying in 3 seconds
       Waiting for SSH service on 46.101.X.Y:22, retrying in 3 seconds
root@46.101.X.Y's password: 

Correct one looks like this:

$ kitchen converge default-centos-6-5-x64-chef-12
-----> Starting Kitchen (v1.4.2)
-----> Creating <default-centos-6-5-x64-chef-12>...
       Digital Ocean instance <11814990> created.
       Waiting for SSH service on 46.101.X.Y1:22, retrying in 3 seconds
       Waiting for SSH service on 46.101.X.Y1:22, retrying in 3 seconds
       Waiting for SSH service on 46.101.X.Y1:22, retrying in 3 seconds
       [SSH] Established
       (ssh ready)

       Finished creating <default-centos-6-5-x64-chef-12> (0m57.88s).
-----> Converging <default-centos-6-5-x64-chef-12>...
$$$$$$ Running legacy converge for 'Digitalocean' Driver
       Preparing files for transfer
       Preparing dna.json
       Preparing current project directory as a cookbook
       Removing non-cookbook files before transfer
       Preparing validation.pem
       Preparing client.rb
-----> Installing Chef Omnibus (12)
       Downloading https://www.chef.io/chef/install.sh to file /tmp/install.sh
       Trying curl...
       Download complete.
       Getting information for chef stable 12 for el...
       downloading https://omnitruck-direct.chef.io/stable/chef/metadata?v=12&p=el&pv=6&m=x86_64
         to file /tmp/install.sh.1317/metadata.txt
       trying curl...
       url  https://opscode-omnibus-packages.s3.amazonaws.com/el/6/x86_64/chef-12.7.2-1.el6.x86_64.rpm
       md5  8c3ba2e797fc852fc557b0e7157556cc
       sha256   6af0eb1c7706fc6a36f74ae9f590135e37e6206f2fe7d5a1760c1e2da1b36068
       downloaded metadata file looks valid...
       downloading https://opscode-omnibus-packages.s3.amazonaws.com/el/6/x86_64/chef-12.7.2-1.el6.x86_64.rpm
         to file /tmp/install.sh.1317/chef-12.7.2-1.el6.x86_64.rpm
       trying curl...
...

The really strange thing is that it behaves non-deterministically. Sometimes it works, sometimes it doesn't. I've been using the same flow for months (if not years now) without any issues. Yesterday, out of the blue, it just stopped working.

If you're able to reproduce this again, opening a ticket in our support system for that specific droplet and calling out that it's related to "ssh keys not appearing to root authorized_keys file" will help us tag the droplet and investigate more.

Actually I've raised one already (ticket ID: 966628) but it refers to droplet that no longer exists. Just created another one for affected droplets that are still running (ticket ID: 968368).

jwadolowski commented 8 years ago

Are there any reports of it happening outside of a client library - does this happen if you curl these requests?

Here's the curl command I just executed:

curl \
    -X POST \
    -H "Content-Type: application/json" \
    -H "Authorization: Bearer ${DIGITALOCEAN_ACCESS_TOKEN}" \
    -d "{ \"name\":\"curl-test\", \"region\":\"fra1\", \"size\":\"1gb\", \"image\":\"centos-7-0-x64\",\"ssh_keys\":[\"${DIGITALOCEAN_SSH_KEY_IDS}\"], \"ipv6\":false, \"private_networking\":true}" \
    "https://api.digitalocean.com/v2/droplets"

and surprisingly... it works perfectly fine.
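
For comparison, the same request body can be built in Ruby without hand-escaped quotes. This is only a sketch: `droplet_payload` is a hypothetical helper, and the field names and default values are copied from the curl command above rather than from any API reference.

```ruby
require 'json'

# Hypothetical helper: builds the JSON body used in the curl example.
# Field names and defaults are taken from that command.
def droplet_payload(name:, image:, ssh_key_ids:, region: 'fra1', size: '1gb')
  JSON.generate(
    name: name,
    region: region,
    size: size,
    image: image,
    ssh_keys: Array(ssh_key_ids).map(&:to_s), # always an array, as in the curl body
    ipv6: false,
    private_networking: true
  )
end
```

Calling `droplet_payload(name: 'curl-test', image: 'centos-7-0-x64', ssh_key_ids: ENV['DIGITALOCEAN_SSH_KEY_IDS'])` yields a string suitable for curl's `-d` argument. Note that the key is sent under `ssh_keys`.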

phillbaker commented 8 years ago

I've been able to reproduce this issue using centos-7-0-x64 in fra1.

Something we've been digging into is looking at /var/log/cloud-init.log on boxes that fail to ssh properly containing a warning similar to util.py[WARNING]: Getting data from <class 'cloudinit.sources.DataSourceDigitalOcean.DataSourceDigitalOcean'> failed. We've verified that in these cases we don't see requests to the metadata service, which might point to a problem with Centos' cloud-init implementation itself.
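
A quick way to check a droplet for that warning is to scan the log text for it. The sketch below assumes the log has already been read into a string (for example from /var/log/cloud-init.log); `datasource_failures` is a hypothetical helper and the pattern is derived only from the warning quoted above.

```ruby
# Pattern based on the warning text quoted in this thread.
DATASOURCE_WARNING =
  /\[WARNING\]: Getting data from .*DataSourceDigitalOcean.* failed/

# Hypothetical helper: returns the log lines that contain the warning.
def datasource_failures(log_text)
  log_text.each_line.select { |line| line.match?(DATASOURCE_WARNING) }.map(&:chomp)
end
```

An empty result does not prove the key was installed; it only rules out this particular symptom.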

Since both reports are on Centos 7, can you try reproducing this on other versions of Centos or other distributions?

jwadolowski commented 8 years ago

Sure thing. Will get back to you with test results as soon as possible.

jwadolowski commented 8 years ago

Just wrote a couple of scripts to make testing easier.

Here's droplet_kit code I used to create 21 droplets:

require 'droplet_kit'

amount = 3
images = %w(
  centos-5-8-x64
  centos-6-5-x64
  centos-7-0-x64
  fedora-22-x64
  debian-7-0-x64
  debian-8-x64
  ubuntu-14-04-x64
)

images.each do |image|
  for i in 1..amount
    client = DropletKit::Client.new(
      access_token: ENV['DIGITALOCEAN_ACCESS_TOKEN']
    )
    droplet = DropletKit::Droplet.new(
        name: "droplet-kit-#{image}-#{i}",
        image: image,
        size: '512mb',
        region: 'fra1',
        ssh_key_ids: ENV['DIGITALOCEAN_SSH_KEY_IDS'],
        private_networking: true,
        ipv6: false
    )
    client.droplets.create(droplet)
  end
end

Script I've used for testing purposes:

#!/usr/bin/env bash

tugboat droplets | cut -d' ' -f1,3 | tr -d ',' | while read line; do
    droplet_name=$(echo ${line} | cut -d' ' -f1)
    droplet_ip=$(echo ${line} | cut -d' ' -f2)
    ssh_out=$(ssh -o StrictHostKeyChecking=no -oBatchMode=yes -l root $droplet_ip 'uname -a' 2>/dev/null; echo $?)

    echo "${droplet_name}: $ssh_out"
done

According to ssh man page:

ssh exits with the exit status of the remote command or with 255 if an error occurred.
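
Read that way, the numbers printed by the test script fall into three groups. A small Ruby sketch of the interpretation follows; `describe_ssh_exit` and the symbol names are made up for illustration.

```ruby
# Hypothetical helper: maps an ssh exit status to a coarse outcome,
# following the man page sentence quoted above.
def describe_ssh_exit(code)
  case code
  when 0   then :ok                    # remote command ran and succeeded
  when 255 then :ssh_error             # ssh itself failed (e.g. auth refused)
  else          :remote_command_failed # ssh connected, remote command exited non-zero
  end
end
```

With BatchMode=yes, ssh cannot fall back to a password prompt, so a droplet without the expected key would be expected to show up as 255 in this scheme. A remote command that itself exits 255 would be indistinguishable from an ssh failure.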

Output:

$ ./tester.sh
droplet-kit-centos-5-8-x64-1: 255
droplet-kit-centos-5-8-x64-2: 255
droplet-kit-centos-5-8-x64-3: 255
droplet-kit-centos-6-5-x64-1: 255
droplet-kit-centos-6-5-x64-2: 255
droplet-kit-centos-6-5-x64-3: 255
droplet-kit-centos-7-0-x64-1: 255
droplet-kit-centos-7-0-x64-2: 255
droplet-kit-centos-7-0-x64-3: 255
droplet-kit-fedora-22-x64-1: 255
droplet-kit-fedora-22-x64-2: 255
droplet-kit-fedora-22-x64-3: 255
droplet-kit-debian-7-0-x64-1: 255
droplet-kit-debian-7-0-x64-2: 255
droplet-kit-debian-7-0-x64-3: 255
droplet-kit-debian-8-x64-1: 255
droplet-kit-debian-8-x64-2: 255
droplet-kit-debian-8-x64-3: 255
droplet-kit-ubuntu-14-04-x64-1: 255
droplet-kit-ubuntu-14-04-x64-2: 255
droplet-kit-ubuntu-14-04-x64-3: 255

Unfortunately my SSH key was not injected into any of these droplets. To be 100% sure I've cherry picked a few of them, but my script was right:

$ ssh root@46.101.X.8
root@46.101.X.8's password:

$ ssh root@46.101.Y.91
root@46.101.Y.91's password:

$ ssh root@46.101.Z.240
root@46.101.Z.240's password:

jwadolowski commented 8 years ago

Same procedure, but pure curl this time:

#!/usr/bin/env bash

images=("centos-5-8-x64" "centos-6-5-x64" "centos-7-0-x64" "fedora-22-x64" "debian-7-0-x64" "debian-8-x64" "ubuntu-14-04-x64")

for i in ${images[@]}; do
    curl \
        -X POST \
        -H "Content-Type: application/json" \
        -H "Authorization: Bearer ${DIGITALOCEAN_ACCESS_TOKEN}" \
        -d "{ \"name\":\"curl-${i}\", \"region\":\"fra1\", \"size\":\"512mb\", \"image\":\"${i}\",\"ssh_keys\":[\"${DIGITALOCEAN_SSH_KEY_IDS}\"], \"ipv6\":false, \"private_networking\":true}" \
        "https://api.digitalocean.com/v2/droplets"
done

Had to update my test script a little bit to prevent ssh from exiting too early:

#!/usr/bin/env bash

tugboat droplets | cut -d' ' -f1,3 | tr -d ',' | while read line; do
    droplet_name=$(echo ${line} | cut -d' ' -f1)
    droplet_ip=$(echo ${line} | cut -d' ' -f2)
    ssh_out=$(ssh -n -o StrictHostKeyChecking=no -oBatchMode=yes -l root $droplet_ip 'uname -a' 2>/dev/null >/dev/null; echo $?)

    echo "${droplet_name}: $ssh_out"
done

Final output:

$ ./tester.sh
droplet-kit-centos-5-8-x64-1: 255
droplet-kit-centos-5-8-x64-2: 255
droplet-kit-centos-5-8-x64-3: 255
droplet-kit-centos-6-5-x64-1: 255
droplet-kit-centos-6-5-x64-2: 255
droplet-kit-centos-6-5-x64-3: 255
droplet-kit-centos-7-0-x64-1: 255
droplet-kit-centos-7-0-x64-2: 255
droplet-kit-centos-7-0-x64-3: 255
droplet-kit-fedora-22-x64-1: 255
droplet-kit-fedora-22-x64-2: 255
droplet-kit-fedora-22-x64-3: 255
droplet-kit-debian-7-0-x64-1: 255
droplet-kit-debian-7-0-x64-2: 255
droplet-kit-debian-7-0-x64-3: 255
droplet-kit-debian-8-x64-1: 255
droplet-kit-debian-8-x64-2: 255
droplet-kit-debian-8-x64-3: 255
droplet-kit-ubuntu-14-04-x64-1: 255
droplet-kit-ubuntu-14-04-x64-2: 255
droplet-kit-ubuntu-14-04-x64-3: 255
curl-centos-5-8-x64: 0
curl-centos-6-5-x64: 0
curl-centos-7-0-x64: 0
curl-fedora-22-x64: 0
curl-debian-7-0-x64: 0
curl-debian-8-x64: 0
curl-ubuntu-14-04-x64: 0

Did some manual login attempts to confirm that's true and indeed I was able to log in to every single droplet created by curl:

$ ssh root@46.101.X.54
[root@curl-centos-6-5-x64 ~]# logout
Connection to 46.101.X.54 closed.
$ ssh root@46.101.Y.239
Last login: Wed Mar  9 16:52:45 2016 from x.x.x
[root@curl-centos-7-0-x64 ~]# logout
Connection to 46.101.Y.239 closed.
$ ssh root@46.101.Z.190
Welcome to Ubuntu 14.04.4 LTS (GNU/Linux 3.13.0-79-generic x86_64)

 * Documentation:  https://help.ubuntu.com/

  System information as of Wed Mar  9 16:51:49 EST 2016

  System load:  0.0               Processes:           68
  Usage of /:   7.8% of 19.56GB   Users logged in:     0
  Memory usage: 11%               IP address for eth0: 46.101.Z.190
  Swap usage:   0%                IP address for eth1: 10.135.8.142

  Graph this data and manage this system at:
    https://landscape.canonical.com/

0 packages can be updated.
0 updates are security updates.

root@curl-ubuntu-14-04-x64:~# logout
Connection to 46.101.Z.190 closed.

shortdudey123 commented 8 years ago

@jwadolowski glad you were able to reproduce this! thanks for doing the extended testing to verify it

phillbaker commented 8 years ago

@jwadolowski I think the parameter name is ssh_keys not ssh_key_ids?

So this looks like it wouldn't work?

ssh_key_ids: ENV['DIGITALOCEAN_SSH_KEY_IDS'],

Would you mind updating and trying one more time?
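
Because the earlier runs with ssh_key_ids raised no error, a misspelled key like this is easy to miss. One defensive option is to validate the attribute hash in calling code before building the droplet. The sketch below is an assumption-laden illustration, not DropletKit behaviour: `check_droplet_attrs!` is hypothetical, and the allowed list contains only the creation attributes that appear in this thread.

```ruby
# Assumption: only the creation attributes visible in this thread's
# examples; the authoritative list lives in the library and API docs.
KNOWN_DROPLET_KEYS = %i[
  name region size image ssh_keys backups ipv6 private_networking user_data
].freeze

# Hypothetical guard: raises on keys outside the list, returns attrs otherwise.
def check_droplet_attrs!(attrs)
  unknown = attrs.keys.map(&:to_sym) - KNOWN_DROPLET_KEYS
  unless unknown.empty?
    raise ArgumentError, "unknown droplet attribute(s): #{unknown.join(', ')}"
  end
  attrs
end
```

The validated hash could then be passed along, e.g. `DropletKit::Droplet.new(check_droplet_attrs!(attrs))`, so that a typo such as ssh_key_ids fails loudly instead of producing a droplet without keys.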

jwadolowski commented 8 years ago

Sorry about that, too much copying and pasting.

Updated Ruby code now contains correct hash key and mimics the same logic as in kitchen-digitalocean:

require 'droplet_kit'

amount = 3
images = %w(
  centos-5-8-x64
  centos-6-5-x64
  centos-7-0-x64
  fedora-22-x64
  debian-7-0-x64
  debian-8-x64
  ubuntu-14-04-x64
)

images.each do |image|
  for i in 1..amount
    client = DropletKit::Client.new(
      access_token: ENV['DIGITALOCEAN_ACCESS_TOKEN']
    )
    droplet = DropletKit::Droplet.new(
        name: "droplet-kit-#{image}-#{i}",
        image: image,
        size: '512mb',
        region: 'fra1',
        ssh_keys: ENV['DIGITALOCEAN_SSH_KEY_IDS'].to_s.split(/, ?/),
        private_networking: true,
        ipv6: false
    )
    client.droplets.create(droplet)
  end
end
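
The `.to_s.split(/, ?/)` step matters because ssh_keys has to be an array, while the environment variable is a single string. A small sketch of that conversion in isolation, with hypothetical helper names, plus a slightly more tolerant variant that also trims stray whitespace:

```ruby
# Same expression as in the script above, wrapped for reuse.
# `env` is injectable so the helper can be exercised without touching ENV.
def ssh_key_ids_from_env(env = ENV, var = 'DIGITALOCEAN_SSH_KEY_IDS')
  env[var].to_s.split(/, ?/)
end

# Variant (an assumption, not from the thread): trims whitespace around
# each ID and drops empty entries.
def ssh_key_ids_from_env_tolerant(env = ENV, var = 'DIGITALOCEAN_SSH_KEY_IDS')
  env[var].to_s.split(',').map(&:strip).reject(&:empty?)
end
```

An unset variable gives an empty array in both versions, so the droplet would be created without any keys rather than raising.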

I ran that a few times, and it worked every time:

$ ./tester.sh
droplet-kit-centos-5-8-x64-1: 0
droplet-kit-centos-5-8-x64-2: 0
droplet-kit-centos-5-8-x64-3: 0
droplet-kit-centos-6-5-x64-1: 0
droplet-kit-centos-6-5-x64-2: 0
droplet-kit-centos-6-5-x64-3: 0
droplet-kit-centos-7-0-x64-1: 0
droplet-kit-centos-7-0-x64-2: 0
droplet-kit-centos-7-0-x64-3: 0
droplet-kit-fedora-22-x64-1: 0
droplet-kit-fedora-22-x64-2: 0
droplet-kit-fedora-22-x64-3: 0
droplet-kit-debian-7-0-x64-1: 0
droplet-kit-debian-7-0-x64-2: 0
droplet-kit-debian-7-0-x64-3: 0
droplet-kit-debian-8-x64-1: 0
droplet-kit-debian-8-x64-2: 0
droplet-kit-debian-8-x64-3: 0
droplet-kit-ubuntu-14-04-x64-1: 0
droplet-kit-ubuntu-14-04-x64-2: 0
droplet-kit-ubuntu-14-04-x64-3: 0
curl-centos-5-8-x64: 0
curl-centos-6-5-x64: 0
curl-centos-7-0-x64: 0
curl-fedora-22-x64: 0
curl-debian-7-0-x64: 0
curl-debian-8-x64: 0
curl-ubuntu-14-04-x64: 0

Did yet another try and created droplet using test kitchen:

$ kitchen converge server-centos-7-0-x64-chef-12 -l debug
<cut>
-----> Creating <server-centos-7-0-x64-chef-12>...
D      digitalocean:name servercentos70x-kuba-nysa-awv1te7
D      digitalocean:imagecentos-7-0-x64
D      digitalocean:size 1gb
D      digitalocean:region fra1
D      digitalocean:ssh_key_ids <SSH_KEY_ID>
D      digitalocean:private_networking true
D      digitalocean:ipv6 false
D      digitalocean:user_data 
D      digitalocean_api_key <API_KEY>
       Digital Ocean instance <DROPLET_ID> created.
D      digitalocean_api_key <API_KEY>
D      digitalocean_api_key <API_KEY>
D      [SSH] opening connection to root@46.101.X.Y<{:user_known_hosts_file=>"/dev/null", :paranoid=>false, :port=>"22", :compression=>true, :compression_level=>6, :keepalive=>true, :keepalive_interval=>60, :timeout=>15}>
D      [SSH] connection failed (#<Timeout::Error: execution expired>)
       Waiting for SSH service on 46.101.X.Y:22, retrying in 3 seconds
D      [SSH] opening connection to root@46.101.X.Y<{:user_known_hosts_file=>"/dev/null", :paranoid=>false, :port=>"22", :compression=>true, :compression_level=>6, :keepalive=>true, :keepalive_interval=>60, :timeout=>15, :user=>"root"}>
D      [SSH] connection failed (#<Timeout::Error: execution expired>)
       Waiting for SSH service on 46.101.X.Y:22, retrying in 3 seconds
D      [SSH] opening connection to root@46.101.X.Y<{:user_known_hosts_file=>"/dev/null", :paranoid=>false, :port=>"22", :compression=>true, :compression_level=>6, :keepalive=>true, :keepalive_interval=>60, :timeout=>15, :user=>"root"}>
D      [SSH] connection failed (#<Errno::ECONNREFUSED: Connection refused - connect(2) for "46.101.X.Y" port 22>)
       Waiting for SSH service on 46.101.X.Y:22, retrying in 3 seconds
D      [SSH] opening connection to root@46.101.X.Y<{:user_known_hosts_file=>"/dev/null", :paranoid=>false, :port=>"22", :compression=>true, :compression_level=>6, :keepalive=>true, :keepalive_interval=>60, :timeout=>15, :user=>"root"}>
root@46.101.X.Y's password:

Decided to run my test just for this droplet and this happened:

$ ./tester.sh 
servercentos70x-kuba-nysa-sjvlc0c: 0
$ ssh root@46.101.X.Y
Last login: Thu Mar 10 12:35:53 2016 from x.x.x.x
[root@servercentos70x-kuba-nysa-sjvlc0c ~]#

Since it happens mostly on centos-7-0-x64 I gave it yet another try (30 droplets, just CentOS 7):

$ ./tester.sh 
droplet-kit-centos-7-0-x64-1: 0
droplet-kit-centos-7-0-x64-2: 0
droplet-kit-centos-7-0-x64-3: 0
droplet-kit-centos-7-0-x64-4: 0
droplet-kit-centos-7-0-x64-5: 0
droplet-kit-centos-7-0-x64-6: 0
droplet-kit-centos-7-0-x64-7: 0
droplet-kit-centos-7-0-x64-8: 0
droplet-kit-centos-7-0-x64-9: 0
droplet-kit-centos-7-0-x64-10: 0
droplet-kit-centos-7-0-x64-11: 0
droplet-kit-centos-7-0-x64-12: 255
droplet-kit-centos-7-0-x64-13: 0
droplet-kit-centos-7-0-x64-14: 0
droplet-kit-centos-7-0-x64-15: 255
droplet-kit-centos-7-0-x64-16: 0
droplet-kit-centos-7-0-x64-17: 0
droplet-kit-centos-7-0-x64-18: 0
droplet-kit-centos-7-0-x64-19: 0
droplet-kit-centos-7-0-x64-20: 0
droplet-kit-centos-7-0-x64-21: 0
droplet-kit-centos-7-0-x64-22: 0
droplet-kit-centos-7-0-x64-23: 0
droplet-kit-centos-7-0-x64-24: 0
droplet-kit-centos-7-0-x64-25: 0
droplet-kit-centos-7-0-x64-26: 0
droplet-kit-centos-7-0-x64-27: 255
droplet-kit-centos-7-0-x64-28: 0
droplet-kit-centos-7-0-x64-29: 0
droplet-kit-centos-7-0-x64-30: 0
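
In the run above, 3 of the 30 droplets (numbers 12, 15 and 27) returned 255, i.e. a 10% failure rate. For larger batches the tally can be automated; the sketch below assumes the `name: code` line format printed by tester.sh, and `tally_results` is a hypothetical helper.

```ruby
# Hypothetical helper: summarises "name: exit_code" lines from tester.sh.
def tally_results(output)
  results = output.each_line
                  .map { |line| line.strip.match(/\A(\S+): (\d+)\z/) }
                  .compact
  failed = results.reject { |m| m[2] == '0' }.map { |m| m[1] }
  rate = results.empty? ? 0.0 : failed.size.to_f / results.size
  { total: results.size, failed: failed, failure_rate: rate }
end
```

Lines that do not match the format, such as the shell prompt line, are ignored.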

This means something may be wrong with test-kitchen or kitchen-digitalocean, but at the same time it also proves that even bare droplet_kit can fail sometimes.

The fact we've been using that for months without issues concerns me most :) How did it happen that a few days ago we just stumbled upon such problems?

Will let kitchen-digitalocean maintainer know about that.

phillbaker commented 8 years ago

@jwadolowski thanks for the update! We're investigating issues around cloud-init and centos7 specifically, so there may be something there.

jwadolowski commented 8 years ago

@phillbaker, I did a few more tests (results are available here) and indeed it seems to be CentOS 7 specific.

jwadolowski commented 8 years ago

@phillbaker did you guys manage to track this bug down? It's kinda annoying that we have to retry a couple of times before we get a usable droplet.

jwadolowski commented 7 years ago

Hi @phillbaker,

I managed to partially solve the problem (at least the part related to Test Kitchen DigitalOcean driver), but something's still wrong (possibly with cloud-init). You can find all the details here: https://github.com/test-kitchen/kitchen-digitalocean/issues/45#issuecomment-303981007

Droplet that got stuck is still running and I'm going to raise a ticket in DO portal, so you can take a look at that.

phillbaker commented 7 years ago

@jwadolowski, thanks for the followup. @kitschysynq is probably the best person to contact about that.