chef-boneyard / chef-provisioning

A library for creating machines and infrastructures idempotently in Chef.
Apache License 2.0

SocketError getaddrinfo: nodename nor servname provided, or not known #330

Open Joseph-R opened 9 years ago

Joseph-R commented 9 years ago

Chef Zero runs fail with the error SocketError getaddrinfo: nodename nor servname provided, or not known.

Just before that, the log shows INFO: HTTP Request Returned 404 Not Found : Object not found: http://localhost:8889/nodes/tester, where tester is my machine name.

See full stack trace here.

To reproduce:

1) Install Chef DK release candidate 5
2) Run chef-client -z simple.rb, where simple.rb looks something like the recipe posted below.

Recipe:

require 'chef/provisioning/aws_driver'

with_driver 'aws::us-east-1'

# Public subnet
machine 'tester' do
  tag 'test_successful'
  recipe 'hdp::hello_world'
  machine_options :bootstrap_options => {
    :key_name => 'korrelate2012',
    :image_id => 'ami-b23009da',
    :subnet => 'subnet-aebab1da',
    :security_groups => [ 'hortonworks_chef' ],
    :instance_type => 'm3.medium',
    :associate_public_ip_address => true
  }
end

knife.rb

current_dir = File.dirname(__FILE__)
log_level                :info
log_location             STDOUT
node_name                "jreid.local"
chef_zero.enabled    true
local_mode       true
client_key               "/Users/jreid/.ssh/korrelate2012.pem"
chef_server_url          "http://localhost:8889"
cache_type               'BasicFile'
cache_options( :path => "#{ENV['HOME']}/.chef/checksums" )
cookbook_path            [ "/Users/jreid/repo/chef-repo/cookbooks", "/Users/jreid/repo/chef-repo/site-cookbooks", "/Users/jreid/repo/chef-repo/berks-cookbooks" ]

# Amazon AWS
knife[:aws_access_key_id] = ENV['AWS_ACCESS_KEY_ID']
knife[:aws_secret_access_key] = ENV['AWS_SECRET_ACCESS_KEY']

# Bootstrap with Chef-Zero
knife[:ssh_user] = "ec2-user"
chef_zero[:port] = "8889"

Versions: (Chef DK 0.5.0 release candidate 5 for Mac OS X, per https://github.com/chef/chef-provisioning/issues/322)

chef (12.2.1)
chef-dk (0.5.0.rc.5)
chef-provisioning (1.0.1)
chef-provisioning-aws (1.0.4, 1.0.3)
chef-provisioning-azure (0.3.2)
chef-provisioning-fog (0.13.2)
chef-provisioning-vagrant (0.8.3)
chef-vault (2.4.0)
chef-zero (4.2.1, 1.5.6)
cheffish (1.1.2, 1.1.0)
chefspec (4.2.0)

It looks like it's not creating the node resource it needs on the Chef-Zero instance before it attempts to execute the rest of the resource. Can anyone confirm/deny?

@tyler-ball suggested changing with_driver from aws to aws::us-east-1. Is there anything else in the recipe itself which needs to change?

jkeiser commented 9 years ago

@JoeReid-Korrelate can you post the stack trace that appears at /Users/jreid/repo/chef-repo/.chef/local-mode-cache/cache/chef-stacktrace.out ? It would help to tell from whence that error comes.

tyler-ball commented 9 years ago

@JoeReid-Korrelate Can you try commenting out chef_zero[:port] = "8889" and chef_server_url "http://localhost:8889" from your knife.rb?

Semisonic8100 commented 9 years ago

Hey guys! I appreciate your input.

@jkeiser - Sure. Full output posted here.

@tyler-ball - Still getting the 404 after commenting out those two lines in my knife.rb

Current knife.rb:

current_dir = File.dirname(__FILE__)
log_level                :info
log_location             STDOUT
node_name                "jreid.local"
chef_zero.enabled    true
local_mode       true
client_key               "/Users/jreid/.ssh/korrelate2012.pem"
# validation_client_name   "chef-validator"
# validation_key           "#{current_dir}/ORGANIZATION-validator.pem"
#chef_server_url          "http://localhost:8889"
cache_type               'BasicFile'
cache_options( :path => "#{ENV['HOME']}/.chef/checksums" )
cookbook_path            [ "/Users/jreid/repo/chef-repo/cookbooks", "/Users/jreid/repo/chef-repo/site-cookbooks", "/Users/jreid/repo/chef-repo/berks-cookbooks" ]

# Amazon AWS
knife[:aws_access_key_id] = ENV['AWS_ACCESS_KEY_ID']
knife[:aws_secret_access_key] = ENV['AWS_SECRET_ACCESS_KEY']

# Client version
#knife[:bootstrap_version] = "12.2.1"

# Bootstrap with Chef-Zero
knife[:ssh_user] = "ec2-user"
#chef_zero[:port] = "8889"

And here is the stack trace from that run. No notable changes. Anything else in my knife config that could be causing this?

tyler-ball commented 9 years ago

Ah! Okay, looking at the stack trace, I don't think the issue is chef-zero. The issue seems to be connecting to the AWS endpoint.

This block sets up the AWS configuration used. Could you add a require 'pry'; binding.pry statement after that line and look at the AWS config object? Perhaps it is getting a bad value for region or some other setting. Send me a direct message on Gitter if you would like to get together to troubleshoot this.

Semisonic8100 commented 9 years ago

@tyler-ball - Done.

From: /Users/jreid/.chefdk/gem/ruby/2.1.0/gems/chef-provisioning-aws-1.0.4/lib/chef/provisioning/aws_driver/driver.rb @ line 59 Chef::Provisioning::AWSDriver::Driver#initialize:

    45: def initialize(driver_url, config)
    46:   super
    47:
    48:   _, profile_name, region = driver_url.split(':')
    49:   profile_name = nil if profile_name && profile_name.empty?
    50:   region = nil if region && region.empty?
    51:
    52:   credentials = profile_name ? aws_credentials[profile_name] : aws_credentials.default
    53:   @aws_config = AWS.config(
    54:     access_key_id:     credentials[:aws_access_key_id],
    55:     secret_access_key: credentials[:aws_secret_access_key],
    56:     region: region || credentials[:region],
    57:     logger: Chef::Log.logger
    58:   )
 => 59:   require 'pry'; binding.pry
    60: end

[1] pry(#<Chef::Provisioning::AWSDriver::Driver>)> region
=> "us-east-1"
[2] pry(#<Chef::Provisioning::AWSDriver::Driver>)> credentials[:region]
=> "us-east-1a"
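
For reference, the split(':') call in the listing above can be exercised on its own. The following is a minimal sketch, not the driver's code: parse_driver_url is a hypothetical helper, and the fallback region is passed in as a plain argument instead of being read from an AWS credentials file.

```ruby
# Hypothetical helper that mirrors the parsing steps shown in the pry listing.
# It returns the profile name and the region that would end up being used.
def parse_driver_url(driver_url, fallback_region = nil)
  # 'aws::us-east-1' splits into ["aws", "", "us-east-1"]
  _, profile_name, region = driver_url.split(':')
  profile_name = nil if profile_name && profile_name.empty?
  region = nil if region && region.empty?
  # The region from the URL wins; the fallback only applies when it is absent.
  { profile_name: profile_name, region: region || fallback_region }
end

p parse_driver_url('aws::us-east-1', 'us-east-1a')
p parse_driver_url('aws', 'us-east-1a')
```

With a URL of aws::us-east-1 the region comes from the URL, so the credentials[:region] value of us-east-1a is ignored. With a bare aws URL there is no region in the URL, and the fallback value (here us-east-1a) is what gets used.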

Afterwards, I modified the driver from simple.rb to use aws::us-east-1a instead of aws::us-east-1. Now the value of region and credentials[:region] match.

require 'chef/provisioning/aws_driver'

with_driver 'aws::us-east-1a'

# Public subnet
machine 'tester' do
  tag 'test_successful'
  recipe 'hdp::hello_world'
  machine_options :bootstrap_options => {
    :key_name => 'korrelate2012',
    :image_id => 'ami-b23009da',
    :subnet => 'subnet-aebab1da',
    :security_groups => [ 'hortonworks_chef' ],
    :instance_type => 'm3.medium',
    :associate_public_ip_address => true
  }
end

New pry output:

From: /Users/jreid/.chefdk/gem/ruby/2.1.0/gems/chef-provisioning-aws-1.0.4/lib/chef/provisioning/aws_driver/driver.rb @ line 59 Chef::Provisioning::AWSDriver::Driver#initialize:

    45: def initialize(driver_url, config)
    46:   super
    47:
    48:   _, profile_name, region = driver_url.split(':')
    49:   profile_name = nil if profile_name && profile_name.empty?
    50:   region = nil if region && region.empty?
    51:
    52:   credentials = profile_name ? aws_credentials[profile_name] : aws_credentials.default
    53:   @aws_config = AWS.config(
    54:     access_key_id:     credentials[:aws_access_key_id],
    55:     secret_access_key: credentials[:aws_secret_access_key],
    56:     region: region || credentials[:region],
    57:     logger: Chef::Log.logger
    58:   )
 => 59:   require 'pry'; binding.pry
    60: end

[1] pry(#<Chef::Provisioning::AWSDriver::Driver>)> region
=> "us-east-1a"
[2] pry(#<Chef::Provisioning::AWSDriver::Driver>)> credentials[:region]
=> "us-east-1a"

I also verified that credentials[:aws_access_key_id] and credentials[:aws_secret_access_key] have values. I do note, though, that the aws_secret_access_key I'm using for my IAM account (with super user permissions) has a / in it, which has given us issues before.

Any reason to suspect that's the culprit?

Semisonic8100 commented 9 years ago

Swapped with_driver 'aws::us-east-1a' back to with_driver 'aws::us-east-1' in simple.rb, per our conversation on Gitter.

From @tyler-ball : do you have anything in your credentials which specifies endpoint or anything?

Nothing in my credentials that wasn't posted above. However, I did recently hit an issue with DNS redirects to an AWS CNAME, where I got nodename nor servname provided, or not known when attempting to SSH to an external FQDN in our VPC. So I think you may be onto something about the endpoint.

How can I fish the AWS endpoint that the code is trying to connect to out of the run?
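
One way to approach this outside of a Chef run is to build the hostname by hand and try to resolve it. The sketch below assumes the SDK targets the standard regional EC2 hostname pattern ec2.<region>.amazonaws.com. That pattern and the helper names ec2_hostname and resolvable? are assumptions for illustration, not something read out of the driver.

```ruby
require 'socket'

# Assumed hostname pattern for a regional EC2 endpoint.
def ec2_hostname(region)
  "ec2.#{region}.amazonaws.com"
end

# Returns true if the name resolves. A failed lookup raises SocketError,
# the same exception class reported in this issue, which is rescued here.
def resolvable?(hostname)
  Addrinfo.getaddrinfo(hostname, 443)
  true
rescue SocketError
  false
end
```

Calling resolvable?(ec2_hostname(region)) with the region value seen in pry should show whether the name the run would contact can be looked up from this machine.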

tyler-ball commented 9 years ago

@Semisonic8100 We talked again on Gitter, and it seems like the problem may be because the AWS key has a slash in it. What direction was the slash? I would like to try and repro and then file a bug with AWS if this is the case - we just pass the credentials directly to their Ruby SDK.

tyler-ball commented 9 years ago

Seems like this is a known issue - @Semisonic8100 found http://stackoverflow.com/questions/14681938/invalid-hostname-error-when-connecting-to-s3-sink-when-using-secret-key-having-f

jgoulah commented 8 years ago

am currently seeing this (both solo and server mode)

jgoulah commented 8 years ago

update: I was getting this from passing region "us-east-1a" instead of "us-east-1"
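
The mix-up in the last comment is easy to make, because an availability zone name such as us-east-1a is a region name with a zone letter appended. A minimal sketch of a guard against it follows. region_from is a hypothetical helper, and the regular expression is a heuristic based on the usual region naming, not an official AWS rule.

```ruby
# Heuristic: an availability zone looks like a region plus one lowercase letter,
# for example "us-east-1" (region) versus "us-east-1a" (availability zone).
AZ_PATTERN = /\A([a-z]{2}(?:-[a-z]+)+-\d+)[a-z]\z/

# Returns the region part if the name looks like an availability zone,
# otherwise returns the name unchanged.
def region_from(name)
  match = AZ_PATTERN.match(name)
  match ? match[1] : name
end

puts region_from('us-east-1a') # prints "us-east-1"
puts region_from('us-east-1')  # prints "us-east-1"
```

A check like this could be run on the value passed to with_driver, or on the region in the credentials file, before the run starts.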