chef-boneyard / chef-provisioning

A library for creating machines and infrastructures idempotently in Chef.
Apache License 2.0

Problem mixing 'machine_batch' with 'machine' #378

Open Maniacal opened 9 years ago

Maniacal commented 9 years ago

This is the test recipe:

chef_gem 'chef-provisioning-aws' do
  version '1.2.1'
  compile_time true if respond_to?(:compile_time)
end

require 'chef/provisioning'
require 'chef/provisioning/aws_driver'

iam = AWS::Core::CredentialProviders::EC2Provider.new

# Use IAM as our profile in the driver name
with_driver(
  'aws:IAM:us-east-1',
  aws_credentials: { 'IAM' => iam.credentials },
)

machine_batch do
  1.upto(2) do |i|
    machine "test-#{i}" do
      chef_environment 'development'
      machine_options(
        ssh_username: 'ubuntu',
        bootstrap_options: {
          key_name: 'ds-ts',
          image_id: 'ami-d05e75b8',
          security_group_ids: 'sg-3d806b5a',
          subnet_id: 'subnet-602b8f4b',
          instance_type: 't2.micro'
        }
      )
    end
  end
end

1.upto(2) do |i|
  machine "test-#{i}"
end

The 'machine_batch' block completes successfully, and Chef runs on each node with an empty run list:

<--snip--> 
    - [test-1] run 'bash -c ' bash /tmp/chef-install.sh'' on test-1
    [test-1] sudo: unable to resolve host ip-10-202-90-83
             Starting Chef Client, version 12.4.0
             resolving cookbooks for run list: []
             Synchronizing Cookbooks:
             Compiling Cookbooks...
             [2015-06-26T00:13:18+00:00] WARN: Node test-1 has an empty run list.
             Converging 0 resources

             Running handlers:
             Running handlers complete
             Chef Client finished, 0/0 resources updated in 1.282651139 seconds
<--snip-->

The next Chef run does not seem to have the 'machine_options'. I'm guessing this because the error shows that it can't find the 'key_name', so it falls back to the default value of 'chef_default':

  * machine[test-1] action converge

    ================================================================================
    Error executing action `converge` on resource 'machine[test-1]'
    ================================================================================

    AWS::EC2::Errors::UnauthorizedOperation
    ---------------------------------------
    aws_key_pair[chef_default] (basic_chef_client::block line 921) had an error: AWS::EC2::Errors::UnauthorizedOperation: You are not authorized to perform this operation.

I had assumed this was because the 'chef_provisioning' hash wasn't being saved in the node data, but as you can see below, the node has the right data on it (specifically, the right 'key_name').

Resulting node data for one of the nodes:

    "chef_provisioning": {
      "reference": {
        "driver_version": "1.2.1",
        "allocated_at": "2015-06-26 00:11:23 UTC",
        "host_node": "https://example.com/organizations/cf-data-services/nodes/",
        "image_id": "ami-d05e75b8",
        "instance_id": "i-1e5a6ab7",
        "key_name": "ds-ts",
        "ssh_username": "ubuntu"
      },
      "driver_url": "aws:IAM:us-east-1"
    }

If I remove the 'machine_batch ... end' wrapper and let the first converge happen serially, we don't see this issue.
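
For reference, a minimal sketch of that serial variant, using the same option values as the test recipe. The local names 'shared_options' and 'machine_names' are made up for illustration, and the 'respond_to?(:machine)' guard is an assumption added only so the snippet also loads outside a Chef run; in a real recipe it could be dropped. It assumes the 'chef_gem', 'require' and 'with_driver' lines from the test recipe come first.

```ruby
# Options shared by every machine (values copied from the test recipe).
shared_options = {
  ssh_username: 'ubuntu',
  bootstrap_options: {
    key_name: 'ds-ts',
    image_id: 'ami-d05e75b8',
    security_group_ids: 'sg-3d806b5a',
    subnet_id: 'subnet-602b8f4b',
    instance_type: 't2.micro'
  }
}

# "test-1" and "test-2", as in the original loop.
machine_names = (1..2).map { |i| "test-#{i}" }

# Declare each machine directly instead of inside 'machine_batch',
# so the first converge happens one machine at a time.
# The guard is hypothetical: it skips the DSL calls when this file is
# loaded outside a Chef recipe, where 'machine' is not defined.
if respond_to?(:machine)
  machine_names.each do |name|
    machine name do
      chef_environment 'development'
      machine_options shared_options
    end
  end
end
```

The trade-off is that the machines are no longer created in parallel. The later bare 'machine "test-#{i}"' loop stays as it is in the test recipe.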

tyler-ball commented 9 years ago

Hey @Maniacal - thanks for the bug report. I added this to the list of bugs to fix for the EPIC https://github.com/chef/chef-provisioning/issues/429