chef-boneyard / chef-provisioning

A library for creating machines and infrastructures idempotently in Chef.
Apache License 2.0
524 stars 163 forks source link

machine_batch with embedded loops and dynamic machine_options causes unexpected results #438

Open redterror opened 9 years ago

redterror commented 9 years ago

First, apologies if this is less than coherent as I've gone slightly insane (and pulled out half my beard) debugging this.

I'm trying to batch-create 3 servers, each in a different AWS subnet, using the aws driver. This machine-in-machine_batch style produces the correct log output from the box_opts call, but the log output from the driver's run_instances call is identical for each server:

rabbits = []
machine_batch "rabbit cluster" do
  1.upto(3) do |i|
    tmp = machine "#{label}-east-rabbit0#{i}" do
      role              'base'
      role              'rabbitmq-v2'
      chef_environment  'perf2'
      machine_options   lazy { MyBoxConfig.box_opts('rabbit', i, label, subnets) }
    end
    rabbits.push(tmp)
  end
  action    :converge
end

This one does work:

rabbits = []
1.upto(3) do |i|
  tmp = machine "#{label}-east-rabbit0#{i}" do
    role              'base'
    role              'rabbitmq-v2'
    chef_environment  'perf2'
    machine_options   lazy { MyBoxConfig.box_opts('rabbit', i, label, subnets) }
    action            :ready
  end
  rabbits.push(tmp)
end

machine_batch "rabbit cluster" do
  action    :converge_only
  machines  rabbits
end

I tried without the lazy evaluation as well, but that didn't change anything. What additional details can I provide to help debug this? Is this a known limitation / issue with machine_batch? Am I doing something wrong?

tyler-ball commented 9 years ago

Hey @redterror - thanks for the bug report. I added this to the list of bugs to fix for the EPIC #429

There is something super weird and buggy going on with machine_batch where it uses the same machine_options for all the machines in the batch. Its definitely a bug that we need to fix.

redterror commented 9 years ago

Thanks. I attempted to make a failing spec for this, and was disappointed when things worked right! Here's my modification to machine_batch_spec.rb that passes where I expected it to fail:

require 'spec_helper'

describe Chef::Resource::MachineBatch do
  extend AWSSupport

  when_the_chef_12_server "exists", organization: 'foo', server_scope: :context do
    with_aws "with a VPC and a public subnet" do

      before :all do
        chef_config[:log_level] = :warn
      end

      purge_all
      setup_public_vpc

      azs = []
      driver.ec2.availability_zones.each {|az| azs << az }
      az = azs[1].name
      aws_subnet 'test_subnet2' do
        vpc 'test_vpc'
        cidr_block '10.0.1.0/24'
        availability_zone az
        map_public_ip_on_launch true
      end

      it "machine_batch creates multiple machines", :super_slow do
        expect_recipe {
          machine_batch 'test_machines' do
            action :allocate
            (1..3).each do |i|
              machine "test_machine#{i}" do
                machine_options bootstrap_options: {
                  subnet_id: 'test_public_subnet',
                  key_name: 'test_key_pair'
                }, source_dest_check: false
              end
            end
          end
        }.to create_an_aws_instance('test_machine1',
          source_dest_check: false
        ).and create_an_aws_instance('test_machine2',
          source_dest_check: false
        ).and create_an_aws_instance('test_machine3',
          source_dest_check: false
        ).and be_idempotent
      end

      it "machine_batch supports runtime machine_options", :super_slow do
        expect_recipe {
          subnets = %w(test_public_subnet test_subnet2)

          machine_batch 'test_machines' do
            action :allocate
            (1..2).each do |i|
              machine "test_machine#{i}" do
                machine_options bootstrap_options: {
                  subnet_id: subnets[i-1],
                  key_name: 'test_key_pair'
                }, source_dest_check: (i == 1)
              end
            end
          end
        }.to create_an_aws_instance('test_machine1',
          subnet_id: test_public_subnet.aws_object.id,
          source_dest_check: true
        ).and create_an_aws_instance('test_machine2',
          subnet_id: test_subnet2.aws_object.id,
          source_dest_check: false
        ).and be_idempotent
      end
    end

  end
end

I can't say definitively that the problem isn't in my helper library code, but either way it sure was an unexpected result.

Edit: If its helpful am happy to make a PR w/ that spec change.

tyler-ball commented 9 years ago

Well, that is super weird. So the test succeeds but if you put that code block in a real recipe it fails? And yes, please submit a PR with the updated test. More testing = more better!