chef-boneyard / chef-provisioning-aws

AWS driver and resources for Chef that uses the AWS SDK
Apache License 2.0

Machine image behavior has changed - intentional? #131

Open christinedraper opened 9 years ago

christinedraper commented 9 years ago

I have an existing AMI called appserver_image (not in the chef provisioning data bag).

I run a recipe:

machine_image "appserver_image" do
  # stuff
end

It generates an instance and then fails with error

AWS::EC2::Errors::InvalidAMIName::Duplicate: machine_image[appserver_image] (@recipe_files::/home/christine/provisioning/awsgen/aws.rb line 20) had an error: AWS::EC2::Errors::InvalidAMIName::Duplicate: AMI name appserver_image is already in use by AMI ami-97ba9aa7
/opt/chefdk/embedded/lib/ruby/gems/2.1.0/gems/aws-sdk-v1-1.61.0/lib/aws/core/client.rb:375:in `return_or_raise'
/opt/chefdk/embedded/lib/ruby/gems/2.1.0/gems/aws-sdk-v1-1.61.0/lib/aws/core/client.rb:476:in `client_request'
(eval):3:in `create_image'
/opt/chefdk/embedded/lib/ruby/gems/2.1.0/gems/aws-sdk-v1-1.61.0/lib/aws/ec2/image_collection.rb:184:in `create'
/home/christine/.chefdk/gem/ruby/2.1.0/gems/chef-provisioning-aws-0.4.0/lib/chef/provisioning/aws_driver/driver.rb:329:in `block in allocate_image'
/opt/chefdk/embedded/apps/chef/lib/chef/mixin/why_run.rb:52:in `call'
/opt/chefdk/embedded/apps/chef/lib/chef/mixin/why_run.rb:52:in `add_action'
/opt/chefdk/embedded/apps/chef/lib/chef/provider.rb:180:in `converge_by'
/home/christine/.chefdk/gem/ruby/2.1.0/gems/chef-provisioning-0.19/lib/chef/provisioning/chef_provider_action_handler.rb:54:in `perform_action'
/home/christine/.chefdk/gem/ruby/2.1.0/gems/chef-provisioning-aws-0.4.0/lib/chef/provisioning/aws_driver/driver.rb:324:in `allocate_image'
/home/christine/.chefdk/gem/ruby/2.1.0/gems/chef-provisioning-0.19/lib/chef/provider/machine_image.rb:65:in `create_image'
/home/christine/.chefdk/gem/ruby/2.1.0/gems/chef-provisioning-0.19/lib/chef/provider/machine_image.rb:35:in `block in <class:MachineImage>'

I'm fairly sure it used to not try to generate the machine image if it already exists.

I can see there might be more than one valid use case...

1) "Use existing image or create a new one if it doesn't exist" (this is what I think it used to be)
2) "Update existing image or create new one if it doesn't exist"
3) "Only update/create image if it doesn't already exist - fail if there's an existing image" (this is what it seems to be now)

jkeiser commented 9 years ago

Nope, it should not be regenerating the image. Mind trying latest master? I'm not seeing that behavior anymore, but we've made a lot of changes (and bugfixes) in the branches I worked in, and that's all merged now.

The intent of the resource is "if any of the inputs change--the cookbooks or run list--regenerate the image, otherwise leave it alone."
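One way to picture that intent is as a fingerprint comparison. The following is only a minimal sketch, not the driver's actual logic: `image_fingerprint` and `regenerate_image?` are hypothetical helpers, and the assumption is that some record of the last build (for example in the machine_image data bag) stores the fingerprint of the inputs used.

```ruby
require "digest"
require "json"

# Hypothetical helper: fingerprint the inputs that should trigger a rebuild.
# Run list order is kept as-is because it affects convergence; cookbook
# versions are sorted by name so hash ordering does not change the digest.
def image_fingerprint(run_list, cookbook_versions)
  payload = {
    "run_list"  => run_list,
    "cookbooks" => cookbook_versions.sort.to_h
  }
  Digest::SHA256.hexdigest(JSON.generate(payload))
end

# Hypothetical helper: rebuild only when no fingerprint was recorded
# or the recorded one no longer matches the current inputs.
def regenerate_image?(recorded_fingerprint, run_list, cookbook_versions)
  recorded_fingerprint != image_fingerprint(run_list, cookbook_versions)
end
```

With this shape, an unchanged run list and cookbook set leaves the image alone, while a version bump or a missing record triggers a rebuild.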

christinedraper commented 9 years ago

I'm still seeing the behavior where it tries to regenerate the image. Let me know if there's any info I can get to help debug.

christinedraper commented 9 years ago

Here are some more details.

What I actually did was:

1) Run recipe to create machine_image with AMI Name "appserver_image"
2) Delete chef data (i.e. data bag machine_image/appserver_image.json)
3) Rerun recipe - get failure due to attempt to create AMI with existing name

If I instead set the resource name to be the AMI ID, it will pick up the existing AMI, although it will first create an instance, which is a waste of time - it should really check for the existing AMI before that.

So there is a way to pick up an existing image; it's just that I can't rerun the same recipe that originally created the image. Is it reasonable for it to check and match on AMI Name if there isn't an AMI ID match?

The other thing I encountered is that sometimes the AMI takes too long to become ready, and the recipe terminates without terminating the machine instance it created. So you get left with a running instance/node. Would it be reasonable for a subsequent recipe run to terminate it?
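As an alternative to cleaning up on a later run, the builder instance could be terminated in the same run even when the wait times out. This is a rough sketch of that idea only: `build_image` and all of its callable parameters are hypothetical stand-ins, not chef-provisioning-aws APIs.

```ruby
# Hypothetical sketch: poll until the image is ready, and always terminate
# the builder instance, whether the wait succeeds or gives up.
# Each keyword argument is a callable standing in for a real driver call.
def build_image(create_instance:, create_image:, image_ready:,
                terminate_instance:, max_attempts: 3, delay: 0)
  instance_id = create_instance.call
  begin
    image_id = create_image.call(instance_id)
    max_attempts.times do
      return image_id if image_ready.call(image_id)
      sleep(delay)
    end
    raise "image #{image_id} not ready after #{max_attempts} attempts"
  ensure
    # Runs on success, on timeout, and on any other error raised above.
    terminate_instance.call(instance_id)
  end
end
```

The `ensure` block is what prevents the leftover instance: a timeout still fails the run, but it no longer leaves a running node behind.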

tyler-ball commented 9 years ago

My guess is the bug we're experiencing is because we only check the data_bag and not the source of truth (AWS). Because AMI Names are unique in a region we should be able to check on that.
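A lookup along those lines might be sketched as follows. This is an illustration under assumptions, not the driver's code: `Image`, `find_existing_image`, and the plain hash and array arguments are hypothetical stand-ins for the data bag and for whatever the AWS SDK returns. Since AMI names are, as far as I know, unique per account within a region, a real lookup would presumably also filter by owner.

```ruby
# Stand-in for an AMI record; a real driver would get this from the AWS SDK.
Image = Struct.new(:id, :name)

# data_bag:   hash of resource name => AMI ID (stand-in for the data bag)
# aws_images: array of Image visible in the region (stand-in for an AWS query)
#
# Prefer an AMI ID match (recorded in the data bag, or given directly as the
# resource name), then fall back to matching on the AMI Name in AWS.
def find_existing_image(resource_name, data_bag, aws_images)
  recorded_id = data_bag[resource_name]
  by_id = aws_images.find do |img|
    img.id == recorded_id || img.id == resource_name
  end
  return by_id if by_id

  aws_images.find { |img| img.name == resource_name }
end
```

Run before any instance is created, a check like this would cover the case reported above: the data bag entry is gone, but the AMI named appserver_image still exists in AWS.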