hashicorp / terraform-provider-aws

The AWS Provider enables Terraform to manage AWS resources.
https://registry.terraform.io/providers/hashicorp/aws
Mozilla Public License 2.0
[Bug]: aws_cloudformation_stack_set - operation_preferences argument ignored #33170

Open michalz-rely opened 1 year ago

michalz-rely commented 1 year ago

Terraform Core Version

1.5.2

AWS Provider Version

5.0.0

Affected Resource(s)

aws_cloudformation_stack_set

Expected Behavior

The operation_preferences argument should be removed from aws_cloudformation_stack_set.

Actual Behavior

operation_preferences defined in aws_cloudformation_stack_set are ignored, which leads to confusion.

Relevant Error/Panic Output Snippet

No response

Terraform Configuration Files

no error reported

Steps to Reproduce

First

Next

Debug Output

No response

Panic Output

No response

Important Factoids

No response

References

https://awscli.amazonaws.com/v2/documentation/api/latest/reference/cloudformation/create-stack-set.html https://awscli.amazonaws.com/v2/documentation/api/latest/reference/cloudformation/create-stack-instances.html

Would you like to implement a fix?

None

justinretzolk commented 1 year ago

Hey @michalz-rely 👋 Are you able to confirm via the AWS Console, API, or CLI that the setting is not being applied? Can you also supply a sample Terraform configuration that can be used to reproduce the issue, as well as debug logs (redacted as needed)?

michalz-rely commented 1 year ago

Hi @justinretzolk, I can do more extensive testing, but here is my recent work (also tested on 5.15.0, with the same behaviour).

As you can see, on the Destroy action of aws_cloudformation_stack_set_instance, the operation preferences were ignored. I'm observing the termination process at the moment, and it's removing the resources one account at a time.

andrejskuidins commented 1 year ago

Almost identical to: https://github.com/hashicorp/terraform-provider-aws/issues/30806

michalz-rely commented 1 year ago

@andrejskuidins I don't think it's related; https://github.com/hashicorp/terraform-provider-aws/issues/30806 is caused by the lack of multi-region deployment support (https://github.com/hashicorp/terraform-provider-aws/issues/24752).

michalz-rely commented 11 months ago

@justinretzolk is the information I provided enough evidence, or do you need more examples?

uakramm commented 11 months ago

Facing the same issue on apply as well. The operation_preferences argument gets ignored.

evantlueck commented 10 months ago

Terraform v1.5.2, hashicorp/aws v5.9.0

So there are two places where operation_preferences can be defined when it comes to stack set infrastructure: on the stack_set resource (https://registry.terraform.io/providers/hashicorp/aws/latest/docs/resources/cloudformation_stack_set#operation_preferences-argument-reference) and on the stack_set_instance resource (https://registry.terraform.io/providers/hashicorp/aws/latest/docs/resources/cloudformation_stack_set_instance#operation_preferences-argument-reference).

I found that the best thing to do is to specify the same values for both entities (stack_set and stack_set_instance); otherwise one will override the other, which results in the issues you are seeing.

Additionally, with regard to operation_preferences, it's important to give failure_tolerance the same value (count or percentage) as max_concurrent. From testing manually in the AWS console and via Terraform, I found that failure_tolerance acts as a limiter that functionally overrides whatever value is passed to max_concurrent. That doesn't make much logical sense, and it isn't clear from the AWS documentation either: https://docs.aws.amazon.com/AWSCloudFormation/latest/UserGuide/stacksets-concepts.html#stackset-ops-options

I don't recommend using 100 for both failure_tolerance_percentage and max_concurrent_percentage, though. Even though that results in the fastest deployment, your stack set will always report success regardless of the actual outcome. I found 50/50 or 49/49 to be a safe strategy for speedy deployments and updates of infrastructure that still returns a failure response to Terraform if the deployment fails in all accounts. If you need Terraform to report a failure when the deployment fails in only one account, though, you're stuck with sequential runtimes, unfortunately. Deletion, however, always seems to be sequential from Terraform.
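For illustration, here is a minimal sketch of the approach described above: the same percentage-based operation_preferences on both resources, kept in sync through a single local value. The resource names, the placeholder OU ID, the SNS topic template, and the service-managed permission model are all assumptions made for the example; they are not taken from anyone's actual configuration in this thread.

```hcl
# Sketch only: names, OU ID, and template are hypothetical placeholders.
locals {
  # Single source of truth so the two resources cannot drift apart.
  operation_preferences = {
    failure_tolerance_percentage = 50
    max_concurrent_percentage    = 50
  }
}

resource "aws_cloudformation_stack_set" "example" {
  name             = "example"
  permission_model = "SERVICE_MANAGED"

  auto_deployment {
    enabled                          = true
    retain_stacks_on_account_removal = false
  }

  operation_preferences {
    failure_tolerance_percentage = local.operation_preferences.failure_tolerance_percentage
    max_concurrent_percentage    = local.operation_preferences.max_concurrent_percentage
  }

  # Trivial template used purely as a stand-in.
  template_body = jsonencode({
    Resources = {
      Topic = { Type = "AWS::SNS::Topic" }
    }
  })
}

resource "aws_cloudformation_stack_set_instance" "example" {
  stack_set_name = aws_cloudformation_stack_set.example.name

  deployment_targets {
    organizational_unit_ids = ["ou-xxxx-xxxxxxxx"] # placeholder OU ID
  }

  # Same values as on the stack set, per the advice above.
  operation_preferences {
    failure_tolerance_percentage = local.operation_preferences.failure_tolerance_percentage
    max_concurrent_percentage    = local.operation_preferences.max_concurrent_percentage
  }
}
```

Whether 50/50 is appropriate depends on how many failed accounts you are willing to tolerate before Terraform sees an error; treat the numbers as a starting point rather than a recommendation.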

Also, I found that having multiple stack_set_instance resources associated with a single stack_set resource (on a per-parent-OU basis, as in my example here: https://github.com/hashicorp/terraform-provider-aws/issues/33785#issuecomment-1780233282, which I used to fix a number of other issues I found) will slow down the initial deployment of the stack set but won't impact template updates after the initial deployment. Again, this assumes that both the stack_set and stack_set_instance resources have operation_preferences defined.

michalz-rely commented 8 months ago

@evantlueck thanks for your message, but I think this needs to be sorted. I haven't tested your solution, but in general there's no need to set these parameters when defining a stack set; the stack set instance takes the parameters. See the CLI documentation:

theipster commented 1 month ago

Just did some investigating, and I believe the underlying issue (if you can call it that; see my theory below) is that CloudFormation's CreateStackSet API operation simply doesn't support the OperationPreferences parameter.

Interestingly, the UpdateStackSet operation does support the OperationPreferences parameter, and this is indeed already implemented within the Terraform provider (see stack_set.go#L422-L424). 🤔

(Side note: I also noticed that the new parameter operation_preferences.concurrency_mode was missed off the aws_cloudformation_stack_set resource during the implementation from https://github.com/hashicorp/terraform-provider-aws/pull/38498.)

So, a potential naive code fix might be to call resourceStackSetUpdate() immediately at the end of resourceStackSetCreate(). However, I'm guessing that the design rationale for not supporting OperationPreferences on the CreateStackSet operation is intentional, because when creating an empty stack set without any targets yet, there is no operation to execute.

For comparison, you can see that the CreateStackInstances, UpdateStackInstances and DeleteStackInstances API operations all support the OperationPreferences parameter. Furthermore, they are already implemented for resourceStackSetInstanceCreate() and resourceStackSetInstanceUpdate(), although not yet implemented for resourceStackSetInstanceDelete() (I'll create a separate bug report later).

This is why the aws_cloudformation_stack_set_instance.operation_preferences works for you, and to be honest it's probably the more semantically correct solution when creating new targets anyway.
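As a rough sketch of that instance-level approach, the preferences can be declared only on aws_cloudformation_stack_set_instance, where they map onto the CreateStackInstances and UpdateStackInstances calls. The stack set name, the OU ID, and the numeric values are hypothetical, and per the note above the settings may not yet be honored on destroy.

```hcl
# Sketch only: assumes a stack set named "example" already exists.
resource "aws_cloudformation_stack_set_instance" "example" {
  stack_set_name = "example" # hypothetical stack set name

  deployment_targets {
    organizational_unit_ids = ["ou-xxxx-xxxxxxxx"] # placeholder OU ID
  }

  # Applied to the operation that creates or updates the stack instances.
  operation_preferences {
    failure_tolerance_count = 5
    max_concurrent_count    = 5
    region_concurrency_type = "PARALLEL"
  }
}
```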
