aws / copilot-cli

The AWS Copilot CLI is a tool for developers to build, release and operate production ready containerized applications on AWS App Runner or Amazon ECS on AWS Fargate.
https://aws.github.io/copilot-cli/
Apache License 2.0
3.42k stars 398 forks

Environment stack did not complete successfully and exited with status ROLLBACK_FAILED #2719

Open Nikola-Milovic opened 2 years ago

Nikola-Milovic commented 2 years ago

After creating a new app and adding two services to it, I tried creating a test environment with copilot env init, and it failed with the following output:

- Creating the infrastructure for the testapp-test environment.                    [rollback failed]  [128.2s]
  The following resource(s) failed to create: [EnableLongARNFormatFuncti                              
  on, PublicSubnet1RouteTableAssociation, ServiceDiscoveryNamespace, Def                              
  aultPublicRoute, PublicSubnet2RouteTableAssociation, EnvironmentManage                              
  rRole]. Rollback requested by user.                                                                 
  The following resource(s) failed to delete: [EnableLongARNFormatFuncti                              
  on].                                                                                                
  - An IAM Role for AWS CloudFormation to manage resources                         [delete skipped]    [15.7s]
  - An ECS cluster to group your services                                          [delete complete]  [3.1s]
  - Enable long ARN formats for the authenticated AWS principal                    [not started]       
  - An IAM Role to describe resources in your environment                          [delete skipped]   [3.5s]
    Resource creation cancelled                                                                       
  - A security group to allow your containers to talk to each other                [delete complete]  [0.0s]
  - An Internet Gateway to connect to the public internet                          [delete complete]  [15.2s]
  - Private subnet 1 for resources with no internet access                         [delete complete]  [13.0s]
  - Private subnet 2 for resources with no internet access                         [delete complete]  [13.0s]
  - Public subnet 1 for resources that can access the internet                     [delete complete]  [16.9s]
  - Public subnet 2 for resources that can access the internet                     [delete complete]  [13.4s]
  - A Virtual Private Cloud to control networking of your AWS resources            [delete complete]  [15.6s]
✘ stack testapp-test did not complete successfully and exited with status ROLLBACK_FAILED

image

image

I may have made an app with the same name previously, but all of the stacksets and resources/ roles were deleted.

iamhopaul123 commented 2 years ago

Hello @Nikola-Milovic. It seems like the IAM role failed to be created. Could you double-check whether any IAM roles with the same name were created previously? For example, testapp-test-EnvManagerRole.

Nikola-Milovic commented 2 years ago

@iamhopaul123 I tried again with clean roles, and also with another app under a different name. My process is: init the app, then init 2 services, and then run copilot env init with test as the env name.

These are my policies from the user I used to create the env

image

Not sure what else I could provide you with. It was working for previous apps, and I am not sure what I have changed since then.

edit: Ignore the duplicated roles, created a group later on with the needed policies

iamhopaul123 commented 2 years ago

tried with another app with another name.

Does it still not work? Also, would you mind clicking into the event details to see the root cause of the creation failure? For example:

Screen Shot 2021-08-09 at 12 27 42 PM

Nikola-Milovic commented 2 years ago

@iamhopaul123 So this is the stackset for testapp2-test environment (all others failed the same)

image

And I believe this is the root reason

image

iamhopaul123 commented 2 years ago

Yeah, but this seems to be the reason why the env stack rollback failed. Could you give us the reason why copilot env init failed to create the environment stack in the first place? That is, if you do a clean start (create the application and environment), what is the first error event that CloudFormation (CFN) reports?
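If scrolling through the console is awkward, the first error event can also be pulled out of the stack events programmatically. The following is only a rough sketch, assuming boto3 is installed and the profile is allowed to call cloudformation:DescribeStackEvents; the helper names first_failed_event and fetch_stack_events are made up for illustration and are not part of Copilot. It skips the "Resource creation cancelled" entries, since those are usually a side effect of an earlier failure rather than the cause.

```python
from typing import Iterable, Optional

# Reason text CloudFormation attaches to resources that were cancelled
# because a sibling resource had already failed (seen in the log above).
CANCELLED_REASON = "Resource creation cancelled"


def first_failed_event(events: Iterable[dict]) -> Optional[dict]:
    """Return the earliest *_FAILED event that is not a mere cancellation.

    `events` is assumed to have the shape of the "StackEvents" list returned
    by CloudFormation's DescribeStackEvents API. Order does not matter
    because the events are compared by their Timestamp.
    """
    failed = [
        e for e in events
        if e.get("ResourceStatus", "").endswith("_FAILED")
        and CANCELLED_REASON not in e.get("ResourceStatusReason", "")
    ]
    if not failed:
        return None
    return min(failed, key=lambda e: e["Timestamp"])


def fetch_stack_events(stack_name: str, region: str) -> list:
    """Hypothetical helper: download all events for a stack with boto3."""
    import boto3  # imported lazily so the pure function above has no dependency

    client = boto3.client("cloudformation", region_name=region)
    paginator = client.get_paginator("describe_stack_events")
    events = []
    for page in paginator.paginate(StackName=stack_name):
        events.extend(page["StackEvents"])
    return events
```

Usage would be along the lines of first_failed_event(fetch_stack_events("gittest-prod", "eu-central-1")), then reading the ResourceStatusReason field of the returned event. The stack name here is taken from the log output in this thread.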

Nikola-Milovic commented 2 years ago

@iamhopaul123

copilot env init
What is your environment's name? prod
Which credentials would you like to use to create prod? [profile kezual]
Would you like to use the default configuration for a new environment?
    - A new VPC with 2 AZs, 2 public subnets and 2 private subnets
    - A new ECS Cluster
    - New IAM Roles to manage services and jobs in your environment
 Yes, use default.
✔ Linked account 737543720601 and region eu-central-1 to application gittest.. 

✔ Proposing infrastructure changes for the gittest-prod environment. 
- Creating the infrastructure for the gittest-prod environment.                    [rollback in progress]  [58.8s]
  The following resource(s) failed to create: [EnableLongARNFormatFuncti                                   
  on, PublicSubnet1RouteTableAssociation, ServiceDiscoveryNamespace, Def                                   
  aultPublicRoute, PublicSubnet2RouteTableAssociation, EnvironmentManage                                   
  rRole]. Rollback requested by user.                                                                      
  - An IAM Role for AWS CloudFormation to manage resources                         [delete skipped]         [18.0s]
  - An ECS cluster to group your services                                          [delete complete]       [0.0s]
  - Enable long ARN formats for the authenticated AWS principal                    [not started]            
  - An IAM Role to describe resources in your environment                          [delete skipped]        [3.1s]
    Resource creation cancelled                                                                            
  - A security group to allow your containers to talk to each other                [delete complete]       [0.0s]
  - An Internet Gateway to connect to the public internet                          [create complete]       [16.0s]
  - Private subnet 1 for resources with no internet access                         [delete in progress]     [2.3s]
  - Private subnet 2 for resources with no internet access                         [delete in progress]     [2.3s]
  - Public subnet 1 for resources that can access the internet                     [create complete]       [17.3s]
  - Public subnet 2 for resources that can access the internet                     [create complete]       [17.3s]
  - A Virtual Private Cloud to control networking of your AWS resources            [create complete]       [16.9s]
^C

And in the console this is the first failed event

image

Clean install, new names for everything.

iamhopaul123 commented 2 years ago

Hello @Nikola-Milovic. It looks like the env profile you are using, kezual, is not authorized to perform lambda:CreateFunction, and based on the previous info it is not allowed to perform lambda:DeleteFunction either. Since you mentioned it worked before, I wonder if you used a different profile to create the environment, or if the IAM user was modified?
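One way to confirm this before retrying is to ask the IAM policy simulator whether the principal behind the profile may perform those two actions. This is a minimal sketch, assuming boto3 is installed and the caller is allowed to use iam:SimulatePrincipalPolicy; the helper names denied_actions and simulate are hypothetical, and the user ARN has to be supplied by you. The simulator only evaluates the policies it is given, so treat the result as a hint rather than a guarantee.

```python
from typing import Iterable, List

# The two actions that were rejected in this thread.
REQUIRED_ACTIONS = ["lambda:CreateFunction", "lambda:DeleteFunction"]


def denied_actions(evaluation_results: Iterable[dict]) -> List[str]:
    """Return the action names whose simulated decision is not "allowed".

    `evaluation_results` is assumed to have the shape of the
    "EvaluationResults" list returned by IAM's SimulatePrincipalPolicy API,
    where EvalDecision is "allowed", "explicitDeny" or "implicitDeny".
    """
    return [
        r["EvalActionName"]
        for r in evaluation_results
        if r.get("EvalDecision") != "allowed"
    ]


def simulate(principal_arn: str, actions: Iterable[str] = REQUIRED_ACTIONS) -> list:
    """Hypothetical helper: run the policy simulator for an IAM user or role."""
    import boto3  # imported lazily so the pure function above has no dependency

    iam = boto3.client("iam")
    response = iam.simulate_principal_policy(
        PolicySourceArn=principal_arn,
        ActionNames=list(actions),
    )
    return response["EvaluationResults"]
```

Calling denied_actions(simulate(user_arn)) with the ARN of the IAM user behind the profile should return an empty list if both Lambda actions are permitted.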

Nikola-Milovic commented 2 years ago

@iamhopaul123 I probably used default by mistake... On this note, how do you usually prepare a user to use copilot? Do you manually add each policy or is there an easier way to not miss anything?

iamhopaul123 commented 2 years ago

@Nikola-Milovic We usually recommend setting up your application with admin credentials at first, as it's the path of least resistance. Once you have the environment and a service up and running, Copilot assumes an IAM role created in the environment stack (aka the env manager role) to perform operations.