stelligent / mu

A full-stack DevOps on AWS framework
https://getmu.io
MIT License

Deploy on Acceptance stage failing #416

Open williamn opened 5 years ago

williamn commented 5 years ago

I tried to follow the Quickstart tutorial, but the deployment fails at the Acceptance stage. Here is the output from mu svc show:

Pipeline URL:   https://console.aws.amazon.com/codesuite/codepipeline/pipelines/mu-56-qs/view?region=ap-southeast-1
+------------+----------+------------------------------------------+--------------------------------------------+---------------------+
|   STAGE    |  ACTION  |                 REVISION                 |                   STATUS                   |     LAST UPDATE     |
+------------+----------+------------------------------------------+--------------------------------------------+---------------------+
| Source     | Source   | d7141868f0293916a8f1bb19422a192bdf52a7cc | Succeeded                                  | 2018-12-21 14:17:38 |
| Build      | Artifact |                                        - | Succeeded                                  | 2018-12-21 14:18:11 |
| Build      | Image    |                                        - | Succeeded                                  | 2018-12-21 14:19:16 |
| Acceptance | Deploy   |                                        - | Failed Build terminated with state: FAILED | 2018-12-21 14:26:01 |
| Acceptance | Test     |                                        - | -                                          |                   - |
| Production | Approve  |                                        - | -                                          |                   - |
| Production | Deploy   |                                        - | -                                          |                   - |
| Production | Test     |                                        - | -                                          |                   - |
+------------+----------+------------------------------------------+--------------------------------------------+---------------------+

Deployments:
+-------------+----------+------------------+---------------------+
| ENVIRONMENT | REVISION |      STATUS      |     LAST UPDATE     |
+-------------+----------+------------------+---------------------+
| production  | d714186  | CREATE_COMPLETE  | 2018-12-21 14:13:55 |
| acceptance  | d714186  | CREATE_COMPLETE  | 2018-12-21 14:13:55 |
+-------------+----------+------------------+---------------------+

I took a look at the CloudWatch logs. Here is what I found:

logEventStatus ▶ ERROR mu-56-environment-acceptance: ContainerInstances (AWS::AutoScaling::LaunchConfiguration) CREATE_FAILED API: autoscaling:CreateLaunchConfiguration You are not authorized to call EC2 Describe operations. It is required to perform CreateLaunchConfiguration operation.

logEventStatus ▶ ERROR mu-56-environment-acceptance: Host2HostRuleEgress (AWS::EC2::SecurityGroupEgress) CREATE_FAILED Resource creation cancelled

logEventStatus ▶ ERROR mu-56-environment-acceptance: ContainerInstances (AWS::AutoScaling::LaunchConfiguration) CREATE_FAILED API: autoscaling:CreateLaunchConfiguration You are not authorized to call EC2 Describe operations. It is required to perform CreateLaunchConfiguration operation.

logEventStatus ▶ ERROR mu-56-environment-acceptance: Host2HostRuleEgress (AWS::EC2::SecurityGroupEgress) CREATE_FAILED Resource creation cancelled

I ran mu svc show as an IAM user with AdministratorAccess, so I would not expect my own permissions to be the issue here.

waynerobinson commented 5 years ago

I'm also getting this on deploy.

Not sure if anything has changed here, but it seems that autoscaling:CreateLaunchConfiguration needs some extra EC2 permissions that aren't present in the deploy-cluster policy.

waynerobinson commented 5 years ago

I also can't work out a way to override that policy. The custom CloudFormation templates append the extra policies to the existing array, and I can't just modify the CloudFormation template directly because the mu env upsert command overwrites the mu-iam-common template each time it's run.

waynerobinson commented 5 years ago

By the way, in case it's relevant, I'm using version 1.5.11-develop (because I was hitting some other errors with the current stable version).

waynerobinson commented 5 years ago

OK, a couple of updates on this issue.

I have managed to test out some alternative permissions by changing the role manually after mu-iam-common was created (as CloudFormation doesn't automatically check for drift).

I needed to add both ec2:* and iam:CreateServiceLinkedRole to the deploy-cluster policy, and the stack now seems to create correctly. I'm not sure exactly which extra EC2 permissions are required, as I just went a bit nuclear on it, but it may just be the "Describe" operations listed in the error above.

waynerobinson commented 5 years ago

After changing ec2:* to ec2:Describe*, the deploy step still works. So the permissions can probably be limited to just ec2:Describe* and iam:CreateServiceLinkedRole.
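For anyone who wants to try the narrower permissions, here is a rough sketch of what the extra policy statement could look like, built as a Python dict and printed as JSON. The two actions are the ones named above. The function name and the "Resource": "*" scope are my own assumptions, not something taken from mu's templates, so adjust them to your setup.

```python
import json


def build_deploy_policy():
    """Return an IAM policy document with the two extra permissions from this thread.

    Assumption: "Resource": "*" is used for both actions. You may want to
    scope iam:CreateServiceLinkedRole more tightly in a real account.
    """
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": [
                    "ec2:Describe*",
                    "iam:CreateServiceLinkedRole",
                ],
                "Resource": "*",
            }
        ],
    }


if __name__ == "__main__":
    # Print the document so it can be pasted into the IAM console.
    print(json.dumps(build_deploy_policy(), indent=2))
```

The printed JSON can be pasted into the IAM console as an inline policy on whichever role the deploy step is using.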

chris-d-edwards commented 5 years ago

Hi, I had the same issue, but using mu 1.5.10. I changed the ARN / Resource ID opt-in setting for Fargate/ECS, as highlighted in issues/414, and this appears to work without the above IAM changes.

williamn commented 5 years ago

@chris-d-edwards I tried opting in to the new ARN / Resource ID format but am still getting the same error. I am using mu version 1.5.10.

chris-d-edwards commented 5 years ago

Hi @williamn, you need to opt out, so the boxes should be unchecked. Apologies, my comment wasn't clear. If you read through issue 414, it states that the new format causes the problem. Did you do this for the root account (as I did) or for your specific user, which should be the same IAM user you're using on the command line?

maxieduncan commented 5 years ago

I'm facing the same issue. Changing the ARN / Resource ID format had no impact (I was already opted out and opting in made no difference).

I've tried adding the permissions mentioned above to the role that is being assumed when the error is reported, but haven't had any luck.

maxieduncan commented 5 years ago

Actually, I was configuring the additional permissions in the wrong place. Once I added them to the mu-cloudformation-common-us-east-1 role, the deployment worked as expected.
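In case it helps, a minimal sketch of applying that workaround from a script rather than the console. It assumes a boto3-style IAM client, whose put_role_policy call takes RoleName, PolicyName and PolicyDocument. The helper name, the policy name and the DryRunClient class are hypothetical and only here for illustration. Note that, as mentioned earlier in the thread, mu env upsert rewrites the mu-iam-common template, so a manual change like this may not survive the next upsert.

```python
import json


def put_extra_permissions(iam_client, role_name, policy_name="mu-extra-ec2-describe"):
    """Attach the extra permissions discussed in this thread as an inline policy.

    iam_client is expected to behave like boto3.client("iam").
    policy_name is a made-up name; pick whatever suits your account.
    """
    policy = {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": ["ec2:Describe*", "iam:CreateServiceLinkedRole"],
                "Resource": "*",  # assumption: not scoped any further
            }
        ],
    }
    iam_client.put_role_policy(
        RoleName=role_name,
        PolicyName=policy_name,
        PolicyDocument=json.dumps(policy),
    )
    return policy


class DryRunClient:
    """Hypothetical stand-in that records calls instead of talking to AWS."""

    def __init__(self):
        self.calls = []

    def put_role_policy(self, **kwargs):
        self.calls.append(kwargs)


if __name__ == "__main__":
    client = DryRunClient()
    put_extra_permissions(client, "mu-cloudformation-common-us-east-1")
    print(json.dumps(client.calls, indent=2))
```

To run it for real, pass boto3.client("iam") instead of DryRunClient(), and use the role name for your own region.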

drummerjoe commented 5 years ago

Just tried the quickstart and had this same issue. Shouldn't the quickstart work out of the box?