aws / aws-cdk

The AWS Cloud Development Kit is a framework for defining cloud infrastructure in code
https://aws.amazon.com/cdk
Apache License 2.0
11.37k stars 3.78k forks source link

cdk should create service linked roles #4108

Open robertd opened 4 years ago

robertd commented 4 years ago

:bug: Bug Report

What is the problem?

I’m having an issue with setting up AWS::ApplicationAutoScaling::ScalableTarget for fargate service. Apparently cdk doesn’t create service-linked roles. I found similar issue (diff service) but it was closed #3734. Are we expected to pre-create these roles? I think the expected behavior should be for cdk to create the service-linked roles.

Reproduction Steps

    const fargateService = new ecs.FargateService(this, "fargateService", {
      cluster,
      taskDefinition: taskDef,
      desiredCount: 1,
      propagateTaskTagsFrom: ecs.PropagatedTagSource.SERVICE
    });

    const scaling = tnmaccessService.autoScaleTaskCount({
      minCapacity: 0,
      maxCapacity: 1
    });

    scaling.scaleOnSchedule("scalingUp", {
      schedule: Schedule.expression("cron(0 15 ? * MON-FRI *)"), // MON-FRI @ 9am MT
      minCapacity: 1,
      maxCapacity: 1
    });

    scaling.scaleOnSchedule("scalingDown", {
      schedule: Schedule.expression("cron(0 22 ? * MON-FRI *)"), // MON-FRI @ 4pm MT
      minCapacity: 0,
      maxCapacity: 0
    });

Verbose Log

3/4 | 11:46:15 AM | CREATE_FAILED        | AWS::ApplicationAutoScaling::ScalableTarget | fargateService/TaskCount/Target (fargateServiceTaskCountTarget655BF465) 
Unable to assume IAM role: arn:aws:iam::1234567890:role/aws-service-role/ecs.application-autoscaling.amazonaws.com/AWSServiceRoleForApplicationAutoScaling_ECSService 
(Service: AWSApplicationAutoScaling; Status Code: 400; Error Code: ValidationException; Request ID: e0fa3fe9-d8a9-11e9-9457-e58085b5712e)

Environment

Other information

https://docs.aws.amazon.com/autoscaling/application/userguide/application-auto-scaling-service-linked-roles.html

rix0rrr commented 4 years ago

Hmm, interesting. You're right that CDK does not create service linked roles. The problem is that we cannot do that reliably. We can't automatically add AWS::IAM::ServiceLinkedRole to a stack, because a second attempt to create it will fail (and so every stack after the first one won't be able to deploy anymore).

We therefore rely on services to automatically create the service linked role when it's necessary.

I was under the impression that Application AutoScaling did do that if you preconstructed the role ARN correctly. Did they change that behavior? Or are we generating the ARN incorrectly?

robertd commented 4 years ago

I was under the impression that Application AutoScaling did do that if you preconstructed the role ARN correctly. Did they change that behavior? Or are we generating the ARN incorrectly?

RoleARN:
        Fn::Join:
          - ""
          - - "arn:"
            - Ref: AWS::Partition
            - ":iam::"
            - Ref: AWS::AccountId
            - :role/aws-service-role/ecs.application-autoscaling.amazonaws.com/AWSServiceRoleForApplicationAutoScaling_ECSService

I believe ARNs are created correctly. Is this documented anywhere in the docs that users needs to create their service linked roles manually? If this is the case, what (and where) would be the best place to document these?

There are multiple roles out there that need to be pre-created if that's the case. https://docs.aws.amazon.com/autoscaling/application/userguide/application-auto-scaling-service-linked-roles.html

Is it possible to create AWS::IAM::ServiceLinkedRole through cdk using higher level constructs as of today? From what I can tell only CfnServiceLinkedRole can do this at the moment. Possibly a good candidate for feature request / PR?

The problem is that we cannot do that reliably. We can't automatically add AWS::IAM::ServiceLinkedRole to a stack, because a second attempt to create it will fail (and so every stack after the first one won't be able to deploy anymore).

Would auto-creating a role with suffix fix the issue ? i.e: AWSServiceRoleForApplicationAutoScaling_ECSServiceABCD12345

Is role repetition (per stack) the issue here? In other words... same role (with different ARNs) being repeated over and over, depending on how many cdk projects that are utilizing ECS autoscaling are being used by users of the aws account?

rix0rrr commented 4 years ago

I believe ARNs are created correctly. Is this documented anywhere in the docs that users needs to create their service linked roles manually? If this is the case, what (and where) would be the best place to document these?

No, but I was under the impression (based on my testing) that AppScaling did create their role automatically. Maybe I was wrong, in which case we should document it. The README of the appscaling package seems like a good place to put it.

Is it possible to create AWS::IAM::ServiceLinkedRole through cdk using higher level constructs as of today? From what I can tell only CfnServiceLinkedRole can do this at the moment. Possibly a good candidate for feature request / PR?

Maybe, but we'd first have to confirm that it makes sense to create them in CloudFormation stacks first.

https://docs.aws.amazon.com/AWSCloudFormation/latest/UserGuide/aws-resource-iam-servicelinkedrole.html

I guess what we could be doing is setting CustomSuffix to the stack name. However, the docs contain this rather ominous line:

Some services do not support the CustomSuffix parameter. If you provide an optional suffix and the operation fails, try the operation again without the suffix.

Also don't forget that this would eat into our 200 resources per stack limit.

I'd rather keep this out of the generated CDK stacks, if at all possible. If anything, I'm thinking this could/should be a cdk bootstrap feature.

robertd commented 4 years ago

Also don't forget that this would eat into our 200 resources per stack limit.

I'd rather keep this out of the generated CDK stacks, if at all possible. If anything, I'm thinking this could/should be a cdk bootstrap feature.

Oh yeah... I forgot about that limit 🤦‍♂️. This becoming a part of cdk bootstrap feature sounds like a good approach to me.

holmesjr commented 4 years ago

I can confirm this is an issue even if CDK creates a simple Fargate service with no autoscaling etc. If ECS has never run in the account before by the wizard or via the CLI, a CF deployment will not create the service-linked role. I had to manually create and then delete a fargate cluster in the console to create and link the role - deployment worked after that.

pgollucci commented 4 years ago
for svc in $(aws help | awk '/o /{ print $2 }' | tail -225 | xargs); do
 aws iam create-service-linked-role --aws-service-name $svc.amazonaws.com
done

aws iam list-roles --output text --query 'Roles[].[RoleId,RoleName,Arn]' | grep AWSServiceRoleFor

There are 58 for my account. (When an account was created likely matters)

I'm in favor doing this in the bootstrap I just use the simple for look and run it right after bootstrap from ci/cd. Yeah I ignore the errors but it doesn't seem to matter.

rix0rrr commented 3 years ago

Come to think of it, cdk bootstrap is not the right place for this either.

Best solution I can think of involves a Custom Resource

polothy commented 2 years ago

Another place to address:

https://github.com/aws/aws-cdk/blob/b60367526f0b962ed0466237a6e9db2b0931a50f/packages/%40aws-cdk/aws-batch/lib/compute-environment.ts#L387-L393

Our understanding is if you do not specify a service role then it'll use the service-linked role, but the CDK construct prevents you from doing that because it makes a role for you if you want to leave it empty (we are curretly trying to trick it).

Edit: link to CFN docs https://docs.aws.amazon.com/AWSCloudFormation/latest/UserGuide/aws-resource-batch-computeenvironment.html#aws-resource-batch-computeenvironment-properties

peterwoodworth commented 2 years ago

I can't find any issues specifically related to this in the cloudformation roadmap

if there's anything we need from CloudFormation - what would it be? Even if it doesn't work for all resources, did we try to see if the customSuffix field works for application autoscaling?

madeline-k commented 1 year ago

I was under the impression that the service linked roles would be created automatically by the service, the first time you deploy. We should try to repro this, and then decide on a path forward.

tmokmss commented 1 year ago

Jfyi I've published a construct to manage service-linked roles in a better way: https://constructs.dev/packages/upsert-slr/

Also I found that in some cases a dependent service-linked role is implicitly created, for example in AppSync (ref). The handling of SLRs doesn't seem very consistent across AWS services even on CloudFormation side.

cartermckinnon commented 1 year ago

@madeline-k Services do create their SLR, but services differ in what API call(s) trigger it, how quickly it happens; plus some variable propagation delay from IAM. IMO, CDK should offer some way of managing these to remove the ambiguity.

gscpw commented 8 months ago

This same issue applies to OpenSearchService as well.