aws / aws-cdk

The AWS Cloud Development Kit is a framework for defining cloud infrastructure in code
https://aws.amazon.com/cdk
Apache License 2.0
11.34k stars 3.76k forks source link

aws-ecs: currently it is not possible to construct fully functional ECS Service from ARN or Attributes #30436

Open bbmarkus opened 4 weeks ago

bbmarkus commented 4 weeks ago

Describe the feature

Currently it is not possible to cleanly manage connectivity on the ECS service level for existing resources created outside of the stack (e.g. anything retrieved by ARN/Attributes in CDK).

Additionally, for example, for ICluster an equivalent connections property is manually exposed, without implementing IConnectable, is there any reason for this? Wouldn't consistent use of IConnectable be preferred?

Use Case

For us the use case is simple, we need permanent infrastructure with some ECS services (Fargate for us, but should not matter here) which is created as part of one stack, but also temporary infrastructure which needs to connect to the services of the primary permanent stack created in another stack.

We would prefer avoiding routing the traffic through any load balancers (internal or otherwise), and unfortunately service connect doesn't help either since some of our ECS tasks are scheduled and thus "serviceless" (although I am unsure if Service Connect would even side step the need of security groups here for us anyway, since we ruled it out early on, in favor of classic service discovery for the above limitation it has).

Proposed Solution

See title, BaseService appears ready to include this interface as is, if you wish for some reason to avoid dependency to the ec2 interface IBaseService should alternatively expose connections property directly, similar to ICluster.

Other Information

Upon further inspection it seems that what I would actually be looking for is an equivalent of fromLookup() for ECS resources such that we could construct a fully functional CDK construct from just an ARN or a more limited set of attributes.

Is there a particular reason why CDK seems so stingy about providing this kind of functionality? (this is hardly the first time I come across this, e.g. accessing legacy RDS instances created by hand is a pain since we have to manually provide all the information even though surely there would be a way to do it with API calls similar to how VPC lookup is done).

There seems to be a feature to add context provider plugins, but it is for some unknown reason marked as internal use only, even though it would be useful as a general availability feature. For example when for one reason or another only partial CDK usage is possible.

Acknowledgements

CDK version used

2.144.0 (Typescript)

Environment details (OS name and version, etc.)

Windows / WSL (Ubuntu)

pahud commented 3 weeks ago

Looking at your use cases, sounds like you are trying to connect services from one stack to another using service discovery or service connect? I haven't tried it before across two stacks but why do you think having IBaseService to expose connections property would be useful?

Also, can you share some very minimal code snippets to illustrate what you are trying to do between 2 stacks?

bbmarkus commented 3 weeks ago

@pahud Thanks for your reply, I will try to demonstrate the use case with some details and actual code. But what I essentially needed to do is:

  1. In one cluster with 2 or more stacks
  2. In each stack we have services and scheduled tasks (FargateService and ecspatterns.ScheduledFargateTask respectively)
  3. We need to ensure that certain services can connect to each other and all scheduled tasks can connect to their service dependencies (without going over public internet, as that is how we used to do it before migrating to ECS).
    • Additionally the services health checks need to call other services to ensure certain services are up before others are marked healthy (which form CDK point of view means I need precise control over deployment order, so service discovery cloudmap is up to date when these calls happen, since service connect is not usable from scheduled tasks since there is no service for those).
    • Why not internal LB? Answer is honestly costs since we can just spin up extra ECS task to handle internal traffic with service discovery (and health check pings are minimal).

So here is what I have done now (abridged, showing only the bit for services but something very similar is needed for scheduled tasks as well to ensure the services they depend on are discoverable and connectable):

      // services are added to the map by id when created for lookup later
      protected services = new Map<string, ecs.FargateService>();

      // props is separate configuration object to reduce CDK boilerplate, and allow re-use (hindisight: custom constructs would be better)
      protected forEachServiceDependency(props: ServiceProps, callback: (dep: ecs.FargateService) => void) {
        props.depends
          ?.map((svc) => this.services.get(svc))
          .filter(<T>(item?: T): item is T => item !== undefined)
          .forEach(callback);
      }

Then after service & scheduled task creation I need to do something like:

      // add dependencies between services to ensure correct deployment order for private DNS discovery and connectivity
      this.forEachServiceDependency(props, (dep) => {
        service.node.addDependency(dep);
        if (dep.taskDefinition.defaultContainer) {
          dep.connections.allowFrom(
            service, // for scheduled tasks need to pass in security group directly since ScheduledFargateTask is not Connectable
            ec2.Port.tcp(dep.taskDefinition.defaultContainer.containerPort),
          );
        }
      });

This works "fine" inside a single stack, but not across stacks (since sharing those internal service and task maps would get clunky real fast and even if it works it would at least generate ton of exports in CloudFormation) and for that use case I only need to update the security groups obviously since we can ensure the deployment order by the order of CDK deploy commands.

Originally I wanted to have the stacks setup like this, because we only need Stack 2 once a year for a short while

But after implementing the above workaround I was forced to move Scheduled tasks D and E to the same stack, to make things work and able to open container ports without passing over security group objects or ids themselves when creating the stack instances in the CDK app.

I opened this issue just because I wanted to use IFargateService instead of FargateService in my map because that feels more correct, but then realized that in order to avoid having to put everything with dependencies in their own stacks we would rather need a way to get rid of the need for those maps entirely. Such as have the ARN of the the dependencies configured in the second stack, but that will not give me a working connections property (which then leaves the security group ids, but having CDK still be able to create them automatically, as it does by default very often, would be preferred over having to manage them by hand).

Sorry for failing miserably at keeping this brief 🙂

rantoniuk commented 2 weeks ago

I have a somewhat similar scenario but simpler:

Even though ServiceConnect is set up fully via CDK and exposes the service in CloudMap, the security group is not allowing the traffic and to make it work I had to do:


    props.vpc.privateSubnets.forEach((subnet) => {
      serverService.connections.allowFrom(ec2.Peer.ipv4(subnet.ipv4CidrBlock), ec2.Port.tcp(1234));
    });