pulumi / docs

All things Pulumi docs!
https://pulumi.com
Apache License 2.0

Document needs a working example for ECS Classic #11744

Open mikemaccana opened 3 years ago

mikemaccana commented 3 years ago

File: themes/default/content/docs/guides/crosswalk/aws/ecs.md

Hi there,

I'm looking at a solution to manage a machine learning ECS cluster, potentially looking at being a Pulumi customer if I can get Pulumi to work. As Fargate doesn't support GPU instances, we have to use ECS Classic.

I cannot get this example working for ECS Classic. The guide is really difficult for ECS Classic users: nearly every item that's about 'ECS' seems to call Fargate. I know that:

If you want to schedule non-Fargate Tasks and Services, you will need to create a cluster explicitly, since you will need to define an Auto Scaling Group that controls the EC2 instances powering it.

And I've attempted to follow the guide with:


// Create an ECS cluster.
const ecsCluster = new awsx.ecs.Cluster(config.ecs.name, {
  vpc,
  tags: {
    Name: config.ecs.name,
  },
});

// Required to use EC2Service - see https://www.pulumi.com/docs/guides/crosswalk/aws/ecs/
// Fails with
//   * error waiting for CloudFormation Stack creation: failed to create CloudFormation stack, rollback requested (ROLLBACK_COMPLETE): ["The following resource(s) failed to create: [Instances]. Rollback requested by user." "Received 0 SUCCESS signal(s) out of 5.  Unable to satisfy 100% MinSuccessfulInstancesPercent requirement"]

// https://www.pulumi.com/docs/guides/crosswalk/aws/ecs/#creating-an-auto-scaling-group-for-ecs-cluster-instances
const autoScaleGroup = ecsCluster.createAutoScalingGroup(
  config.ecs.autoScalingGroupName,
  {
    vpc,
    templateParameters: { minSize: 5 },
    launchConfigurationArgs: {
      instanceType: config.ecs.instanceType as aws.ec2.InstanceType,
    },
  }
);

// Define the Networking for our service
// https://www.pulumi.com/docs/reference/pkg/aws/lb/loadbalancer/
const applicationLoadBalancer = new awsx.lb.ApplicationLoadBalancer(
  "hl-pulumi-lb",
  {
    vpc,
    external: true,
    securityGroups: ecsCluster.securityGroups,
    subnets: vpc.privateSubnetIds,
  }
);

// Listen on port 80
// TODO: add HTTPS cert to ALB
const http = applicationLoadBalancer.createListener("http", {
  port: PORTS["http"],
  external: true,
});

// https://www.pulumi.com/docs/reference/pkg/nodejs/pulumi/awsx/ecs/#EC2Service

// This is ECS 'Classic' (EC2 launch type) - see https://www.pulumi.com/docs/guides/crosswalk/aws/ecs/
const ec2service = new awsx.ecs.EC2Service(config.ecs.ec2ServiceName, {
  // https://www.pulumi.com/docs/reference/pkg/nodejs/pulumi/awsx/ecs/#EC2ServiceArgs
  cluster: ecsCluster,
  desiredCount: config.ecs.instanceCount,
  taskDefinitionArgs: {
    // https://www.pulumi.com/docs/reference/pkg/nodejs/pulumi/awsx/ecs/#EC2TaskDefinitionArgs
    containers: {
      humanloop: {
        // https://www.pulumi.com/docs/reference/pkg/nodejs/pulumi/awsx/ecs/#Container
        memory: 128,
        portMappings: [http],
        image: "nginx", // Try generic nginx image //config.ecs.image,
      },
    },
  },
});

but it would be much easier if there was a working ECS Classic example in a GitHub repository.

leezen commented 3 years ago

While this is a Python example, you may want to take a look at https://github.com/pulumi/examples/tree/master/aws-py-ecs-instances-autoapi which uses EC2 instances as the underlying capacity provider instead of Fargate.

mikemaccana commented 3 years ago

Thanks for coming back @leezen! I believe I've actually solved this one myself: the autoScaleGroup configuration used in the guide mentioned above fails because it's missing health checks. In the awsx TypeScript SDK specifically, there should either be a default for these values (in keeping with awsx's role as a high-value, Pulumi-specific add-on) or these keys should be required. On that basis I'd mark this issue as a bug rather than a feature enhancement.

Here's the fix for the autoScaleGroup section:

const autoScaleGroup = ecsCluster.createAutoScalingGroup(
  config.ecs.autoScalingGroupName,
  {
    vpc,
    templateParameters: {
      minSize: 1,
      maxSize: 5,
      healthCheckGracePeriod: 300,
      healthCheckType: "EC2",
    },
    launchConfigurationArgs: {
      instanceType: config.ecs.instanceType as aws.ec2.InstanceType,
    },
  }
);

leezen commented 3 years ago

@mikemaccana Thanks for clarifying! Any chance you'd like to submit a PR with that fix?

mikemaccana commented 3 years ago

Hrm I stand corrected - the issue didn't occur for a while, but is now back:

aws:cloudformation:Stack (hl-pulumi-autoscale-group):
error: 1 error occurred:
  * creating urn:pulumi:hl-pulumi::humanloop::awsx:x:ecs:Cluster$awsx:x:autoscaling:AutoScalingGroup$aws:cloudformation/stack:Stack::hl-pulumi-autoscale-group: 1 error occurred:
  * error waiting for CloudFormation Stack creation: failed to create CloudFormation stack, rollback requested (ROLLBACK_COMPLETE): ["The following resource(s) failed to create: [Instances]. Rollback requested by user." "Received 0 SUCCESS signal(s) out of 1.  Unable to satisfy 100% MinSuccessfulInstancesPercent requirement"]

I don't think it's the health checks.
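
For anyone comparing the two failures in this thread, the rollback message itself carries the useful numbers: how many instances signalled success, how many were expected, and what percentage was required. The sketch below pulls those out of the message text. It assumes the wording shown in the errors above; `parseSignalFailure` and `SignalFailure` are hypothetical names for illustration, not part of Pulumi, awsx, or the AWS SDK.

```typescript
// Hypothetical helper for illustration only; not a Pulumi or AWS API.
interface SignalFailure {
  received: number;        // SUCCESS signals CloudFormation actually received
  expected: number;        // instances CloudFormation was waiting on
  requiredPercent: number; // MinSuccessfulInstancesPercent from the message
}

// Extracts the signal counts from a CloudFormation rollback message.
// Returns undefined when the message does not match the assumed wording.
function parseSignalFailure(message: string): SignalFailure | undefined {
  const signals = /Received (\d+) SUCCESS signal\(s\) out of (\d+)/.exec(message);
  const percent = /satisfy (\d+)% MinSuccessfulInstancesPercent/.exec(message);
  if (!signals || !percent) {
    return undefined;
  }
  return {
    received: Number(signals[1]),
    expected: Number(signals[2]),
    requiredPercent: Number(percent[1]),
  };
}

// The two messages quoted in this thread (minSize: 5, then minSize: 1).
const firstAttempt = parseSignalFailure(
  "Received 0 SUCCESS signal(s) out of 5.  Unable to satisfy 100% MinSuccessfulInstancesPercent requirement"
);
const secondAttempt = parseSignalFailure(
  "Received 0 SUCCESS signal(s) out of 1.  Unable to satisfy 100% MinSuccessfulInstancesPercent requirement"
);

console.log(firstAttempt, secondAttempt);
```

In both attempts the expected count matches the `minSize` that was set (5, then 1), while the received count stays at 0. That fits the observation above that changing the health check settings did not change the outcome: no instance reported success in either run.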