hackoregon / civic-devops

Master collection point for issues, procedures, and code to manage the HackOregon Civic platform
MIT License

Upgrade EC2 hosts to latest ECS-optimized AMI #145

Closed MikeTheCanuck closed 6 years ago

MikeTheCanuck commented 6 years ago

The AMI currently used by the EC2 hosts that carry the load of our ECS containers was last updated in 2016-09.

This is where the update needs to happen: https://github.com/hackoregon/hackoregon-aws-infrastructure/blob/master/infrastructure/ecs-cluster.yaml#L59

Intended Benefits

ToDo

  1. [x] identify the latest supported ECS-optimized AMI
  2. [x] update the YAML to the new AMI identifier and deploy to S3 (see instructions here)
  3. [x] terminate one of the two EC2 hosts via the EC2 console, which should cause CF or ECS (whichever controls this behaviour) to spin up a new host with the currently-specified AMI -- NOTE: watch out that the remaining EC2 host doesn't get overloaded with more container tasks - and thus more memory load - than it can handle; spin down legacy services such as "Transport-Service" temporarily if necessary
  4. [x] Manually spin services down and back up by setting "number of tasks" in the ECS console to 1, then back to 2; this will cause the newly launched container tasks to start on the new EC2 host
  5. [x] Ensure there is at least one container task (for each service that we need currently alive) running on the new EC2 host; don't sweat if we leave a few of the 2017 API container tasks aside for the moment, the outage probably won't affect anyone since they're so little-used
  6. [x] Once the new EC2 host has stable running instances of the intended API container tasks, terminate the other/older EC2 host, and take similar steps to deploy all the API container tasks to the newest EC2 host.
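The scale-down/scale-up part of the checklist (setting "number of tasks" to 1, then back to 2) could be scripted instead of clicked through in the console. The following is a minimal sketch, not the procedure actually used here. It assumes boto3 is installed and configured, and the cluster and service names are hypothetical placeholders. The ECS client is passed in so the function can be tried against a stub first. Whether the new tasks land on the new host still depends on ECS scheduling.

```python
def bounce_service(ecs, cluster, service, low=1, high=2):
    """Scale a service to `low` tasks, wait for stability, then back to `high`."""
    for count in (low, high):
        ecs.update_service(cluster=cluster, service=service, desiredCount=count)
        # Block until the service reports a steady state at this task count.
        ecs.get_waiter("services_stable").wait(cluster=cluster, services=[service])


def main():
    import boto3  # assumption: boto3 is installed and AWS credentials are configured

    client = boto3.client("ecs")
    # "example-cluster" and "example-service" are hypothetical names.
    bounce_service(client, "example-cluster", "example-service")
```

Calling `main()` would run this against a real account, so try it on a low-traffic service first and watch memory on the remaining host, as noted in step 3.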

iant01 commented 6 years ago

There may be better ways to accomplish the upgrade steps listed above; I will test them out in my account and document them. Killing off the instance manually probably added to the issues we had after clearing up the instance that had memory space issues.

iant01 commented 6 years ago

Once we are on a newer image version, we will be able to change the CloudFormation template so that it automatically picks up the recommended image version whenever the template is run after that version has been updated:

    Parameters:
      ECSAMI:
        Description: AMI ID
        Type: AWS::SSM::Parameter::Value
        Default: /aws/service/ecs/optimized-ami/amazon-linux/recommended/image_id
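To see which AMI the template would resolve to, the same public SSM parameter can be read directly. This is a rough sketch under the assumption that boto3 is available. The helper name `recommended_ecs_ami` is made up for illustration, and the SSM client is passed in so the function can be tested without AWS access.

```python
ECS_AMI_PARAMETER = "/aws/service/ecs/optimized-ami/amazon-linux/recommended/image_id"


def recommended_ecs_ami(ssm, name=ECS_AMI_PARAMETER):
    """Return the AMI ID currently stored in the ECS-optimized AMI parameter."""
    response = ssm.get_parameter(Name=name)
    return response["Parameter"]["Value"]


def main():
    import boto3  # assumption: boto3 is installed and AWS credentials are configured

    # The value returned depends on the region the client is configured for.
    print(recommended_ecs_ami(boto3.client("ssm")))
```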

iant01 commented 6 years ago

The latest image is:

ami-d2f489aa amzn-ami-2018.03.a-amazon-ecs-optimized

iant01 commented 6 years ago

The rest of the checklist items are not needed. As soon as the change set for the CloudFormation template change is executed, the autoscaler will migrate the instance to the new AMI image. We may need to re-balance manually depending on timing, since the placement strategy option will not be available until we are on the latest AMI.
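For reference, creating and executing a change set can also be done from a script. The sketch below is an illustration under assumptions, not the exact steps taken here: the stack name, template URL, and change set name are hypothetical, and the `CAPABILITY_NAMED_IAM` capability is a guess at what the stack requires. The CloudFormation client is injected so the flow can be checked against a stub.

```python
def apply_template_change(cfn, stack_name, template_url, change_set_name,
                          capabilities=("CAPABILITY_NAMED_IAM",)):
    """Create a change set from a template in S3, then execute it and wait."""
    cfn.create_change_set(
        StackName=stack_name,
        ChangeSetName=change_set_name,
        TemplateURL=template_url,
        Capabilities=list(capabilities),  # assumption: the stack creates IAM resources
    )
    # Wait until CloudFormation has finished computing the change set.
    cfn.get_waiter("change_set_create_complete").wait(
        StackName=stack_name, ChangeSetName=change_set_name
    )
    cfn.execute_change_set(StackName=stack_name, ChangeSetName=change_set_name)
    # Wait for the stack update itself to finish.
    cfn.get_waiter("stack_update_complete").wait(StackName=stack_name)


def main():
    import boto3  # assumption: boto3 is installed and AWS credentials are configured

    apply_template_change(
        boto3.client("cloudformation"),
        stack_name="example-stack",  # hypothetical
        template_url="https://example-bucket.s3.amazonaws.com/master.yaml",  # hypothetical
        change_set_name="ecs-ami-upgrade",  # hypothetical
    )
```

Reviewing the change set in the console before executing it remains a sensible manual checkpoint. The script version simply skips that pause.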

iant01 commented 6 years ago

Merge pull request 35 submitted.

iant01 commented 6 years ago

Closing, resolved by merged PR.