hackoregon / civic-devops

Master collection point for issues, procedures, and code to manage the HackOregon Civic platform
MIT License

Convert 2018 Disaster Resilience API service to Fargate #260

MikeTheCanuck closed 5 years ago

MikeTheCanuck commented 5 years ago

Addresses #244 for the 2018 Disaster Resilience API. Follows the work in #259, and implements work similar to https://github.com/hackoregon/hackoregon-aws-infrastructure/pull/84.

Tests that will confirm the container has successfully migrated:

  1. CloudFormation will report UPDATE_COMPLETE for the enhanced stack
  2. ECS will report (a) "Launch Type FARGATE" for the service, (b) 1 Task Running at the Task level, and (c) at the Task detail level a "Started at" timestamp that is at least 5 minutes into the past (which indicates that the container stayed RUNNING long enough to pass the ALB health check cycle)
  3. The latest log in the CloudWatch group for the associated service will show entries similar to the current log output, e.g. 10.180.9.210 [09/Aug/2019:20:52:04 +0000] GET /disaster-resilience/ HTTP/1.1 200 23929 - ELB-HealthChecker/2.0 0.103165
  4. Browser requests to https://service.civicpdx.org/disaster-resilience/api/Address/ will display a Swagger-schema-prettified response with count: 785067 and detailed JSON objects in the results section
  5. The latest CloudWatch logs will display a recent web request to /disaster-resilience/api/Address/ with a 200 response code
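Checks 2, 3 and 5 above could be scripted rather than eyeballed in the console. The following is only a minimal sketch, not something used in this migration: it assumes the service and task dictionaries have the shape returned by boto3's ECS describe_services and describe_tasks calls (launchType, runningCount, startedAt), and that log lines follow the layout of the sample entry quoted in test 3. The function names are hypothetical.

```python
import re
from datetime import datetime, timedelta, timezone

# Layout assumed from the sample line in test 3:
# client IP, [timestamp], method, path, protocol, status, then the remainder.
LOG_LINE = re.compile(
    r"(?P<ip>\S+) \[(?P<ts>[^\]]+)\] (?P<method>\S+) (?P<path>\S+) \S+ "
    r"(?P<status>\d{3}) (?P<rest>.*)"
)


def service_looks_migrated(service: dict) -> bool:
    """Tests 2(a) and 2(b): Fargate launch type and exactly one running task."""
    return service.get("launchType") == "FARGATE" and service.get("runningCount") == 1


def task_is_stable(task: dict, now: datetime, minimum=timedelta(minutes=5)) -> bool:
    """Test 2(c): the task started at least `minimum` ago.

    `startedAt` and `now` are both expected to be timezone-aware datetimes.
    """
    started = task.get("startedAt")
    return started is not None and now - started >= minimum


def request_succeeded(log_line: str, path: str) -> bool:
    """Tests 3 and 5: the log line is a request for `path` answered with 200."""
    match = LOG_LINE.match(log_line)
    return bool(match) and match["path"] == path and match["status"] == "200"


if __name__ == "__main__":
    sample = (
        "10.180.9.210 [09/Aug/2019:20:52:04 +0000] GET /disaster-resilience/ "
        "HTTP/1.1 200 23929 - ELB-HealthChecker/2.0 0.103165"
    )
    print(request_succeeded(sample, "/disaster-resilience/"))
```

The functions take plain dictionaries and strings, so they can be exercised without AWS credentials; fetching the real data from ECS and CloudWatch is left out of the sketch.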
MikeTheCanuck commented 5 years ago

Following the same sequence as was followed for 2018ND:

In the HackOregon-AWS-Infrastructure Repo

  1. Comment out the existing 2018DR Resource block in master.yaml
  2. Create a new Resource block that is temporarily named 2018DiRe and use the same Parameters as for the other 2018 and 2019 Fargate-based services, substituting DR-specific values
  3. Specifically and temporarily substitute unused values for ListenerRulePriority and ListenerRuleTlsPriority (e.g. "83" and "84" are currently unused, as are any higher values in the 80s range).
  4. Deploy with that modified master.yaml: upload the modified master.yaml to our S3 bucket, then use it to "Update Stack".
  5. Wait for CloudFormation update to stabilize, and confirm the 2018 Disaster Resilience API is working properly.
  6. Modify the Resource name and listener Priority values back to their permanent values, e.g. 2018DR and "30" and "31" (this is possible now that the competing resources have been deleted from AWS during steps 4-5).
  7. Update Stack once more with the latest modified master.yaml, verify the API is again working properly.
  8. Delete the no-longer-needed folder in /services for the service that's been migrated, e.g. /services/disaster-resilience-service in the https://github.com/hackoregon/hackoregon-aws-infrastructure/ repo.
  9. Commit all these changes (modified master.yaml, deleted service.yaml) to the infra repo as a PR.
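The temporary priorities in step 3 matter because an ALB listener will not accept two rules with the same priority, so the new rules cannot reuse "30" and "31" while the old ones still exist. A rough way to pick free values is sketched below. It assumes a list of rule dictionaries shaped like the Rules returned by boto3's elbv2 describe_rules call, where Priority is a string and the default rule reports "default". The helper names and the sample data are hypothetical, and pooling the rules of both listeners into one list is a simplification.

```python
def used_priorities(rules: list) -> set:
    """Collect numeric priorities, skipping the listener's default rule."""
    return {
        int(rule["Priority"])
        for rule in rules
        if rule.get("Priority", "default") != "default"
    }


def pick_unused(rules: list, count: int = 2, low: int = 80, high: int = 89) -> list:
    """Return the first `count` priorities in [low, high] that no rule uses."""
    taken = used_priorities(rules)
    free = [p for p in range(low, high + 1) if p not in taken]
    if len(free) < count:
        raise ValueError(f"only {len(free)} free priorities between {low} and {high}")
    return free[:count]


if __name__ == "__main__":
    # Made-up sample: which priorities are really in use is not stated in this issue.
    sample_rules = [{"Priority": p} for p in ("default", "30", "31", "80", "81", "82")]
    print(pick_unused(sample_rules))
```

The result would be substituted for ListenerRulePriority and ListenerRuleTlsPriority only for the temporary deployment, then reverted as in step 6.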

In the API container's backend Repo

Update the ecs-deploy.sh script with the code referenced here: https://github.com/hackoregon/civic-devops/issues/268

In the API's Travis repo

Update the ECS_SERVICE_NAME environment variable with the new name assigned by ECS to the Service, e.g. hacko-integration-2019Sandbox-17LKPAJ76VI06-Service-NY9MKVKFMBQL
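Since the generated name has random suffixes, one option is to look it up rather than copy it from the console. The sketch below assumes a list of service ARNs such as boto3's ECS list_services returns (serviceArns), and that the generated name contains the Resource name from master.yaml, as the example above contains 2019Sandbox. The function names, region and account ID are placeholders, not values from our stack.

```python
def service_name_from_arn(arn: str) -> str:
    """The service name is the last path segment of an ECS service ARN."""
    return arn.rsplit("/", 1)[-1]


def find_service_name(service_arns: list, fragment: str) -> str:
    """Return the single service name containing `fragment`, else raise."""
    names = [service_name_from_arn(arn) for arn in service_arns]
    matches = [name for name in names if fragment in name]
    if len(matches) != 1:
        raise LookupError(
            f"expected exactly one service containing {fragment!r}, found {len(matches)}"
        )
    return matches[0]


if __name__ == "__main__":
    # Placeholder region and account ID; the second ARN is invented for contrast.
    arns = [
        "arn:aws:ecs:us-west-2:123456789012:service/"
        "hacko-integration-2019Sandbox-17LKPAJ76VI06-Service-NY9MKVKFMBQL",
        "arn:aws:ecs:us-west-2:123456789012:service/example-other-Service-AAAA",
    ]
    print(find_service_name(arns, "2019Sandbox"))
```

Raising when zero or several services match avoids silently pointing Travis at the wrong service.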

MikeTheCanuck commented 5 years ago

Note: we discovered quite by accident (though we should have remembered and tracked this problem) that the ecs-deploy.sh script that Travis runs from each container repo's scripts needs to be updated. The pre-PR-129 version we'd been using predates Fargate support, which that PR added.

I've logged this as issue #268 and I've updated the above instructions since they're referenced for each other API migration.