mitodl / ol-infrastructure

Infrastructure automation code for use by MIT Open Learning
BSD 3-Clause "New" or "Revised" License

Outstanding ECS Cluster items #1809

Open Ardiea opened 9 months ago

Ardiea commented 9 months ago

Branch https://github.com/mitodl/ol-infrastructure/tree/md/ecs_init

Ardiea commented 8 months ago

Gotchas with ECS / Traefik / Vault

root@ip-172-17-1-56:/etc/docker# docker ps
CONTAINER ID   IMAGE                            COMMAND                  CREATED             STATUS                       PORTS     NAMES
fed5626c3518   traefik:v2.10.4                  "/entrypoint.sh --ap…"   About an hour ago   Up About an hour (healthy)             ecs-data-ci-traefik-46-data-ci-traefik-c48599c2e2becdaf6300
4295b1aca17a   hashicorp/vault:latest           "docker-entrypoint.s…"   About an hour ago   Up About an hour (healthy)             ecs-data-ci-traefik-46-traefik-vault-agent-f4e89caaa8fa95e6cf01
27c6ff18d366   amazon/amazon-ecs-agent:latest   "/agent"                 3 weeks ago         Up 3 weeks (healthy)                   ecs-agent

Ardiea commented 8 months ago

Outstanding Issue - Environment Variables

There is one outstanding issue at the moment where I'm struggling to settle on the best approach: environment variables. ECS offers two ways to do env vars documented here. There is an extension/exception to that for secrets using Secrets Manager, but it isn't that interesting because we don't use that.

So, from the two provided methods we have the following.

  1. List keys + values out individually for each env var inside the task.
    • Pro: Pretty straight-forward.
    • Con: Locked into static secrets at pulumi-run-time. Lose a lot of flexibility that comes with vault + consul for populating a lot of the more interesting bits of this config.
    • Con: Makes the task definition big and ungainly.
  2. Populate a .env file and stuff it some place safe in S3
    • Pro: Pretty straight-forward.
    • Con: Comes with a bunch of IAM foolishness to keep it secret, keep it safe.
    • Con: Locked into static secrets at pulumi-run-time. Lose a lot of flexibility that comes with vault + consul for populating a lot of the more interesting bits of this config.
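
For reference, the two ECS-provided methods correspond to the `environment` and `environmentFiles` fields of a container definition. The following is only a rough sketch of the shape of each; the container name, image, variable names, and bucket ARN are hypothetical placeholders, not values from this repo.

```python
import json


def container_definition(name, image, env=None, env_file_arn=None):
    """Build an ECS container definition dict using either env var method."""
    definition = {"name": name, "image": image, "essential": True}
    if env:
        # Method 1: every key/value pair is listed inline in the task definition.
        definition["environment"] = [
            {"name": key, "value": value} for key, value in sorted(env.items())
        ]
    if env_file_arn:
        # Method 2: reference a .env object in S3. The task execution role
        # must be allowed to read it, which is where the IAM work comes in.
        definition["environmentFiles"] = [{"value": env_file_arn, "type": "s3"}]
    return definition


# Hypothetical values, for illustration only.
inline = container_definition(
    "app",
    "example/app:1.0",
    env={"DEBUG": "false", "DB_HOST": "db.example.internal"},
)
from_s3 = container_definition(
    "app",
    "example/app:1.0",
    env_file_arn="arn:aws:s3:::example-bucket/app/ci.env",
)

print(json.dumps([inline, from_s3], indent=2))
```

Either way the values are fixed when the task definition (or the S3 object) is written, which is the "locked into static secrets" con listed for both.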

Notably absent from that list is just a .env file on the local system. Probably because the underlying EC2 instances are supposed to be livestock, not pets. And livestock doesn't have any local files.

So I'm thinking something a little more flexible but probably more janky.

  1. Follow the already defined and explored pattern of vault/consul-template sidecar to render a file which is essentially our existing .env file for docker compose. This file is rendered into a shared volume.
  2. In the actual application containers, we add an entrypoint.sh that opens that file, loops through it and exports every key into the environment and then launches the app.
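
Step 2 could look roughly like the sketch below. It is written in Python rather than as the entrypoint.sh described above, and the parsing rules (one `KEY=VALUE` per line, `#` comments, optional surrounding quotes) are assumptions about what the rendered file would contain. The function names are made up for illustration.

```python
import os


def parse_env_file(text):
    """Parse KEY=VALUE lines, skipping blanks, comments, and malformed lines."""
    env = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        # Split on the first "=" only, so values may contain "=" themselves.
        key, _, value = line.partition("=")
        value = value.strip()
        # Drop one pair of matching surrounding quotes, if present.
        if len(value) >= 2 and value[0] == value[-1] and value[0] in "'\"":
            value = value[1:-1]
        env[key.strip()] = value
    return env


def launch(env_file_path, command):
    """Export every key from the rendered file, then replace this process with the app."""
    with open(env_file_path, encoding="utf-8") as handle:
        rendered = parse_env_file(handle.read())
    # Values from the rendered file win over anything already in the environment.
    os.execvpe(command[0], command, {**os.environ, **rendered})
```

The container entrypoint would call something like `launch(path_to_rendered_file, app_command)`. Because `os.execvpe` replaces the current process, the app ends up as the container's main process and receives signals directly. This sketch does not handle the sidecar not having rendered the file yet; that ordering would need to be dealt with separately.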

Doesn't envconsul do this already? Yeah, probably, but it is very particular about the keynames in consul and vault and re-organizing / cleaning those superfund sites up is outside the scope of this exploration.

blarghmatey commented 8 months ago

consul-template itself can also be used for spawning the process after rendering the config. It might make sense to use that as the entrypoint? https://github.com/hashicorp/consul-template/blob/main/docs/modes.md#exec-mode
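
As a rough illustration of that suggestion, the command line for exec mode could be assembled as below. This assumes the `-template "source:destination"` and `-exec` flags described in the linked docs; the template path, output path, and app command are hypothetical.

```python
import shlex


def consul_template_argv(template_src, rendered_dest, app_command):
    """Build a consul-template command that renders one template, then runs the app."""
    return [
        "consul-template",
        "-template", f"{template_src}:{rendered_dest}",
        # The child command is passed to -exec as a single string.
        "-exec", shlex.join(app_command),
    ]


# Hypothetical paths and command, for illustration only.
argv = consul_template_argv(
    "/templates/app.env.tmpl",
    "/shared/app.env",
    ["/app/entrypoint", "--port", "8000"],
)
print(argv)
```

Note that in this mode the app would still have to read the rendered file itself; exec mode starts the process after rendering but, as far as I know, does not export the file's contents into its environment.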

Ardiea commented 8 months ago

Configuration Challenges

There is nothing in ECS analogous to a k8s ConfigMap or a docker config, which is presenting some issues. This SO answer covers basically the only options for getting files into containers with ECS: https://stackoverflow.com/a/71704130

Consider the following volume mount list for the nginx sidecar in OVS:

https://github.com/mitodl/ol-infrastructure/blob/main/src/bilder/images/odl_video_service/files/docker-compose.yaml.tmpl

Some of these files are static and unchanging, others require interpolation from vault, and some are rendered entirely from vault. Each of these situations requires a slightly different approach in order to get the configuration where it needs to be in the container, and nearly all of those approaches are going to be complicated and janky. Ultimately this is going to lead to an increase in complexity and boilerplate, which is not what we're looking for at this time.