ruralinnovation / multi-svc-cartodb

Containerized, Multi-service, Open-Source CartoDB
BSD 3-Clause "New" or "Revised" License

Determine whether ECS or EC2 are appropriate for redis / postgis #43

Closed nballenger closed 5 years ago

nballenger commented 5 years ago

I've got a belief, which may be inaccurate, that DB instances should live in EC2 and not ECS. If that's true, it means using AMIs to build EC2 instances; if it isn't, it means using ECS Task Definitions to run ECS Services. Either way requires a persistent storage layer in EBS or EFS.

nballenger commented 5 years ago

I've reached out to one of the AWS developer relations people for container services, to get an expert opinion.

nballenger commented 5 years ago

Response back from the dev-rel:

I'd just run the Postgres on a dedicated EC2 instance unless it is super low traffic and low usage. Databases don't tend to share well, so I wouldn't want to ever schedule other containers alongside a database container.

Plus the issue of persisting the data. You'd have to do extra work to ensure that you are using placement strategies to target a specific EC2 instance that has the data mounted in an EBS volume anyway. Containers would give minimal benefit to the database, but extra complexity.
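To make the "extra work" concrete, here is a minimal sketch of what pinning a database task to one instance might look like. It only builds the keyword arguments that would be passed to the ECS `CreateService` API (for example via `boto3.client("ecs").create_service(**request)`) and makes no AWS calls. The cluster, service, and task definition names, and the custom instance attribute `pg-data-volume`, are hypothetical and not from this project. Strictly speaking this uses a placement constraint (`memberOf`) rather than a placement strategy, since constraints are what restrict a task to particular instances.

```python
def pinned_service_request(cluster, service_name, task_definition, attribute, value):
    """Build CreateService keyword arguments that restrict a task to tagged instances.

    The instance holding the EBS-backed data volume is assumed to have been
    given a custom attribute (attribute=value) beforehand.
    """
    return {
        "cluster": cluster,
        "serviceName": service_name,
        "taskDefinition": task_definition,
        # A single database task; it should not be scaled out like a stateless service.
        "desiredCount": 1,
        "launchType": "EC2",
        "placementConstraints": [
            {
                "type": "memberOf",
                # Cluster query language: match instances carrying the custom attribute.
                "expression": f"attribute:{attribute} == {value}",
            }
        ],
    }


# Hypothetical names, for illustration only.
request = pinned_service_request(
    "carto-cluster", "postgis", "postgis-task", "pg-data-volume", "attached"
)
```

Even with this in place, the instance still has to be tagged and the volume mounted out of band, which is the complexity the reply is pointing at.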

When you run the whole stack of Docker containers locally, it's just one computer. Things are a bit different in production; you should be using multiple instances there. Container orchestration is designed for dynamic workloads moving around those instances.

Stateless stuff moves around very nicely; stateful stuff does not, because it has to carry that state with it (it needs state close to it for performance) but the state is heavy. You don't want to have to move a 200 GB database around much; you want it staying in one place.

Seems fairly definitive--I don't think the hedge about "low traffic and low usage" applies in this case, because even though that will be true for a dev environment, it won't be true for production.

So Postgres needs to be on EC2, and because of the way that Carto uses Redis (as a second database, not just an ephemeral cache), Redis does too.

nballenger commented 5 years ago

Interesting note: I think it should be possible to use ElastiCache to supply the Redis cluster, since ElastiCache can support Redis persistence, and there are no custom Redis commands to support.

Persistence info here: https://docs.aws.amazon.com/AmazonElastiCache/latest/red-ug/backups.html
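
As a rough sketch of what that could look like, the helper below builds the keyword arguments for the ElastiCache `CreateReplicationGroup` API (for example via `boto3.client("elasticache").create_replication_group(**request)`) with automatic backups turned on. It makes no AWS calls. The group ID, node type, retention period, and snapshot window are assumed values for illustration, not settings taken from this project, and the backups described at the link above are snapshot-based, so whether that is sufficient for how Carto uses Redis would still need checking.

```python
def redis_replication_group_request(group_id, node_type="cache.t3.small", retention_days=7):
    """Build CreateReplicationGroup keyword arguments with daily snapshots enabled."""
    if retention_days < 1:
        # A retention limit of 0 turns automatic backups off, which defeats the purpose here.
        raise ValueError("retention_days must be at least 1 to keep automatic backups on")
    return {
        "ReplicationGroupId": group_id,
        "ReplicationGroupDescription": "Redis for CartoDB metadata (illustrative)",
        "Engine": "redis",
        "CacheNodeType": node_type,
        "NumCacheClusters": 1,
        # Number of days ElastiCache keeps automatic snapshots.
        "SnapshotRetentionLimit": retention_days,
        # Daily UTC window in which the snapshot is taken.
        "SnapshotWindow": "05:00-06:00",
    }


# Hypothetical group ID, for illustration only.
request = redis_replication_group_request("carto-redis")
```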