CloudFormation stack template

numtel commented 7 years ago

Hey, I saw this repo and it was much more put together than the example app I had started on a few months ago so I've made an AWS CloudFormation template that makes it very easy to deploy the application to AWS using ECS.

There's an option for Development/Production. If Development is chosen, a single EC2 instance is used for the cluster and a sandbox Postgres server container (db/Dockerfile) will be launched. (Like using docker-compose locally)

For production, an EC2 auto-scaling group is created and the number of instances can be configured in the stack template.

An internal load balancer is created for the PostgREST service and an external one for the OpenResty service. The desired number of containers for each service can be configured. Logs all go to CloudWatch. It doesn't seem like the upstream nginx config works for anything except the "postgrest:3000," even when specified with the ENV vars in this file. I'm not sure the best way to fix that yet. As it is right now, I can see the "Welcome to OpenResty" page on the root but anything under /rest/ gives a 500 because it can't connect to the upstream.

Prevously, I've used proxy_pass like this to put nginx in front of a set of ELBs:

    server {
      resolver 172.30.0.2;
      set $search_host http://some-elb-hostname;

      location /search/ {
          proxy_pass $search_host;
      }
    }

Putting the host in a variable like that causes nginx to resolve the IP address instead of it only happening once on startup if you don't use the variable. Since ELB IPs are liable to change every few hours, this is necessary. Is it a good idea to switch to that kind of config or is there another way to make it work?

The CPU/Memory limits on each container are not yet calibrated.
In setups like this before, I've had to add CRON jobs to cleanup old Docker images to prevent the servers from running out of disk space but the way I've done it previously was before it was built into Docker engine itself. This would be good to have in the instance user data as well. More info...
It will serve HTTPS (haven't tested this yet with this template) if a IAM or ACM certificate ARN is given. It would be helpful then to have some kind of nginx redirect to always HTTPS probably.

ruslantalpa commented 7 years ago

@numtel this is great! I'm glad you reached out with cloudformation template for ECS. This is exactly how i was planning on making it go to production. I'll review your template but as i said i thought about it for a while and also have some preliminary template. Maybe we can merge your work and mine in this PR and you could take over the "going to production" part since you obviously have a good understanding of cloudformation.

As for the problem with the LB ip you describe, i am not sure you need an internal LB (between openresty and postgrest). I think it's better to have a external LB and the service be a "unit" of OR+PGRST and if you scale it, you scale them together.

I'll also look exactly at how you tried to add other stuff and it did not work just to explain so you know how things fit together. The dns to ip indeed is tricky since openresty needs a resolver in the config to work and i did not look yet at how ti pass that from the ecs instance to the container.

I'll have more details on how to continue with this PR later today/tomorrow

ruslantalpa commented 7 years ago

So i looked at the PR (i hope in detail) and these are the ideas i have and the problems i see with this current template.

A lot of the template is dedicated to cluster infrastructure. Maybe it's better to start from the fact that the user already has a ECS cluster started from aws interface and request only the ARN of that cluster as a parameter. Additionally we could include here the cloudformation template used by aws to start the cluster as a separate file and provide command line to bring up the cluster in console then get it's arn.
If someone is going for ECS, it's probable that the value is to run all your apps on that cluster and not have a dedicated cluster per app. That's the reasoning behind my thinking of splitting infrastructure definition from the definition of the app. Also, for each app running in the cluster we can create a dedicated IAM user with privileges specific only to the resources of that app.
In addition to the cluster, another piece of infrastructure that can be shared by all apps is the loadbalancer. Let's create another separate template that creates a loadbalancer within the same vps as the cluster, it should get as a parameter the arn of the cluster so that it can read the VPC config and then export a few arn's of it's own like the listeners.
Now that we have the infrastructure on which postgrest based apps can run ( cluster.yaml loadbalancer.yaml ), we create a application.yaml file which is responsible for starting services within that cluster and loadbalancer which were provided as parameters.

I have some rough templates, which are split into user and application, but they can be combined user.yaml https://gist.github.com/ruslantalpa/aac48b6601c8dd01301800cc2c95d4f1

application.yaml https://gist.github.com/ruslantalpa/1b0e09abb4fa434130670ba9202aca4e

secrets-entrypoint.sh https://gist.github.com/ruslantalpa/68ed1da56c78a0506745994c29b465b0

one thing that is different from your config is that the params are read from a .env file like in local dev. I am not sure if this is exactly needed since it creates some complications with S3 buckets, maybe it should be eliminated in this starter kit

So, what do you think about:

splitting the file into cluster/loadbalancer/application (with cluster being a reference to one started in aws interface)
do you want to take on this section (these templates will require a lot of work before they are a smooth process)

ruslantalpa commented 7 years ago

about env vars, in order for them to be available in openresty, they also need to be added here https://github.com/subzerocloud/postgrest-starter-kit/blob/master/openresty/nginx/conf/includes/globals/env_vars.conf

also note how because we don't have a resolver, we translate the dns here https://github.com/subzerocloud/postgrest-starter-kit/blob/master/openresty/entrypoint.sh#L8

there is probably a better way, but it's like this for now

numtel commented 7 years ago

Wow thanks for such the response! Splitting the file into multiple definitely makes it easier to code. Having a single stack template just makes it so easy to deploy that I was intrigued to try it that way.

For the ECS cluster, is this the amazon provided example that you've seen? The custom resource lambda I used that gets the latest AMI is largely from the docs as well. I could merge that into this example template as cluster.yml.

As for the env var OpenResty issue, it would go away if there's no internal load balancer because then openresty and postgrest could share a task definition and the containers could be linked just like in docker-compose. That would make things much simpler. It looks like you even make the development version have the sandbox postgres server part of the task definition as well.

Putting the secrets file in S3 is a good idea for security. Looks like you saw this blog post. In your application.yml, you create one of these files. Out of curiousity, is that Custom::S3PutObject resource built-in or do you have to install these cloudformation helpers separately?

Reading that blog post makes it seem like handling the secrets in S3 makes it so having a VPC/Cluster for isolating the application valuable from a security standpoint since the S3 bucket will allow reading from the whole VPC. I don't know why they don't have the bucket policy and VPC endpoint as part of the cloudformation template in that blog post example. I don't see any reason why that couldn't be part of the stack template.

What do you think of splitting the templates this way?

`cluster.yml`	`application.yml`
VPC Secrets S3 Bucket Auto-scaling group ECS Cluster Load Balancer	Task Definition ECS Service

Create cluster.yml stack, output will contain an s3 bucket/key that will need to have the secrets file uploaded to.
User creates/copies secrets file to S3.
Create application.yml, one of the parameters is the cluster stack name so that its outputs can be read.

It's necessary to split the stacks at the secrets file upload this because if application.yml was part of cluster.yml, it would never finish creating since the containers would never survive the load balancer health check.

I guess that's the advantage to having the internal load balancer for postgrest, there's then a health check on each service. Without the internal load balancer, it will be important to have the external load balancer health check request the /rest/ route (instead of something like /) so that it verifies that both Openresty and Postgrest are still functional.

I do like writing Cloudformation templates but I don't have any experience running Postgrest or Openresty in production. Smoothing out the stack templates can always go on and on, with new data there's adjustments and alarms to be made to ensure the system doesn't crash in the same way again. I'll see how it goes taking on this section though.

ruslantalpa commented 7 years ago

one needs to create the cloudformation helpers stack then reference it but maybe in the beginning we'll avoid it altogether since the s3 thing brings in a lot complications to the template. Maybe let's start by setting secrets as env vars for now. This is a starter kit after all :) and people with security concerns can dig into saving them in s3 files.

When i talk about cluster template i mean this https://gist.github.com/ruslantalpa/cc21cc4d6aa24e56ff3c4ba372dcc147

if you go to the aws interface and go through the wizard of bringing up ECS, this template is the result.

I was thinking, instead of managing this here, we could leverage the wizard AWS provides (i am guessing they put a lot of thought into that template) to bring up a cluster and just say: step 1: go to your aws and create your cluster, then remember the name and arn (or maybe people already have their cluster)

this saves us from maintaining a complicated template. The additional infrastructure needed would go into loadbalancer.yaml

We can probably strip out the "application user" part and his policies. After all, most of the people will run this with their aws admin account and it's good enough to get started.

thoughts? is it possible in a template, knowing just the arn of the cluster/stack name to extract the name of it's resources even though there are no output parameters? If not, i would try to start exactly form this template then have a clear defined section with our additional stuff (loadbalancer and all) so that we can keep the AWS portion of the template updated from time to time with new features.

numtel commented 7 years ago

Whoa, I'm pretty sure the last time I made a cluster using the console like that, the only option was to "Create an Empty Cluster." Yeah we should definitely use that. AWS is always improving. I bet when we get these templates sorted, they'll have some new feature that makes them obsolete :joy:

Using the built-in method for getting stack outputs has no inclination of grabbing the resources directly, just outputs. Creating a custom resource lambda, like the one I used to get the AMI, to fetch the stack resource data would not be difficult. Give me a bit, I'll make a new commit that takes these things into account.

numtel commented 7 years ago

I'm going to close this PR in favor of the new #5

subzerocloud / postgrest-starter-kit

CloudFormation stack template #3