Configure GHA workflow to deploy infrastructure and application to each environment

marcelovilla commented 3 weeks ago

There's an existing GHA workflow to deploy the infrastructure. However, it seems it has not been run yet and that there are some modifications that need to be made:

- We should make sure the workflow gets triggered when committing to main (e.g., merging a PR) to deploy to the staging environment. It should also have a manual trigger to promote the deployment to production.
- We can use the AWS IAM role created in #61 to authenticate to AWS in the workflow.
- We should make sure we build and push the Docker image not only for the API but also for the dashboard.
- We should make sure we run `docker compose up -d` on the EC2 instance to use the newly built images.
- As we're using OpenTofu, we can leverage this action: https://github.com/opentofu/setup-opentofu
- We can leverage GHA environments to reflect our own environments (i.e., shared, staging and production) and make sure we can distinguish between them.
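
For illustration, the trigger, authentication, OpenTofu and environment points above could be sketched roughly as follows. This is a minimal sketch, not the repository's actual workflow. It assumes the IAM role from #61 can be assumed via GitHub OIDC, and the variable names `AWS_ROLE_ARN` and `AWS_REGION` and the `infra` directory are hypothetical placeholders.

```yaml
name: Deploy

on:
  push:
    branches: [main]     # merging a PR into main deploys to staging
  workflow_dispatch:     # manual run promotes the deployment to production

permissions:
  id-token: write        # lets the job request an OIDC token (assumes the #61 role trusts GitHub OIDC)
  contents: read

jobs:
  deploy:
    runs-on: ubuntu-latest
    # Select the GHA environment from the trigger type.
    environment: ${{ github.event_name == 'workflow_dispatch' && 'production' || 'staging' }}
    steps:
      - uses: actions/checkout@v4

      - name: Authenticate to AWS with the IAM role
        uses: aws-actions/configure-aws-credentials@v4
        with:
          role-to-assume: ${{ vars.AWS_ROLE_ARN }}   # hypothetical variable name
          aws-region: ${{ vars.AWS_REGION }}         # hypothetical variable name

      - uses: opentofu/setup-opentofu@v1

      - name: Apply infrastructure
        working-directory: infra                     # hypothetical path
        run: |
          tofu init -input=false
          tofu apply -auto-approve -input=false
```

Image builds and the `docker compose up -d` step on the EC2 instance are left out here, since how the workflow reaches the instance is not settled in this issue. Defining the variables per GHA environment would let staging and production use different values without changing the workflow.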

leej3 commented 3 weeks ago

Sounds good. Some notes.

> it seems it has not been run yet

Correct. A placeholder more than anything. You can safely ignore it.

> We should make sure we build and push the Docker image

There is now a base image to reduce redundant layers. That should be pushed too.

> but also for the dashboard

At the moment I resorted to a hack and copied some data into the dashboard container. Some data-caching strategy should be used here to avoid downloading from the DB on each redeployment. The data is not sensitive, so I think a reasonable strategy might be to use a GitHub cache, copy it to the deployed instance, and then mount the data into the dashboard container using a Docker bind mount. Issues that arise when the data schema changes in a backward-incompatible way should also be addressed. It might be that the data could be populated in a non-blocking way.
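
As a rough sketch of the bind-mount part of this idea, a Compose fragment could look like the following. The image name, the host directory `./dashboard_data`, and the container path `/opt/data` are all hypothetical. How the data reaches the instance (for example from a GitHub cache) is not shown.

```yaml
services:
  dashboard:
    image: dashboard:latest           # hypothetical image name
    volumes:
      # Bind mount: host directory on the EC2 instance -> path inside the container.
      # Read-only, assuming the dashboard only reads the cached data.
      - ./dashboard_data:/opt/data:ro
```

With a mount like this, refreshing the data would only require replacing the files on the host, not rebuilding the dashboard image.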

marcelovilla commented 3 weeks ago

@leej3 thanks for the notes.

> There is now a base image to reduce redundant layers. That should be pushed too.

It seems the base image is used to build both the API and dashboard images. Instead of pushing it too, I suggest we write our deploy workflow so that we build it locally, build the API and dashboard images on top of it, and then push the latter two.
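
A minimal sketch of what that step could look like in the workflow is shown below. The Dockerfile paths, the local tag `osm_base`, and the `IMAGE_REGISTRY` variable are hypothetical. It also assumes that the API and dashboard Dockerfiles reference the base image by that local tag in their `FROM` line, and that the job has already logged in to the registry.

```yaml
      - name: Build base locally, then build and push API and dashboard
        env:
          REGISTRY: ${{ vars.IMAGE_REGISTRY }}   # hypothetical variable name
          TAG: ${{ github.sha }}
        run: |
          # The base image stays on the runner and is never pushed.
          docker build -t osm_base -f docker_images/base/Dockerfile .
          # These builds resolve "FROM osm_base" from the runner's local image store.
          docker build -t "$REGISTRY/api:$TAG" -f docker_images/api/Dockerfile .
          docker build -t "$REGISTRY/dashboard:$TAG" -f docker_images/dashboard/Dockerfile .
          # Only the two application images are pushed.
          docker push "$REGISTRY/api:$TAG"
          docker push "$REGISTRY/dashboard:$TAG"
```

One caveat: this relies on the default Docker builder on the runner. A buildx builder using the `docker-container` driver would not see the locally built base image.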

> At the moment I resorted to a hack and copied some data into the dashboard container. Some data-caching strategy should be used here to avoid downloading from the DB on each redeployment. The data is not sensitive, so I think a reasonable strategy might be to use a GitHub cache, copy it to the deployed instance, and then mount the data into the dashboard container using a Docker bind mount.

We'll explore what a good way of accomplishing this would be but I think your suggestion should work fine. Out of curiosity, why do we need to have data available in the container if it's already stored in the DB? Couldn't we query the DB on the fly from the application itself? Is it because it's a lot of data?

leej3 commented 3 weeks ago

> Couldn't we query the DB on the fly from the application itself? Is it because it's a lot of data?

Good point. For local development it was taking about 3 mins, and lots of unnecessary internet usage. But if the download is happening on an EC2 instance it might be that the download happens quickly enough to not worry about caching. Try it first with no cache...

> I suggest we write our deploy workflow so that we build it locally, build the API and dashboard images on top of it, and then push the latter two.

Sounds good.

nimh-dsst / osm

Configure GHA workflow to deploy infrastructure and application to each environment #64