azavea / noaa-flood-mapping

NOAA Flood Inundation Mapping

Evaluate GitHub Actions for container image publishing and Terraform execution #76

Closed hectcastro closed 4 years ago

hectcastro commented 4 years ago

Can we safely use GitHub Actions to build container images, publish them to ECR, and run Terraform from within GitHub Actions?

colekettler commented 4 years ago

@hectcastro I took a pretty deep dive into this. Setting up container builds and publishing is simple enough. Terraform is a lot more complicated.

Unfortunately there's no easy, fire-and-forget solution. It's possible, but all of the potential approaches involve some changes to how we handle deployment, either at the scripting or infrastructural level. Here are the options I've considered, grouped by rough categories:


Repository / Actions Settings

Make build logs private

We evaluated this approach in azavea/operations#425 in the context of CircleCI.

This is not an option for Actions at this time. We can't make only the build logs private while keeping the repository public. As such, this isn't useful as a general approach and I recommend against it.

Terraform Settings

Indicate all sensitive values

We evaluated this approach in azavea/operations#425. There have not been significant developments in this area since.

At this time, this has to happen at the provider level and is not always consistent. There is an open issue to give consumers a way to indicate that variables are sensitive, but that feature has no definite timeline. We should revisit this later on.

Masking Values

Mask values in output with tfmask

Source: https://github.com/cloudposse/tfmask

Pros:

Cons:

Recommendation: No. The potential for human error is too high and we wouldn't have a good way to detect if we've leaked something.
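
To make the human-error concern concrete, here is a minimal sketch of regex-based masking in general. It is not tfmask's implementation; the patterns, function names, and sample lines are all hypothetical, and the line format is only loosely modeled on Terraform plan output.

```python
import re

# Hypothetical pattern: attribute names someone remembered to treat as sensitive.
SENSITIVE_KEYS = re.compile(r"(?i)(password|secret|token)")

# Matches simple `key = "value"` lines, loosely like Terraform plan output.
ASSIGNMENT = re.compile(r'^(\s*[+~-]?\s*)([\w.]+)(\s*=\s*)"(.*)"\s*$')


def mask_line(line, mask_char="*"):
    """Mask the quoted value if the attribute name matches SENSITIVE_KEYS."""
    match = ASSIGNMENT.match(line)
    if match and SENSITIVE_KEYS.search(match.group(2)):
        prefix, key, equals, value = match.groups()
        return f'{prefix}{key}{equals}"{mask_char * len(value)}"'
    return line


def mask_output(text):
    """Apply mask_line to every line of a block of output."""
    return "\n".join(mask_line(line) for line in text.splitlines())
```

In this sketch a line such as `+ db_password = "hunter2"` is masked, but `+ connection_string = "postgres://user:pw@host/db"` passes through untouched because nobody added that attribute name to the pattern, and nothing in the log signals the miss.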

Mask values in output with terrahelp

Source: https://github.com/opencredo/terrahelp/tree/master/examples/mask

Pros:

Cons:

Recommendation: Probably no. This is better than the regex option, but losing that much non-sensitive information from our build logs could make it much more time-consuming and awkward to investigate failed deploys.

Mask values in output with GitHub Actions masking

Source: https://docs.github.com/en/actions/reference/workflow-commands-for-github-actions#masking-a-value-in-log

Pros:

Cons:

Recommendation: No. Creates too much duplication and potential for either human or workflow errors.
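
For reference, a rough sketch of what this would look like from a workflow step, assuming the sensitive values are available as environment variables. The `::add-mask::` workflow command is the one described in the linked docs; the helper and the variable names are hypothetical.

```python
import os


def add_mask_commands(names, environ=None):
    """Build one `::add-mask::` workflow command per non-empty variable.

    `names` is a hand-maintained list of environment variable names that
    hold sensitive values. It has to be kept in sync with the Terraform
    configuration, which is the duplication concern.
    """
    env = os.environ if environ is None else environ
    commands = []
    for name in names:
        value = env.get(name)
        if value:
            commands.append(f"::add-mask::{value}")
    return commands


if __name__ == "__main__":
    # Hypothetical variable names; printing the commands to stdout is what
    # registers the masks with the Actions runner.
    for command in add_mask_commands(["TF_VAR_db_password", "TF_VAR_api_token"]):
        print(command)
```

Any sensitive value that is missing from the list, or that is only generated during the Terraform run, would still be printed unmasked.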

Secrets Management

Manage sensitive values with AWS Secrets Manager instead of Terraform

Pros:

Cons:

Recommendation: A begrudging no. I love this idea conceptually, it's a very correct and complete approach, but I don't think it's practical for us in the slightest.

Deployment Architecture

Split deployment and build jobs

Pros:

Cons:

Recommendation: No. I don't think GitHub Actions offers enough distinguishing value to justify using it for the easy parts but still leaning on private CI for the more sensitive parts.

Redirect Terraform log output to S3

Pros:

Cons:

Recommendation: Maybe. It's a slight inconvenience, but I think we can make it less annoying by outputting S3 links like we do with ECS tasks in django-ecsmanage. There's room to smooth over the disconnect between workflow logs and Terraform logs.
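
A rough sketch of how that could work, assuming the workflow has the `terraform` and `aws` CLIs available and write access to a private bucket. The bucket name, key layout, and function names are hypothetical, and the `runner` parameter exists only so the logic can be exercised without running either tool.

```python
import subprocess
import tempfile


def log_location(bucket, run_id, command):
    """Build the S3 URI for one Terraform command's log (hypothetical layout)."""
    return f"s3://{bucket}/terraform-logs/{run_id}/{command}.log"


def run_and_upload(bucket, run_id, terraform_args, runner=subprocess.run):
    """Run Terraform with output sent to a temp file, then upload it to S3.

    Only the S3 location is printed, so the workflow log never contains
    the plan or apply output itself.
    """
    destination = log_location(bucket, run_id, terraform_args[0])
    with tempfile.NamedTemporaryFile("w+", suffix=".log") as log:
        # Send stdout and stderr to the file instead of the build log.
        result = runner(
            ["terraform", *terraform_args],
            stdout=log,
            stderr=subprocess.STDOUT,
        )
        # Upload even when Terraform fails, since that is when the log matters.
        runner(
            ["aws", "s3", "cp", log.name, destination],
            check=True,
            stdout=subprocess.DEVNULL,
        )
    print(f"Terraform {terraform_args[0]} output: {destination}")
    return result.returncode
```

A workflow step could then call something like `run_and_upload("example-private-bucket", run_id, ["plan", "-no-color", "-input=false"])` and exit with the returned code, leaving access to the actual output governed by the bucket's permissions.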


All told, if I had to suggest one of these, I'd suggest redirecting plan and apply output to S3. This gives us a layer of access control that we're used to working with and allows us to script around some of its inconvenience if we want.

Barring that approach, I'd recommend we stick with private CI until there are significant developments in either Terraform or GitHub Actions to cover this use case.

colekettler commented 4 years ago

I also briefly looked into Terraform Cloud in case it offered any features that would help, but since it largely functions as a remote backend, I think we would run into the same problems.

hectcastro commented 4 years ago

Nice write-up. The in-context pros/cons and follow-up recommendation made absorbing all of the context relatively lightweight.

Some quick comments on the various solutions:

In this case, it seems like leaning on Jenkins (probably the RF instance) would be the easiest approach with the least compromise. While CodeBuild would provide private CI too, I don't think it is worth engaging with its distinct quirks (relative to Jenkins).

Are you OK with that outcome?

colekettler commented 4 years ago

Thank you, glad to hear it!

Yeah, using Jenkins sounds like a good outcome to me. I'd rather hold out for upstream changes to make this possible without such significant compromises.

Good to close this one out?

hectcastro commented 4 years ago

👍

hectcastro commented 4 years ago

Please also open a new issue so that we can actually wire up Jenkins next sprint.

colekettler commented 4 years ago

I created #80 and pulled it into this sprint to keep it ahead of #64. I doubt we'll get to it but that ordering makes the most sense to me.