manheim / terraform-pipeline

A reusable pipeline library to apply terraform configuration serially across multiple environments, using Jenkins and a Jenkinsfile.
Apache License 2.0

I want to be able to track/manage configuration changes. #308

Open kmanning opened 4 years ago

kmanning commented 4 years ago

Goals:

kmanning commented 4 years ago

Hypothetical Solution if we were using ParameterStore:

  1. My configuration has a mix of sensitive and non-sensitive data. Example:
    • /MyTeam/MyApp/environment/DEPENDENCY_URL=nonsensitiveUrl,
    • /MyTeam/MyApp/environment/DEPENDENCY_PASSWORD=supersecret
  2. Let's say we create a "staging" path that's separate from what the app uses. Example: /Staging/MyTeam/MyApp/environment/*
  3. Let's say my developers only have read/write access to /Staging/MyTeam/*, and DO NOT have access to what the app uses /MyTeam/*
  4. My devs add/remove/update whatever they want in /Staging/MyTeam
  5. I deploy my app to environment
  6. My deployment code diffs /Staging/MyTeam/MyApp/environment and /MyTeam/MyApp/environment
  7. On deployment, before moving forward, it prompts the user "Here are the config changes that will be applied to this environment - do you approve?"
  8. SecureString values are not displayed: it only shows you that a parameter exists and that it differs, not its value. For non-SecureStrings, it shows you the before/after in plain text.
  9. You say yes. Your infrastructure, which has access to /MyTeam/MyApp/environment/*, applies the diff it found against /Staging/MyTeam/MyApp/environment/, then proceeds with the deployment, which reads from `/MyTeam/MyApp/environment/`

We could even optimize it. If no differences in config are found, it says so, and proceeds without asking you to review anything.
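
The diff-and-mask behavior in steps 6 through 8 can be sketched in a few lines. This is a minimal illustration in Python, not terraform-pipeline code: `Param`, `describe`, and `diff_params` are hypothetical names, and the parameters are plain in-memory dictionaries rather than real ParameterStore lookups.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class Param:
    """Hypothetical stand-in for one parameter; secure=True models a SecureString."""
    value: str
    secure: bool = False


def describe(key: str, old: Optional[Param], new: Optional[Param]) -> str:
    """Format one difference, hiding values if either side is a SecureString."""
    hidden = any(p is not None and p.secure for p in (old, new))
    if old is None:
        return f"+ {key}" + (" (value hidden)" if hidden else f" = {new.value}")
    if new is None:
        return f"- {key}" + (" (value hidden)" if hidden else f" (was {old.value})")
    return f"~ {key}" + (" (value hidden)" if hidden else f": {old.value} -> {new.value}")


def diff_params(live: Dict[str, Param], staged: Dict[str, Param]) -> List[str]:
    """Return one line per key that was added, removed, or changed in staged."""
    lines = []
    for key in sorted(set(live) | set(staged)):
        old, new = live.get(key), staged.get(key)
        if old != new:  # unchanged keys are not reported
            lines.append(describe(key, old, new))
    return lines
```

An empty result corresponds to the optimization above: there is nothing to review, so the deployment could proceed without a prompt.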

kemper commented 4 years ago

A related goal is to connect a git-based change management process. For instance, we have workflows where a developer creates a PR, it's reviewed, and then merged. The git commits are the hooks that result in our continuous integration / continuous deployment. So, a concrete example where our workflow is broken is that we can make changes in parameter store and forget to manually generate a new pipeline/deployment/change. It would be nice if there was a way to have git drive all change control.

kemper commented 4 years ago

So, a potential addition to the original proposed steps would be to add an identifier to the staging path, so that it's something like /MyTeam-Staging/change-id/MyApp/environment/*, and I can refer to change-id as part of a PR in source control so that it's clear that I wanted exactly my changes applied.

kemper commented 4 years ago

Having a change-id also improves a workflow where I have to make 2 changes to an app: I have a way of making those changes without my parameter store changes for the second change potentially getting picked up unintentionally when I'm deploying the first change.
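
To make the isolation concrete, here is a small Python sketch of how a change-id segment in the staging path keeps two in-flight changes apart. The function names and the in-memory `store` dictionary are hypothetical; only the path layout comes from the comment above.

```python
from typing import Dict


def staging_prefix(team: str, change_id: str, app: str, environment: str) -> str:
    """Build the staging path for one change, following the layout proposed above."""
    return f"/{team}-Staging/{change_id}/{app}/{environment}/"


def params_for_change(store: Dict[str, str], prefix: str) -> Dict[str, str]:
    """Select only the parameters under prefix, keyed by their name relative to it."""
    return {
        name[len(prefix):]: value
        for name, value in store.items()
        if name.startswith(prefix)
    }
```

Deploying the first change would then read only from its own prefix, so parameters staged under a second change-id are never picked up by accident.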

kmanning commented 4 years ago

"So, as a potential addition to the original proposed steps would be to add an identifier to the path for the staging path so that it's something like /MyTeam-Staging/change-id/MyApp/environment/* and I can refer to change-id as part of a PR in source control so that it's clear that I wanted exactly my changes applied."

For terraform-pipeline, there are typically multiple environments. Would this change-id be specific to an environment? My thinking is that it would NEED to be environment-specific. To step away from terraform-pipeline, it would make sense that I would want to make changes to my non-production environment, and a similar change would not be applicable to production. Do you have thoughts in that regard?

kmanning commented 4 years ago

"Having a change-id also improves a workflow where if I have to make 2 changes to an app. I have a way of making those changes without my parameter store changes for the second change potentially getting picked up unintentionally when I'm deploying the first change."

Conceptually this makes sense. I'm at a loss as to how to map the concept to a concrete implementation. Do you have particular implementations in mind? To take a very naive stab at an implementation, to clarify my confusion:

// Jenkinsfile
@Library(['terraform-pipeline@v5.0']) _

Jenkinsfile.init(this)

// Enable ParameterStoreBuildWrapperPlugin, using Change-Id Sets
ParameterStoreBuildWrapperPlugin.withChangeIds().init()

def validate = new TerraformValidateStage()

// How would I know what change-id QA should use?  How do I know what the previous change-id was for QA?  What does a change-id map to in ParameterStore?
// Maybe I look up ParameterStore /<Org>/<repo>/<environment>/CurrentChangeId -> this would have the "current change-id", whatever that is.  Maybe the CurrentChangeId is the root of another ParameterStorePath?
// Now that I have my change-id for QA, look up all the parameters at that path, and inject all those values in QA.
def deployQA = new TerraformEnvironmentStage('qa')
// Repeat for UAT
def deployUat = new TerraformEnvironmentStage('uat')
// Repeat for Prod
def deployProd = new TerraformEnvironmentStage('prod')

validate.then(deployQA)
        .then(deployUat)
        .then(deployProd)
        .build()

This is a possible implementation - is it a "good" implementation...? 😆 , I think maybe there's something better?

Also, I wrote this Issue as though ParameterStore were the desired implementation - which it may not be. You mentioned using git to facilitate the change-ids. If that were the case, I think I would maybe design a new/different plugin.

kmanning commented 4 years ago

As an aside, I'm less inclined to support this feature through a GitHub-based implementation. Configuration by nature can include sensitive data, like passwords. I'm less inclined to make it easy to store sensitive data for environment deployments in GitHub. <-- that's not a straight-up "No," but I'm far more inclined to prioritize implementations for storage backends that are designed to safely store sensitive data (like ParameterStore, or others).

The great thing about plugins is that multiple implementations are possible, and plugins for terraform-pipeline can be written and used and don't even have to be built directly into the terraform-pipeline library itself.

kemper commented 4 years ago

Yeah, I spent some time chewing on how git could still be a part of it, and I don't know that I have a good solution yet. Functionally, this reminds me of database migrations: if a project had a sequential list of configuration changes that have been applied, you would know when new ones need to be applied. So, thinking of it in a terraform fashion, you could have a module that allows someone to append new changeset IDs to a list, and then the final list of known changesets that have been applied would be stored in the terraform state. So, I'm not saying any actual config is in git, just an identifier that can link back to a collection of changes in parameter store (or hypothetically some other secure source, depending on a plugin).
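
The migration analogy can be sketched as follows, in Python for illustration only. `pending_changesets` is a hypothetical helper, and the rule that the applied history must be a prefix of the declared list is an assumption borrowed from how database migration tools usually behave, not something stated in this thread.

```python
from typing import List


def pending_changesets(declared: List[str], applied: List[str]) -> List[str]:
    """Return the declared changeset ids that still need to be applied, in order.

    declared: the append-only list kept in source control.
    applied:  the list recorded after previous deployments (for example in state).
    """
    # Assumption: history is append-only, so applied must match the start of declared.
    if declared[:len(applied)] != applied:
        raise ValueError("applied history diverges from the declared changeset list")
    return declared[len(applied):]
```

Each pending id would then point at a collection of changes in the secure store, so git carries only identifiers and never the configuration values themselves.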

kmanning commented 4 years ago

If we treat a git implementation separately, and focus on a ParameterStore implementation, here's a suggested possible implementation:

// Jenkinsfile
@Library(['terraform-pipeline@v5.0']) _

Jenkinsfile.init(this)

// Enable ParameterStoreBuildWrapperPlugin, using "staged" configuration in ParameterStore
ParameterStoreBuildWrapperPlugin.withStagedConfiguration().init()

def validate = new TerraformValidateStage()

// Look up the "staged" configuration, and by default, use the following convention: /StagedParameters/<Org>/<Repo>/qa
// Diff the "staged" path against the "application" path /<Org>/<Repo>/qa
// If there are differences, prompt the user "Here are the differences, you good?"
//     Any keys that have not changed would be ignored
//     Any keys that are new would be marked as such
//     Any keys that were removed would be marked as such
//     Any keys that were updated would be marked as such
//         Any SecureStrings that differ would display their key but not the values
//         Any non-SecureString would display old/new values in plain text
// If the user "approves", update /<Org>/<Repo>/qa to mirror the exact values of /StagedParameters/<Org>/<Repo>/qa
// If the user "rejects", do nothing, failing the job
// This setup assumes that some human has read/write access to /StagedParameters/<Org>/<Repo>/*, and no read/write access to /<Org>/<Repo>/* where terraform-pipeline will read from to do the deployment.  It assumes the platform running terraform-pipeline has read access to /StagedParameters/<Org>/<Repo>/*, and read/write access to /<Org>/<Repo>/*
def deployQA = new TerraformEnvironmentStage('qa')
// Repeat for UAT
def deployUat = new TerraformEnvironmentStage('uat')
// Repeat for Prod
def deployProd = new TerraformEnvironmentStage('prod')

validate.then(deployQA)
        .then(deployUat)
        .then(deployProd)
        .build()
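
The "approve" branch in the comments above (make the application path mirror the staged path exactly) can be sketched like this. It is a Python illustration with hypothetical names (`plan_mirror`, `apply_plan`) operating on in-memory dictionaries; a real plugin would issue ParameterStore writes and deletes instead.

```python
from typing import Dict, List, Tuple


def plan_mirror(live: Dict[str, str], staged: Dict[str, str]) -> Tuple[Dict[str, str], List[str]]:
    """Compute the writes and deletes that make live identical to staged."""
    # Keys that are new or whose value differs must be written.
    puts = {k: v for k, v in staged.items() if k not in live or live[k] != v}
    # Keys that exist only in live must be removed.
    deletes = sorted(k for k in live if k not in staged)
    return puts, deletes


def apply_plan(live: Dict[str, str], puts: Dict[str, str], deletes: List[str]) -> Dict[str, str]:
    """Return a new mapping with the plan applied; the input is left untouched."""
    result = dict(live)
    result.update(puts)
    for key in deletes:
        del result[key]
    return result
```

Separating the plan from its application means the same plan can be printed for review first and applied only after approval.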

All the presumed defaults (like /StagedParameters/<Org>/<Repo>, etc.) would simply be defaults that could optionally be overridden on the plugin itself. E.g.:

ParameterStoreBuildWrapperPlugin.withStagedConfiguration()
                                .withStagePathPattern { options -> "/MyTeam/${options['repoName']}/MyCustomStage/${options['environments']}" }
                                .init()

kemper commented 4 years ago

That sounds good to me. Strictly speaking, if you consider the /StagedParameters/ path to be like a master branch in git, then the other features around branching and PRs could be something that precedes parameters getting integrated into the staging path.

kmanning commented 4 years ago

I'm gonna cut scope, to shrink the MVP even more. By default, it wouldn't even prompt you. All the above just happens, and gets printed to the output. If you wanted to be prompted, you could flip that on.

ParameterStoreBuildWrapperPlugin.withStagedConfiguration()
                                .confirmOnConfigurationDifferences(<true or false>)
                                .init()

jantman commented 4 years ago

I'm not quite sure why I didn't think of this yesterday, but I guess better late than never...

Why not just store all of your configuration in git, maybe as JSON or YAML, and then have your pipeline deploy the configuration to Parameter Store the same way it would deploy... anything else? And, as a result, also have diffs and PRs for your configuration just like anything else?

For regular "String" types in PS, this would work fine. It's not sensitive, it's just a cleartext string, so it can go in a repo.

For SecureString types, store the ciphertext in git[1]. Parameter Store has options to retrieve SecureString parameters "without decryption" (i.e. retrieve the ciphertext), so a diff could be accomplished easily. All you'd really need is a helper tool for people to use to update the config file, which would take a given string and KMS encrypt it, and put the result in the config file.

Note I haven't tried this process, but as far as I can tell, it should work...

[1] Note that PS encrypts with a symmetric key. I don't see any information on what cipher is used, but this is all assuming that for whatever secrets you have, PS/KMS is considered secure enough for storing the ciphertext in git.
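
For the plain String case described above, a rough Python sketch of turning a config file kept in git into Parameter Store writes might look like this. The flat JSON layout and the `put_requests` helper are assumptions made for illustration. The dictionary keys follow the argument names of the AWS SSM PutParameter call (`Name`, `Value`, `Type`, `Overwrite`), but nothing here talks to AWS, and SecureString handling is deliberately left out because that part of the idea is untested.

```python
import json
from typing import Dict, List


def put_requests(config_text: str, prefix: str) -> List[Dict[str, object]]:
    """Translate a flat JSON object of non-sensitive settings into PutParameter arguments.

    config_text: contents of the (hypothetical) config file committed to git.
    prefix:      parameter path for the target environment, without a trailing slash.
    """
    config = json.loads(config_text)
    requests = []
    for key, value in sorted(config.items()):
        if not isinstance(value, str):
            raise ValueError(f"{key}: only string values are supported in this sketch")
        requests.append({
            "Name": f"{prefix}/{key}",
            "Value": value,
            "Type": "String",   # cleartext parameters only
            "Overwrite": True,  # replace whatever is currently stored
        })
    return requests
```

Because the file lives in git, a change to any of these values shows up as an ordinary diff in a PR before the pipeline ever writes it.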

kemper commented 4 years ago

@jantman's idea appears to exist in other projects. Here is one example: https://github.com/ukayani/kms-env

Or a more general article: https://medium.com/faun/aws-kms-encrypt-decrypt-environment-variables-497527e1c8cf

I hadn't considered dynamically encrypting and decrypting configuration but this concept ties really well into some other patterns I've considered for managing secrets. For instance with this pattern parameter store may not be needed anymore for many use cases.

jantman commented 4 years ago

"For instance with this pattern parameter store may not be needed anymore for many use cases."

FWIW, I'm still suggesting using Parameter Store with this. I suppose that if you're running all your own code (including what's reading the params), then just using a flat file and KMS could work. However, one benefit of using a flat file and feeding it into PS is that it still works with other AWS services or third-party tools that support PS, and it still allows you to scope GetParameter permissions to the environment level.

kemper commented 4 years ago

Yeah, sorry, I was going off-topic/out-of-scope with that. I agree with everything you said; parameter store has advantages.