hashicorp / terraform-google-vault

A Terraform Module for how to run Vault on Google Cloud using Terraform and Packer
Apache License 2.0

Reuse modules for other cloud providers #18

Open madmod opened 6 years ago

madmod commented 6 years ago

This is an open-ended meta-issue. If there is a better place to discuss this, please let me know.

I notice that there are several different Vault Terraform projects for AWS, GCP, and Azure. (And others which HashiCorp has collaborated on, like https://github.com/GoogleCloudPlatform/terraform-google-vault.) I haven't done a thorough comparison, but I have noticed small differences in the common modules' inputs and outputs (besides the obvious ones like S3 vs GCS buckets). There is also duplicated effort in keeping up with Vault config changes, and some nice enhancements to common code, like building the Vault config in pieces with more options in the AWS run-vault module, are absent from the Google repo.

Maybe it would be worthwhile to share this code and abstract out the platform-specific portions (like calling gsutil vs aws s3) into separate modules or scripts? Sharing the code for common modules like run-vault in a single repo, and relying on the Packer templates to abstract away cloud-provider- or distribution-specific differences, sounds like a good pattern.
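To make the suggestion concrete, here is a minimal, hypothetical sketch of how a shared script could dispatch to `gsutil` or `aws s3` behind one interface. The `CLOUD_PROVIDER` variable and the `storage_copy` name are illustrative, not taken from either repo:

```shell
#!/usr/bin/env bash
# Hypothetical sketch: shared scripts call a provider-neutral helper instead
# of gsutil or aws directly, and a per-cloud setting (baked into the Packer
# image, or a sourced shim file) selects the implementation.

storage_copy() {
  local -r src="$1"
  local -r dest="$2"

  case "${CLOUD_PROVIDER:-}" in
    gcp) gsutil cp "$src" "$dest" ;;
    aws) aws s3 cp "$src" "$dest" ;;
    *)
      echo "ERROR: unsupported CLOUD_PROVIDER '${CLOUD_PROVIDER:-}'" >&2
      return 1
      ;;
  esac
}

# Example usage: the shared code stays identical across clouds; only the
# environment differs per image.
# CLOUD_PROVIDER=gcp storage_copy ./vault.hcl gs://my-bucket/vault.hcl
```

The trade-off josh-padnick and brikis98 raise below applies here too: every such dispatch point is one more place where provider differences can leak through the abstraction.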

josh-padnick commented 6 years ago

You raise a valid point. Even while building these modules we noticed a lot of copy & pasting going on. A central repo is a compelling idea, but to be honest we haven't felt the pain of duplication enough to motivate setting one up. I also generally worry about the increase in complexity this would result in, and the new edge cases we might introduce.

As we ramp up our usage of the GCP modules, this calculus may change. I'd welcome other perspectives on this. @brikis98 Any thoughts?

brikis98 commented 6 years ago

Yea, we scratched our heads about this one quite a bit as well. On the one hand, we want to keep code DRY. On the other, there are non-trivial differences between the cloud providers, and striving to make things too DRY may lead to a poor UX, increased complexity, and bugs.

If we look into it, we can break the code down into a few categories:

  1. Terraform code. Most of this is 100% cloud-specific and cannot be shared.
  2. Generic bash functions. All the bash scripts in these repos have copy/pasted methods such as assert_not_empty and replace_in_file. We've already built a common solution for these in https://github.com/gruntwork-io/bash-commons/ and just need to migrate these repos to use it.
  3. Other bash code. Of the remaining bash code, some pieces will be reusable (e.g., the basic install steps, the basic config generation) but some is cloud specific (e.g., how to auto discover other nodes). I don't have a sense of how much of the code is reusable. If it's enough, it may be worth extracting into a separate repo. However, it would have to be a considerable amount to make it worth the extra hassle of having to check out and version a totally separate repo for this stuff.
  4. Test code. Similar to the bash code, there are pieces that will be reusable and pieces that won't be. We'd have to do the same arithmetic to figure out if it's worth extracting pieces.
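For reference, the generic helpers mentioned in item 2 are small enough to sketch. The following is a simplified approximation, not the actual bash-commons code (the real implementations differ in logging, option handling, and portability), of what `assert_not_empty` and `replace_in_file` typically look like when copy/pasted across these repos:

```shell
#!/usr/bin/env bash
# Simplified sketches of the duplicated generic bash helpers; see
# gruntwork-io/bash-commons for the maintained versions.

# Fail if a required argument was not supplied.
assert_not_empty() {
  local -r arg_name="$1"
  local -r arg_value="$2"

  if [[ -z "$arg_value" ]]; then
    echo "ERROR: the value for '$arg_name' cannot be empty" >&2
    return 1
  fi
}

# Replace all matches of a regex in a file, in place.
# Note: sed -i without a suffix assumes GNU sed (Linux); BSD/macOS sed
# requires an explicit suffix argument.
replace_in_file() {
  local -r original_regex="$1"
  local -r replacement="$2"
  local -r file="$3"

  sed -i -e "s|$original_regex|$replacement|g" "$file"
}
```

Centralizing even just these two helpers removes one class of drift: a bug fix in the shared copy reaches every module at once instead of requiring four identical PRs.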