Open jameshadfield opened 1 year ago
Some background: I held off (for years) on introducing something like Terraform because it is additional complexity. But eventually I really couldn't justify holding off any more. I needed to set up testing resources in AWS that paralleled production resources and wanted their definitions to stay in sync and be version controlled so we could reason about them and correlate them with codebase changes, without having to investigate them every time. I introduced Terraform because it solved a real need, not because I just wanted to.
When I did introduce Terraform, I specifically set it up so we could adopt it incrementally over time, giving people more of a chance to learn it and not requiring a sudden change of practice. I wrote detailed docs about how to use it and walked interested folks thru doing so.
Going forward, I think it would be very valuable and prudent to adopt more of our AWS resources under the control of Terraform. To me this looks like existing resources being adopted as we make significant changes to them and new resources being added straight away. New resources are easier to add than existing resources and adding them straight away means less future work and an initial direction that's aligned with the long-term direction.
The biggest hurdle of using Terraform, from my perspective, is figuring out how to introduce it to a system that was previously uncontrolled. For nextstrain.org, we're already over that with the configuration framework I set up. Yes, Terraform is certainly additional complexity in its own right, but overall it saves more than it adds, esp. once the "down payment" of setup is already paid, and the ratio improves over time.
The benefits of Terraform (or other infrastructure control tools) are very much akin to the benefits of Git (or other version control tools). It's useful with even a single developer and is a game-changer with a team.
I agree with the benefits @tsibley mentioned and would like to expand Terraform coverage over time. I see how this is a shift from the previous status quo and might add some friction to getting things done; importantly, that friction should be minimal. Ideally there is an easy path for making changes to resources in the AWS console and then bringing those changes into Terraform. Some scenarios:
terraform plan
I put out a call for further discussion on Slack, and that's generated a little more commentary there.
We (@jameshadfield @joverlee521 @j23414 @huddlej @kimandrews @victorlin @tsibley) talked for 75 minutes on this topic during today's dev chat, but consensus on an expectation was not reached. To summarize, I think fairly, but correct me if wrong:
Folks see pros and cons. There's a fair bit of apathy/agnosticism. There's some desire to increase familiarity with Terraform more broadly across the team before either committing to or rejecting an expectation of its increased usage over time and incremental adoption. To aid that, I've offered to run another walk thru of nextstrain.org's current Terraform usage, as well as walk thru a new separate example of using Terraform to manage new GitHub Actions / AWS Batch integrations that I plan to write as part of our automation priority.
The outcome is to maintain the status quo where I'm continuing to incrementally (and slowly) manage more resources with Terraform as the team continues to gain familiarity, but without any expectation for anyone else to use it for new resources.
Personally, to be frank, it is exhausting to have to advocate for what would be basic best practice accepted (even expected!) in many other software shops. :/
@victorlin wrote:
- A resource is created on AWS console. This is covered by docs on import and seems straightforward, though I haven't had the chance to try it myself.
I just learned about Terraform's `import` blocks (not the `terraform import` command) in combination with `terraform plan -generate-config-out` to generate the initial Terraform resource definition based on a resource that already exists. So this makes that case easier as you don't have to start from scratch there. See also this importing tutorial.
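For anyone who hasn't seen it, here is a minimal sketch of that workflow. It assumes Terraform 1.5 or later (where `import` blocks were introduced) and uses a made-up S3 bucket: the resource address `aws_s3_bucket.example`, the bucket name, and both file names are placeholders for illustration, not anything from nextstrain.org's actual config.

```hcl
# imports.tf (hypothetical file name)
#
# Declares that an existing, manually created bucket should be adopted into
# Terraform state at the given resource address.
import {
  to = aws_s3_bucket.example   # hypothetical resource address
  id = "example-bucket-name"   # for aws_s3_bucket, the import ID is the bucket name
}

# Then, from the same directory:
#
#   terraform plan -generate-config-out=generated.tf
#
# This writes a draft resource block for aws_s3_bucket.example into
# generated.tf (the target file must not already exist). After reviewing and
# tidying that draft, `terraform apply` performs the actual import.
```

The generated definition is only a starting point and usually wants pruning of defaulted attributes before it's committed. Config generation was labelled experimental when it was introduced, so treat the output as a draft to review rather than something to merge blindly.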
as well as walk thru a new separate example of using Terraform to manage new GitHub Actions / AWS Batch integrations that I plan to write as part of our automation priority.
The beginning of that: https://github.com/nextstrain/infra/compare/main...dev
@tsibley states that these todos "need doing sooner [rather] than later":
And via this comment:
These changes were made in AWS as part of PR #719, which included documentation explaining the changes. I would favour a wider discussion regarding expanding our usage of Terraform, and how this expansion is going to be done (if indeed it is). Requiring any new AWS changes to be brought into Terraform when the changes are to AWS services (or entire classes of services) which are not part of our existing Terraform config is a change from previous development practices.