FairwindsOps / pentagon

A framework for building repeatable, containerized, cloud-based infrastructure as code with Kubernetes.
https://www.reactiveops.com
Apache License 2.0

Create migration for latest kops instance groups improvement #152

Closed sudermanjr closed 5 years ago

sudermanjr commented 5 years ago

As discussed in https://github.com/reactiveops/pentagon/pull/151, create a migration to go from a single kops instance group (IG) to one IG per availability zone (AZ).
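As a rough illustration of what such a migration would have to do to the instance group definitions, here is a minimal Python sketch. It is not Pentagon's migration code: `split_instance_group` and `split_evenly` are hypothetical helpers, the input is assumed to be a kops InstanceGroup manifest already parsed into a dict, and dividing `minSize`/`maxSize` evenly across subnets is an assumption rather than anything decided in this thread.

```python
import copy


def split_evenly(total, parts):
    """Split an integer into `parts` integers that sum to `total`."""
    base, extra = divmod(total, parts)
    return [base + (1 if i < extra else 0) for i in range(parts)]


def split_instance_group(ig):
    """Return one instance group per subnet listed in `ig` (hypothetical helper).

    Assumes `ig` is a parsed kops InstanceGroup manifest with
    metadata.name, spec.subnets, spec.minSize and spec.maxSize.
    The input dict is left unmodified.
    """
    subnets = ig["spec"]["subnets"]
    # Assumption: capacity is divided evenly so the cluster total is preserved.
    mins = split_evenly(ig["spec"]["minSize"], len(subnets))
    maxes = split_evenly(ig["spec"]["maxSize"], len(subnets))

    result = []
    for subnet, min_size, max_size in zip(subnets, mins, maxes):
        new_ig = copy.deepcopy(ig)
        new_ig["metadata"]["name"] = f"{ig['metadata']['name']}-{subnet}"
        new_ig["spec"]["subnets"] = [subnet]
        new_ig["spec"]["minSize"] = min_size
        new_ig["spec"]["maxSize"] = max_size
        result.append(new_ig)
    return result


# Example input; the names and sizes are made up for illustration.
example = {
    "kind": "InstanceGroup",
    "metadata": {"name": "nodes"},
    "spec": {
        "role": "Node",
        "minSize": 3,
        "maxSize": 10,
        "subnets": ["us-east-1a", "us-east-1b", "us-east-1c"],
    },
}

for group in split_instance_group(example):
    print(group["metadata"]["name"], group["spec"]["minSize"], group["spec"]["maxSize"])
```

Note that the sketch only produces the new definitions. It deliberately leaves the original group alone, so deciding when to remove it remains a separate, manual step.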

ejether commented 5 years ago

I'd also like to use this issue as a discussion point for what sorts of changes should also come with a migration:

Pentagon Migrations were intended as a way to make sure that project organization, file contents, and standards could easily be applied to existing projects so that all new changes can be 'immediately' adopted by existing projects and those projects can be pinned at a version.

So my current stance is that any change in the defaults (content or organization) should include a migration. Unless your project incorporates the latest standards, it can't be considered the latest version, can it? Upgrading the project may involve changes to the cluster or other cloud resources, and to me that is a natural result of upgrading the project version.

davekonopka commented 5 years ago

After looking at the linked PR, one thing that sticks out is that some migrations might need a transition state. We'd want to add the new IGs and leave the existing IG in place in some form, to be decommissioned intentionally.

ejether commented 5 years ago

That's an interesting point...
Creating an automated process for a Schrödinger's version sounds like it could be very challenging. With the current way migrations are done with git (commit to a new branch -> update cloud resources -> merge to master), is that enough of a transitional state? Thoughts on that?

davekonopka commented 5 years ago

Yeah, that would probably work. Or maybe in some cases preserving the existing code with comments flagging it for cleanup, and/or moving it to another file that can be cleaned up manually. I just wouldn't want to encourage situations where running apply without understanding it would cause major problems.

ejether commented 5 years ago

Agreed, we should not be encouraging anyone to run anything by rote. I hope no one would do that anyway, but I take your meaning. In the past, strongly worded log messages have been the safeguard around that. The code should be preserved in the other branch, so I'm not sure I'd want to do anything tricky with commenting in place, but migration instructions for something like this would be important.

For the migration in question here, the user would likely understand that applying this change incurs some risk and would apply the new instance groups carefully, but later might forget to delete the old instance group and/or forget to reconfigure the cluster autoscaler, and that wouldn't cause any obvious issues until later.
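The forgotten follow-up steps described above are the kind of thing a post-migration check could surface as strongly worded warnings. A minimal Python sketch, under stated assumptions: `find_leftovers` is a hypothetical helper (not part of Pentagon or kops), the instance groups are assumed to be kops InstanceGroup manifests parsed into dicts, and `autoscaler_groups` is assumed to be the set of instance group names the cluster autoscaler has been configured for, however a given project derives that.

```python
def find_leftovers(instance_groups, autoscaler_groups):
    """Return warning strings for follow-up steps that look unfinished.

    Hypothetical helper. Only groups with spec.role == "Node" are checked.
    """
    configured = set(autoscaler_groups)
    warnings = []
    for ig in instance_groups:
        spec = ig["spec"]
        if spec.get("role") != "Node":
            continue  # masters and other roles are out of scope here
        name = ig["metadata"]["name"]
        subnets = spec.get("subnets", [])
        if len(subnets) > 1:
            # Likely the pre-migration group that was never deleted.
            warnings.append(
                f"WARNING: instance group '{name}' still spans "
                f"{len(subnets)} subnets; delete it once the per-AZ groups are healthy"
            )
        elif name not in configured:
            # A per-AZ group the autoscaler does not know about.
            warnings.append(
                f"WARNING: instance group '{name}' is not configured "
                f"in the cluster autoscaler"
            )
    return warnings
```

A check like this only reports. It does not delete or reconfigure anything, which keeps the decommissioning step intentional.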

ejether commented 5 years ago

Follow-up to in-person discussion: migrations should be made when files change. There may still be times when a migration is not feasible, so discuss when in doubt.

ejether commented 5 years ago

Resolved by #158.