cloudfoundry / capi-ci

Kiki is not actually testing that migrations are backward compatible #49

Closed sethboyles closed 1 month ago

sethboyles commented 4 months ago

In this commit, the kiki apply_latest_migrations.sh script was switched from using proxychains to apply migrations to using bosh ssh.

This unfortunately defeats the purpose of the kiki environment. From capi-ci README.md:

Kiki starts with an older version of cf-deployment. It then runs the new migrations, but keeps the old cloud controller code. This catches any backwards-incompatible migrations. This is important because cloud controller instances do rolling upgrades. For example: if you write a migration that drops a table, old CC instances that depend on that table existing will crash during the rolling deploy.

The script previously used proxychains (or tsocks in the original commit) as a way of running migrations from the concourse VM's local version of capi-release (which is newer than what's deployed) against the foundation's database, so that newer migrations are run WHILE the foundation is still using code from an older version.

Using bosh ssh just runs the migrations that are already deployed to the VM (i.e. no new migrations are run). So since this change, kiki has not actually been testing that new migrations are backward compatible.
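
One way to see the gap is to compare the migration files on the VM with those in the newer capi-release checkout. A minimal sketch, assuming both migration directories are available locally (for example after copying the deployed one off the VM); the function name and directory arguments are hypothetical:

```shell
#!/bin/sh
set -eu

# Print migration files that exist in the candidate checkout but not in the
# deployed copy. Both directory arguments are hypothetical local paths.
pending_migrations() {
  deployed_dir=$1
  candidate_dir=$2
  for f in "$candidate_dir"/*.rb; do
    [ -e "$f" ] || continue            # glob matched nothing
    name=$(basename "$f")
    [ -e "$deployed_dir/$name" ] || echo "$name"
  done
}
```

If this prints anything, those are the migrations that running the deployed code via bosh ssh would never apply.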

What to do

Make the apply_latest_migrations script actually apply the latest migrations.

I think we have a few options:

  1. We can revert to using proxychains. This may be hard because we apparently can't edit /etc/hosts now that the task uses the registry-image resource instead of docker-image (which is how we stumbled upon this issue).

  2. Maybe we can look into using bosh scp to copy the new capi-release to the VM (perhaps into /tmp) and then using bosh ssh.

  3. Some other way?

philippthun commented 1 month ago

I've updated the apply_latest_migrations task.

What the script now does:

  1. Upload capi-release (release candidate) tarball with bosh scp.
  2. Unpack (parts of) the tarball (bosh ssh).
  3. Copy the db folder (with migrations and helpers subfolders) over to /var/vcap/packages/cloud_controller_ng (bosh ssh).
  4. Run bundle exec rake db:migrate (bosh ssh).
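
The four steps above could be sketched roughly as follows. This is not the actual task script: the deployment name, instance, tarball name, temporary paths, and the location of the db folder inside the unpacked tarball are all assumptions, and the environment setup that rake needs on the VM is omitted.

```shell
#!/bin/sh
set -eu

# All names below are illustrative assumptions, not values from the real task.
BOSH="${BOSH:-bosh}"                    # set BOSH="echo bosh" for a dry run
DEPLOYMENT="${DEPLOYMENT:-cf}"
INSTANCE="${INSTANCE:-api/0}"
TARBALL="${TARBALL:-capi-release.tgz}"
REMOTE_TMP="${REMOTE_TMP:-/tmp/capi-rc}"
# Hypothetical location of the db folder once the tarball is unpacked.
RC_DB_DIR="${RC_DB_DIR:-$REMOTE_TMP/db}"
CC_DIR=/var/vcap/packages/cloud_controller_ng

# Run one command on the VM. $BOSH is deliberately unquoted so that
# "echo bosh" splits into two words for the dry run.
remote() {
  $BOSH -d "$DEPLOYMENT" ssh "$INSTANCE" -c "$1"
}

apply_latest_migrations() {
  # 1. Upload the release candidate tarball.
  $BOSH -d "$DEPLOYMENT" scp "$TARBALL" "$INSTANCE:/tmp/capi-release.tgz"
  # 2. Unpack it on the VM.
  remote "mkdir -p $REMOTE_TMP && tar -xzf /tmp/capi-release.tgz -C $REMOTE_TMP"
  # 3. Copy the newer db folder over the deployed package.
  remote "sudo cp -r $RC_DB_DIR $CC_DIR/"
  # 4. Run the migrations with the old Cloud Controller code still in place.
  remote "cd $CC_DIR && bundle exec rake db:migrate"
}
```

Setting `BOSH="echo bosh"` before calling `apply_latest_migrations` prints the four commands instead of running them, which is a cheap way to review the sequence.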

I think as long as the first rule of writing migrations is followed - Do not use any Cloud Controller code - this approach might be sufficient.

philippthun commented 1 month ago

@sethboyles - Do you think we can close this issue, i.e. do you agree that my changes to apply_latest_migrations are "good enough"?

sethboyles commented 1 month ago

Thanks for bringing your change to my attention. Yes, I think that works. Thank you for making it!