bitwalker / distillery

Simplify deployments in Elixir with OTP releases!
MIT License

Upgrade errors can result in situations where full restarts are needed #559

Closed Qqwy closed 6 years ago

Qqwy commented 6 years ago

Steps to reproduce

I recently moved a module out into its own separate package. When I attempted to deploy, I got a 'Multiple defined modules' error. Similar cases exist (e.g. 'wrong version of ERTS') in which a release breaks the hot-upgrade process halfway through.

However, this was impossible to fix without a non-hot upgrade, because hot upgrades are not atomic: after copying the release.tar.gz, Distillery unpacks it and adds info to some configuration files. Removing the 'broken' release afterwards is virtually impossible, so one is forced to perform a non-hot upgrade instead.

Proposed fix

bitwalker commented 6 years ago

To be clear, Distillery doesn't do the unpacking, the release handler does. Upgrades are atomic up to a certain point, which in the relup is called point_of_no_return, after which failures will require a reboot of the VM to address. It is unlikely that we can do anything to improve this, at least outside of OTP itself.
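To illustrate the distinction, here is a toy Python model of "atomic up to point_of_no_return". It is only a sketch of the behaviour described above, not how the release handler is implemented; the function name, the step names, and the result strings are all made up for illustration.

```python
def apply_relup(instructions):
    """Run (name, action) steps and report how a failure would have to be handled.

    Toy model only: real relup instructions are Erlang terms that are
    interpreted by OTP's release handler.
    """
    past_point_of_no_return = False
    for name, action in instructions:
        if name == "point_of_no_return":
            # From here on, a failure can no longer simply be aborted.
            past_point_of_no_return = True
            continue
        try:
            action()
        except Exception:
            if past_point_of_no_return:
                return "restart required"
            return "aborted, old release still running"
    return "upgraded"


def ok():
    pass


def fail():
    raise RuntimeError("simulated failure")


# Hypothetical step names; only their order relative to the marker matters.
early_failure = apply_relup([
    ("load new code", fail),
    ("point_of_no_return", None),
    ("switch processes", ok),
])
late_failure = apply_relup([
    ("load new code", ok),
    ("point_of_no_return", None),
    ("switch processes", fail),
])
print(early_failure)
print(late_failure)
```

The second case is the one this issue is about: the failure happens after the marker, so the only way out is a restart.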

If you have some specific scenarios I can use to reproduce, we can definitely see if there are things Distillery can do to improve though.

As an aside, I'm not sure why you got the 'multiple defined modules' error when applying the upgrade. Distillery prevents you from building releases with conflicting modules, but perhaps there is another reason why that occurred. If you have a way for me to reproduce it, I'd like to investigate.

A few things to keep in mind:

Qqwy commented 6 years ago

So in this case, I:

In this case, I 'fixed' it by:

I did have to install these manually (rather than using a hot upgrade) because the error occurred after the 'point of no return' that @bitwalker mentioned.

The main problem that Distillery might be able to help with is this: when you bump the version number to ship a fix after building a version that failed, Distillery creates the .relup/.appup from the broken version to the new one, rather than from the last working version to the new one. (So if you have 1 and 2-broken and are building 3, Distillery's .relup/.appup will go from 2-broken to 3, whereas the application is still on 1 and therefore cannot get to 3.)

Actually, would there be a way to tell Distillery to create .relups/.appups for the last N versions, rather than only the last one? That would be useful not only in this scenario, but also when you deploy and then find some sort of logical regression (the deployment succeeds but your application has unwanted behaviour) that makes you roll back, because currently in those situations you also have to go through the 'bad' version to get to the new version.

bitwalker commented 6 years ago

Distillery won't know that 2 was bad in the 1/2/3 scenario, but you can tell it by passing the --upfrom=1 flag to mix release. You can use this flag to upgrade from any version V to version V+N.
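As a rough sketch of that workflow in Python: pick the newest built version that is not known to be bad and pass it as --upfrom. The helper names and the "known bad" bookkeeping are hypothetical (Distillery keeps no such list), and the sketch assumes the upgrade release is built with the --upgrade flag alongside --upfrom.

```python
def pick_upfrom(built_versions, bad_versions):
    """Return the newest already-built version that is not known to be bad.

    built_versions is ordered oldest to newest and excludes the version
    currently being built.
    """
    good = [v for v in built_versions if v not in bad_versions]
    if not good:
        raise ValueError("no known-good version to upgrade from")
    return good[-1]


def release_command(upfrom):
    # Assumption: upgrade releases are built with --upgrade plus --upfrom.
    return f"mix release --upgrade --upfrom={upfrom}"


# The 1 / 2-broken / 3 scenario: version 3 is being built, 2 never went live.
command = release_command(pick_upfrom(["1", "2"], bad_versions={"2"}))
print(command)
```

With version 2 marked as bad, the printed command builds version 3 as an upgrade from version 1, which is the version still running.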

In cases where your upgrade succeeds but has a bug, you can downgrade to the last version using bin/myapp downgrade <last version>. You can do this multiple times, effectively reversing the order in which upgrades were applied. The only restriction is that you can only downgrade to a version which has a downgrade path defined from the current version. So if you built an upgrade from 1 to 2, and from 2 to 3, you can downgrade from 3 to 2, but you can't downgrade from 3 to 1 directly; you have to go 3->2, then 2->1.
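The stepwise restriction can be sketched in Python as walking back one built upgrade at a time. This is a simplified model, assuming each version was built as an upgrade from exactly one earlier version; the function name and data layout are made up for illustration.

```python
def downgrade_steps(built_upgrades, current, target):
    """List the (from, to) downgrades needed to get from current back to target.

    built_upgrades is a list of (old, new) pairs, one per upgrade that was
    built. Each downgrade reverses exactly one of those pairs.
    """
    previous = {new: old for old, new in built_upgrades}
    steps = []
    version = current
    while version != target:
        if version not in previous:
            raise ValueError(f"no downgrade path from {version} to {target}")
        steps.append((version, previous[version]))
        version = previous[version]
    return steps


# Upgrades were built 1 -> 2 and 2 -> 3; we are on 3 and want to get back to 1.
steps = downgrade_steps([("1", "2"), ("2", "3")], current="3", target="1")
for _, to_version in steps:
    print(f"bin/myapp downgrade {to_version}")
```

Note that there is no ("3", "1") step in the result: the only way back to 1 is through 2.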

I think you are probably correct about why the duplicate module situation occurred, but I suspect it is specifically due to the fact that the same module was moved into a dependency, which would mean that planga_phoenix would get loaded before planga had a chance to be upgraded (and the old version of the module unloaded). This is a case where you'd want to perform an intermediate upgrade which either removes that module or renames it, and then do the upgrade which adds that module in the new library. It's certainly annoying, but it is one of the caveats of hot upgrades: you have to coordinate changes like these carefully to make sure they happen in a predictable fashion, not unlike database migrations. I doubt there is much Distillery can do in this case, but I'll look into it, because even if we just had the opportunity to warn about it, that'd save a lot of time.
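A warning like that could, in principle, come from comparing which application owns each module in the old and the new release. The Python sketch below shows the idea only; it is not Distillery code, and the module names under planga are invented for the example.

```python
def moved_modules(old_release, new_release):
    """Find modules owned by a different application in the new release.

    Both arguments map application name -> set of module names. The result
    maps each moved module to an (old_app, new_app) pair.
    """
    old_owner = {
        module: app
        for app, modules in old_release.items()
        for module in modules
    }
    moved = {}
    for app, modules in new_release.items():
        for module in modules:
            if module in old_owner and old_owner[module] != app:
                moved[module] = (old_owner[module], app)
    return moved


# Hypothetical module names; only the app names come from this thread.
old = {"planga": {"Planga.Chat", "Planga.Helper"}}
new = {
    "planga": {"Planga.Chat"},
    "planga_phoenix": {"Planga.Helper"},
}
conflicts = moved_modules(old, new)
for module, (old_app, new_app) in conflicts.items():
    print(f"warning: {module} moved from {old_app} to {new_app}")
```

Any module reported here is a candidate for the intermediate upgrade described above: remove or rename it in one release, then add it under the new application in the next.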

Qqwy commented 5 years ago

Note: I'd like to point out that when Phoenix recently switched from requiring :cowboy to requiring just :plug_cowboy, the same issue ('duplicate modules') happened again.