NixOS / nixops

NixOps is a tool for deploying to NixOS machines in a network or cloud.
https://nixos.org/nixops
GNU Lesser General Public License v3.0

Add auto rollback. #385

Open kevincox opened 8 years ago

kevincox commented 8 years ago

It would be super snazzy if nixops had an automatic rollback system, so that a bad config would cause the machine to roll back automatically rather than possibly becoming "bricked".

I imagine it would be something like a series of checks on a timer after every deploy. For example, if the checks don't all pass 60s after a deploy, a rollback would be triggered automatically.
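The timer idea can be sketched roughly as follows. This is Python for illustration only, not NixOps code: `wait_for_checks`, `checks`, and `rollback` are hypothetical names, the 60-second deadline is the figure from the comment above, and the 5-second polling interval is an assumption. It also assumes one possible reading of the proposal, namely that checks are retried until they all pass or the deadline expires.

```python
import time
from typing import Callable, Iterable


def _passes(check: Callable[[], bool]) -> bool:
    """Run one check; a check that raises counts as a failure."""
    try:
        return bool(check())
    except Exception:
        return False


def wait_for_checks(
    checks: Iterable[Callable[[], bool]],
    rollback: Callable[[], None],
    deadline_s: float = 60.0,   # from the issue text
    interval_s: float = 5.0,    # assumption: how often to retry
    clock: Callable[[], float] = time.monotonic,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Poll the checks until all pass or the deadline is reached.

    Returns True if every check passed in time. Otherwise calls
    rollback() exactly once and returns False.
    """
    checks = list(checks)
    start = clock()
    while True:
        if all(_passes(check) for check in checks):
            return True
        if clock() - start >= deadline_s:
            rollback()  # hypothetical hook; what it does is up to the caller
            return False
        sleep(interval_s)
```

The clock and sleep functions are parameters only so the loop can be exercised without waiting; a real caller would leave them at their defaults and supply its own `rollback` action.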

A nice default check to have would be SSH access, because if you can't get SSH access you can't deploy new configs. I don't know the best way to implement this, but a simple implementation would be to open a new SSH connection after the deploy and try to create a file; the checks would then assert that the file exists. Alternatively, nixops could stay connected to the nodes until the checks finish: do the SSH check while still logged in and, if it fails, use the existing deploy connection to roll back the machines.
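A minimal sketch of the "new connection, create a file, assert it exists" variant, again in Python for illustration rather than as NixOps code. The function names, the marker path under `/tmp`, and the timeouts are all assumptions; the only external pieces used are the standard OpenSSH client options `BatchMode` and `ConnectTimeout`.

```python
import subprocess
import uuid
from typing import Callable, List, Optional


def ssh_command(host: str, remote_cmd: str, connect_timeout_s: int = 10) -> List[str]:
    """Build an ssh invocation that never prompts and gives up quickly."""
    return [
        "ssh",
        "-o", "BatchMode=yes",                       # fail instead of asking for a password
        "-o", f"ConnectTimeout={connect_timeout_s}",
        host,
        remote_cmd,
    ]


def _run(argv: List[str]) -> int:
    """Run a command and return its exit status (255 if it could not run)."""
    try:
        return subprocess.run(
            argv,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            timeout=30,  # assumption: overall cap per command
        ).returncode
    except (OSError, subprocess.TimeoutExpired):
        return 255


def ssh_check(
    host: str,
    run: Callable[[List[str]], int] = _run,
    marker: Optional[str] = None,
) -> bool:
    """Open two fresh SSH connections: one creates a marker file,
    the other asserts that it exists (and cleans it up)."""
    marker = marker or f"/tmp/deploy-check-{uuid.uuid4().hex}"  # hypothetical path
    if run(ssh_command(host, f"touch {marker}")) != 0:
        return False
    return run(ssh_command(host, f"test -e {marker} && rm -f {marker}")) == 0
```

A rollback would be triggered only when `ssh_check` returns `False`. How that rollback is delivered, for example over the still-open deploy connection as suggested above, is deliberately left out of the sketch.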

domenkozar commented 7 years ago

Relevant: http://blog.arkency.com/2016/11/recovering-unbootable-nixos-instance-using-hetzner-rescue-mode/

rsynnest commented 6 years ago

I believe @FPtje worked on an automatic rollback system which is being used in production by @basvandijk at LumiGuide, as discussed in this great presentation at NixCon: https://youtu.be/J4DgATIjx9E?list=PLgknCdxP89ReQzhfKwMYjLdwWsc7us8ns&t=1233

As Bas mentioned in the talk, I think this would be fantastic as a NixOS package, though I'm unsure if anyone is working on it at the moment.