robur-coop / albatross

Albatross: orchestrate and manage MirageOS unikernels with Solo5
ISC License

boot loop + `--restart-on-fail` results in unremovable unikernels #39

Closed yomimono closed 3 years ago

yomimono commented 3 years ago

Like a dolt, I sent albatross a unikernel that boots up and immediately fails, and I sent it with the `--restart-on-fail` flag set. When I try to destroy the unikernel, I'm told it can't be destroyed because it's not running, but when I try to push another unikernel with the same name, I'm told there is already a unikernel running under that name.

I've removed --restart-on-fail from my default deploy-albatross target for the moment, since I have little faith in my ability to only push non-crashing unikernels.

hannesm commented 3 years ago

Hi, thanks for your report. (a) You can deploy with --force to "kill unikernel with same name if already running" before "start unikernel". (b) How does the unikernel exit? The default restart-on-failure excludes solo5 tender signals (exit code = 1), and also 60..64 (mirage/functoria emits 63 for --help/--version and 64 in case of an argument parse error).
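
To make the exit-code rule in the comment above concrete, here is a minimal Python sketch of the exclusion check. It is not albatross code (albatross is written in OCaml), and the names `SOLO5_TENDER_EXIT`, `NO_RESTART_RANGE`, and `excluded_from_default_restart` are hypothetical. It encodes only the exclusions stated above and makes no claim about how any other exit code is handled.

```python
# Illustrative only: models the exclusions described in the comment above.
# All names here are hypothetical and do not come from albatross.

SOLO5_TENDER_EXIT = 1             # "solo5 tender signals (exit code = 1)"
NO_RESTART_RANGE = range(60, 65)  # 60..64 inclusive; 63 and 64 come from mirage/functoria


def excluded_from_default_restart(exit_code: int) -> bool:
    """Return True if the default restart-on-failure policy skips this exit code."""
    return exit_code == SOLO5_TENDER_EXIT or exit_code in NO_RESTART_RANGE
```

Under this rule, a unikernel that panics with an exit code outside those values is not excluded, which is consistent with the boot loop described in the report.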

yomimono commented 3 years ago

> how does the unikernel exit?

That one fails to come up fully because I asked it to use an x509 certificate it didn't like for TLS connections, and there's plenty of "if TLS setup doesn't work, panic" logic in my webserver unikernel.

hannesm commented 3 years ago

I'm reopening since the current semantics is rather tedious: I'd expect a "restart-on-fail" unikernel to be destroyable - even though it is in a boot loop - by executing the destroy command once -- at any point of the boot loop (creating the unikernel / successful creation / termination / restart).
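
The expected semantics can be sketched as a toy model in Python. This is an assumption-laden illustration, not how albatross is implemented: `Phase`, `Registry`, and their methods are hypothetical names. The idea shown is that destroy is keyed on the name being registered rather than on the unikernel currently running, and that a pending restart is dropped once the name is gone.

```python
from enum import Enum, auto


class Phase(Enum):
    """Points of the boot loop named in the comment above."""
    CREATING = auto()    # creating the unikernel
    RUNNING = auto()     # successful creation
    TERMINATED = auto()  # termination
    RESTARTING = auto()  # restart pending


class Registry:
    """Toy name -> phase table; a hypothetical model, not albatross code."""

    def __init__(self):
        self.phases = {}

    def create(self, name):
        if name in self.phases:
            raise ValueError(f"{name}: name already in use")
        self.phases[name] = Phase.CREATING

    def advance(self, name, phase):
        # A boot-loop step only applies while the name is still registered,
        # so a pending restart cannot resurrect a destroyed unikernel.
        if name in self.phases:
            self.phases[name] = phase
            return True
        return False

    def destroy(self, name):
        # Succeeds in every phase: removal depends on the name being
        # registered, not on the unikernel running at this moment.
        return self.phases.pop(name, None) is not None
```

In this model a single `destroy("web")` call returns `True` whichever phase the unikernel is in, and the name is immediately free for a new `create("web")`.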