robur-coop / albatross

Albatross: orchestrate and manage MirageOS unikernels with Solo5
ISC License

Invert stats communication (stats is a client to albatross-daemon) #131

hannesm closed 1 year ago

hannesm commented 1 year ago

Previously, albatross-daemon would eventually connect to stats (with --enable-stats and --retries). This made it impossible to set up stats at a later point, and it did not allow stats to crash.

Now stats connects to albatross-daemon at startup (and reconnects whenever that fails). Once the initial connection is established, albatross-daemon dumps information about all running unikernels so that stats can start collecting. This allows the stats daemon to be restarted independently, or to survive an albatross-daemon restart.
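As a rough illustration of the inverted direction, here is a minimal in-memory model of it. It is not the albatross code: the `Daemon` and `StatsClient` classes, the `accept_stats` and `connect` methods, the `max_attempts` parameter, and the unikernel names and pids are all hypothetical, and there are no real sockets. It only shows the client retrying until the daemon is reachable and then receiving a dump of what is already running.

```python
from dataclasses import dataclass, field


class ConnectionFailed(Exception):
    """Raised when the simulated daemon is not accepting connections."""


@dataclass
class Daemon:
    """Stand-in for albatross-daemon: tracks running unikernels by name."""
    running: dict = field(default_factory=dict)  # name -> pid (illustrative)
    up: bool = True

    def accept_stats(self):
        # On a new stats connection, dump everything that is currently running.
        if not self.up:
            raise ConnectionFailed("daemon not reachable")
        return sorted(self.running.items())


@dataclass
class StatsClient:
    """Stand-in for the stats daemon, which is now the connecting side."""
    watched: dict = field(default_factory=dict)
    attempts: int = 0

    def connect(self, daemon, max_attempts=5):
        # Retry until the daemon accepts; a real client would wait between tries.
        for _ in range(max_attempts):
            self.attempts += 1
            try:
                dump = daemon.accept_stats()
            except ConnectionFailed:
                continue
            # Start collecting for every unikernel named in the initial dump.
            self.watched = dict(dump)
            return True
        return False


if __name__ == "__main__":
    daemon = Daemon(running={"hello": 4242, "dns": 4243})
    stats = StatsClient()
    if stats.connect(daemon):
        print("collecting for:", ", ".join(sorted(stats.watched)))
```

Because the dump is sent on every new connection, a freshly started `StatsClient` in this model ends up with the same view as one that had been connected all along, which is the property that makes independent restarts possible.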

hannesm commented 1 year ago

What are your thoughts on broadcasting on a stats_condition to allow multiple connected stats daemons? With the current stats daemon I don't think it makes sense to run more than one.

I thought about that... but I don't think it makes sense to run multiple stats daemons. At some point it may make sense to have a separate stats daemon per arc (resource policy).

hannesm commented 1 year ago

Does the last commit make sense, @reynir? Would be great if you could review that one.

reynir commented 1 year ago

The nix build fails with:

Missing dependency:
 - http-lwt-client >= 0.2.0
   no matching version

I don't know how the nix build works, but I suspect we can change nodes.mirage-opam-overlays.locked.rev to a newer commit of opam-repository. Is this correct, @Julow?
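To check which commit is currently pinned, something like the following read-only sketch could be used. It assumes that the path above refers to a flake.lock-style JSON file; the `locked_rev` helper and the example lock contents (including the all-zero rev) are hypothetical placeholders, not the real pin.

```python
import json


def locked_rev(lock_text, node):
    """Return the pinned revision of `node` from the text of a lock file."""
    lock = json.loads(lock_text)
    return lock["nodes"][node]["locked"]["rev"]


# Hypothetical lock excerpt; the rev is a placeholder, not the real pin.
EXAMPLE_LOCK = json.dumps({
    "nodes": {
        "mirage-opam-overlays": {
            "locked": {
                "type": "github",
                "rev": "0000000000000000000000000000000000000000",
            }
        }
    }
})

if __name__ == "__main__":
    print(locked_rev(EXAMPLE_LOCK, "mirage-opam-overlays"))
```

This only inspects the pin. A lock entry typically records a content hash alongside the rev, so editing the rev by hand is unlikely to be enough on its own.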

Julow commented 1 year ago

It is indeed a problem of an outdated repository. The nix file doesn't define the repository, so a simple command can't update it yet. I've fixed it here: https://github.com/roburio/albatross/pull/132 (targeted to this branch)