astro / microvm.nix

NixOS MicroVMs
https://astro.github.io/microvm.nix/
MIT License
1.24k stars 93 forks source link

Use writeShellScript* to remove redundant shebang; Use absolute paths for virtiofsd; Use supervisord to spawn virtiofsd processes instead of disowning them #257

Closed SuperSandro2000 closed 1 month ago

SuperSandro2000 commented 1 month ago

Tested on a real world none-declerative microvm setup.

astro commented 1 month ago

These make a lot of sense. Thank you very much!

PatrickDaG commented 1 month ago

The supervisord part seems to break my setup.

Due to the service type being simple now it counts as started as soon as the main process, which is supervisord now, forks of. This then signals the main microvm service that its dependencies are met and it can start running Since the actual starting of virtiofsd takes some time( roughly a second on my limited tests) the sockets aren't actually available yet and microvm crashes as it can't open them. (this then kills the virtiofsd service again and the loop begins anew)

Is there a particular reason you switched to supervisord, in which case we would need to introduce some kind of better dependecy checking.

SuperSandro2000 commented 1 month ago

Using disown is pretty hacky and sometimes things break in unexpected ways.

Not sure if changing the type to forking would fix this.

PatrickDaG commented 1 month ago

Yeah I think the forking method only works if your main process exits which it did in the disown method, but doesn't anymore. You need a way for supervisord to check if the virtiofsd processes started and only then notify systemd. I don't know much about supervisord but from what I can see this looks to bet rather annoying to do.

PatrickDaG commented 1 month ago

I think the best way would be to split each virtiofsd proces into their own systemd service which can then use type exec which does exactly what we want.

SuperSandro2000 commented 1 month ago

Which we could do for fully declarative microvms but not for the others because we would need two template variables in systemd but it only has one.

PatrickDaG commented 1 month ago

Then I guess the easiest fix would be reverting the commit, or coding something using systemd notify type services, but that might be also kinda brittle and not trivial to do.

SuperSandro2000 commented 1 month ago

Then we end up again with to many virtiofsd processes in some groups creating very strange bugs. Using disown is the worst way to solve this. We rather at an artificial delay of 3 seconds

SuperSandro2000 commented 1 month ago

I think we can easily fix this by combining the daemons into a group and using events to trigger an systems sd-notify command

http://supervisord.org/events.html

SuperSandro2000 commented 1 month ago

see #258