Refactor the way service executables are started

KillingSpark / rustysd

A service manager that is able to run "traditional" systemd services, written in rust

MIT License

495 stars 15 forks source link

Refactor the way service executables are started #42

Closed KillingSpark closed 1 year ago

KillingSpark commented 4 years ago

Currently there are two ways processes for services are started.

Normal std::process::Command for helpers
Custom stuff for the main executable

I am certain that even with the Command extensions there is stuff systemd does, that needs more specialized stuff. Expecially setting env vars is not async-signal-safe which is necessary to be called after fork but before exec. It is probably fine on most platforms since rustysd does not mess with env vars anywhere else before forking, but I like to be pedantic about this kind of stuff.

The solution here is to chain-load another executable, that does all the setup that is not necessarily done in rustysd, and them in turn execs the actual service/helper process.

rustysd (just fd setup?) -> rsdexec (most setup) -> service

This reduces complexity in the codepath main rustysd has and provides this as a (hopefully) useful tool to others. It would increase compatibility too, systemd spawns helpers with the same preparations, as it does the main executable.

See branch redesing_fork_exec for progress on this.

KillingSpark commented 4 years ago

The idea is to export all necessary info as json and reimport it in rsdexec, then apply the config to itself and exec into the specified command.

This is inefficient, but I think it is a good idea to make this as general-purpose as possible. Many other tools in the containerization space use json for their spawning specification (e.g. osi stuff). Introducing the relatively less wide-spread ini-style config systemd uses seems like an even worse choice.

KillingSpark commented 1 year ago

Ok after two years I have new ideas for this.

chainloading is still the way
We should not pass json in cmd args
Pass the steps to perform in a memfd. This concept exists on at least linux and freebsd. We could just patch that in as the stdin to allow this to still be a general-purpose tool that just reads the config from stdin (and optionally from any path if given via a cli?)
Encoding as json is probaby still ok
I think the chainloading should happen through the rustysd executable itself. This enables at least on linux to just exec /proc/self/exe which is easier and arguably safer than execing another binary.

valpackett commented 1 year ago

This concept exists on at least linux and freebsd

shm_open exists in all unixes, memfd (also shm_open(SHM_ANON) for much longer on FreeBSD) is just cleaner as it never touches the filesystem and you don't have to unlink anything. I actually made a tiny crate wrapping all that stuff: shmemfdrs

just exec /proc/self/exe

btw, std::env::current_exe supports like all the platform specific versions of that

Expecially setting env vars is not async-signal-safe which is necessary to be called after fork but before exec

Wait, uh, there's pre_exec if you just want to do that? But it seems that right now it's not std::process::Command that's used but manual forking via the nix crate.. not sure what are the issues exactly with that design. But I guess with a chainload you get to be cool and use posix_spawn for the run step :)

KillingSpark commented 1 year ago

That crate looks cool, I think I'll use it :)

Uh I didn't find the env::current_exe that's very neat!

Yeah the forking is just something I got very paranoid about after having to debug why the forked processes just deadlocked on setting an env variable. Makes sense in the end but now I just want to make sure I know what happens while doing the forking and setup for the subprocesses

KillingSpark commented 1 year ago

To elaborte on the environment variable thing: it is just unsafe to set an environment variable between fork and exec, because you cannot know if there wasn't another thread manipulating the environment at the time fork was called.

Expecially setting env vars is not async-signal-safe which is necessary to be called after fork but before exec

What I tried to express here was that all operations that are executed between fork and exec (e.g. in process::pre_exec) MUST be async-signal-safe, which setenv is not.