Soundness issue with std::env::remove_var

mbuesch commented 5 months ago

As described in the documentation, std::env::remove_var will become unsafe in a future version of Rust.

Even today the use of std::env::remove_var is unsound in multithreaded programs and can cause UB. See the issue for more information. I think this is a problem for sd-notify, because threads can be accessing the environment, while sd-notify calls remove_var.

Are there any plans how to handle this in sd-notify, yet?

Thanks for your opinions :)

lnicola commented 5 months ago

Thanks for bringing this to my attention. I don't really see a way to fix this. Pretty much, only options are:

take a breaking change and mark the sd-notify functions unsafe
deprecate the existing functions and add new ones
document the issue and hope the GLIBC lock is going to be enough in practice

I don't want to drop this functionality because it's important if you're going to spawn other processes. And in practice, it's not going to be such a big deal, since readiness notification is something that most programs only do on start-up.

So I'll probably update the docs in the next version, and begrudgingly mark the APIs as unsafe in 0.5.

mbuesch commented 5 months ago

Yes, this is hard to fix correctly. I think having a version of safe-APIs that can't reset the vars would be good. That way applications can still choose to only use safe API.

I think even having env manipulation only during startup can be a potential problem. In programs that use tokio threads can be spawned very early (e.g. for I/O, setting up the sockets or other file I/O). And, after all, READY=1 means that the program is not initializing anymore. It's ready to accept requests (I/O) and potential thread activity.

Just out of interest? Could you please also explain why do we even reset those envs? I know that libsystemd also does it. But what security(?) concern is addressed by this? What would be the cons, if we just let those env variables alone?

Thanks for your help and for your incredibly quick response.

lnicola commented 5 months ago

And, after all, READY=1 means that the program is not initializing anymore. It's ready to accept requests (I/O) and potential thread activity.

Yeah, I don't think this is necessarily going to be an issue with tokio, and I hope that most apps will read the environment variables they need right on start-up, not after they spawn some threads. But there's always the risk, of course.

Could you please also explain why do we even reset those envs?

Environment variables normally get inherited by child processes, so a program will want to prevent any children from talking to systemd on its behalf. That's not just about readiness notifications, but also things like the file descriptor store.

But in practice I think it's mostly to prevent a child process from believing it was started by systemd, and messing things up without actually trying to be malicious.

mbuesch commented 5 months ago

inherited by child processes

Ah, thanks for explaining this. Yes, this makes sense.

DanielSidhion commented 2 months ago

Yeah, I don't think this is necessarily going to be an issue with tokio, and I hope that most apps will read the environment variables they need right on start-up, not after they spawn some threads. But there's always the risk, of course.

I think the opposite will usually be the case, especially with code that uses #[tokio::main] on fn main(). From my understanding, threads will already be spawned before any code in main() runs. And some services will want to spawn tasks (e.g. server tasks, background processing tasks) before the program signals it's ready to systemd, and at that point it could happen that a thread may try to write/read the environment at the same time the call to remove_var is made.

I suggest adding a function to sd-notify to read the environment variable and unset it right away, and document that users are expected to do this as the first thing in their code (or before they spawn any other tasks). The value of that variable can then be passed to the other functions in sd-notify at any time, even after other tasks have been spawned. Of course, all of this can be wrapped by a struct to make it easier for consumers of the crate.

Thoughts?

mbuesch commented 2 months ago

My understanding is that multiple threads can be running at tokio main start, but they wouldn't do anything unless main schedules another async task.

But couldn't we check somehow if the current process has multiple threads? I mean, if we have a safe sd-notify function that internally calls unsafe { unset environment }, but first checks if the current process has only one thread. That should make a safe interface, right? If the current thread is the only one, a race condition with a thread startup should not be possible. But that would require a race-free way to check if another task is running (and also if it has just been started and might not be fully runnable yet). I'm not sure if there's a way to do that in a race free way.

lnicola / sd-notify

Soundness issue with std::env::remove_var #9