Open panhania opened 1 year ago
Related discussion: rust-lang/unsafe-code-guidelines#434.
Is there a particular reason fds are being passed around this way?
Is there a particular reason fds are being passed around this way?
Maybe @bgalehouse could shed some light here as the original author of Fleetspeak. My guess would be because it is simple and in Go nobody bothers with safety anyway. Do you have any suggestions on how one could pass file descriptors (cross-process and cross-language) more safely?
More specific question: where do these fds come from? Is there some method by which the other process is obtaining fds for this one without this process knowing? I'm not a linux I/O expert but in my experience typically the only way to get a valid fd is to open it somehow in your own process.
I think the "fully safe" way to do this would be for this process to keep track of such fds it creates and then select from that list. But that really depends on where these fds originate. (This may have unacceptable perf implications)
More specific question: where do these fds come from?
So, Fleetspeak is something that allows endpoint agents (such as GRR) to talk to a remote server. Each of these endpoint agents is spawned as a child process of Fleetspeak that handles these file descriptors for you. This crate is something that such endpoint agents written in Rust (at the moment: specifically RRG) can use to communicate with this parent process.
Is there some method by which the other process is obtaining fds for this one without this process knowing?
Fleetspeak does not handle these file descriptors to other processes. Each child process it spawns receives its own unique pair to communicate with Fleetspeak. Of course in the world of operating systems other privileged processes can sniff file descriptors open by other processes and do fancy stuff but this is not something we can help (but this is the same as playing around with /dev/mem
).
I'm not a linux I/O expert but in my experience typically the only way to get a valid fd is to open it somehow in your own process.
... or by inheriting it from the parent in this case (I believe).
I think the best way to think about these two descriptors handled by Fleetspeak to its child processes as alternative stdin
and stdout
(so, the parent process can use to to send data and receive data from the child but nobody else). As for why it is custom descriptors and not just stdin
and stdout
—I don't know, I guess some agents might want to use these for other purposes (e.g. logging).
Aha, parent processes, forgot about that.
So the situation is basically that this is safe as long as it is called in libraries code intended to be invoked in a particular way.
stdio is definitely one option.
Another one would be to have it not be a lazy_static and pass it down via arguments; so you can ensure its provenance, and then declare that binaries using this need to basically declare that they are unsafe when called in ways other than as a child process. This would be a major change though.
Could also be managed by sharing memory with the child process or something but seems suboptimal.
Sorry for coming late to the party, but yes, you seem to understand the situation. The communications mechanism is inspired by stdin and stdout, but doesn't use them. Originally we always used the next descriptors values after stdout, stdin, stderr, but IIRC we switched to passing the FD values through env variables because we didn't trust our ability to control them, especially during unit tests.
This approach is both notionally and internally simpler than the named socket dance that we do when we don't have the parent-child relationship. No permissions to try to get right, no chance of connecting to the wrong process, etc.
As for why we don't use stdin and stdout: stdout in particular is dangerous because it's easy to have spurious stdout messages from libraries or whatnot, especially if you are starting with a large existing python code base.
Regarding checking that we are being run as expected, the relevant env variables should be a good way to judge if actually being run by FS, and if the env variable aren't set, I think you can just crash. If you want additional confirmation, maybe run fcntl
F_GETFL
or a similar syscall to check the fds before using them.
The way we instantiate
std::fs::File
objects from file descriptors passed by Fleetspeak is a violation of safety invariants as we cannot know whether it is truly valid. We should somehow verify that the descriptor is indeed valid.