Open joshtriplett opened 5 years ago
I'm not sure if the @rust-lang/libs team is the right team for this; suggestions welcome if another team would be the right owner for "behavior of Rust at application start-up".
If we stop ignoring SIGPIPE by default, we will break back compat for every socket server written in Rust, right?
Perhaps we should have a build-time option for this to make it easy to disable/enable in an application?
Why this and not, e.g., an attribute on `fn main()`? (As a general rule, I think having program semantics defined as much as possible by the source code is preferable to flags, since it makes things more standalone.)
@sfackler Right now, we have inconsistent behavior, where your socket server will also break if you ever write `main()` in any language other than Rust and turn your code into a library. That flies in the face of one of Rust's greatest strengths: "we don't have a runtime". I discovered this when looking over the pre-main startup code.
At the very least, we need to document that we ignore SIGPIPE, and we need to provide an easy way to disable this behavior. I'd also fully expect that any change to the default itself would require a sizable transition period (or even an edition).
@Centril Good point! An attribute within the source code (on main or the top-level module) does indeed seem preferable, rather than a build-time option.
I agree that we should do something about this. I don't think we can change the current default behavior because of backcompat concerns, as @sfackler mentioned, but the current status quo is that nominal command line applications start out of the gate as broken, and it's a fairly subtle issue to fix. Firstly, if you're using `println!` and have enough output to exceed stdout's default buffer size, then it's likely that running, for example, `command | head -n1` will result in a panic. Secondly, if you're using `writeln!(...)?` instead, and are setting an error code when `writeln!(...)` fails, then it's likely that `command | head -n1` will stop with an unexpected exit code.
For example, normally a process will exit with a `141` exit code upon receipt of a pipe error:

```
$ seq 1 10000 | head -n1
1
$ echo ${pipestatus[1]}
141
```
So to fix this, you firstly need to know enough to use `writeln!(...)`, and then you either need to assume all errors produced by `writeln!(...)` are broken pipe errors (and emit the correct exit code), or explicitly check errors for `io::ErrorKind::BrokenPipe` and respond accordingly. The former is, strictly speaking, incorrect, even though the only error you're likely to encounter when writing to stdout is a broken pipe error (but the code using `write!` may not know that it is writing to stdout). The latter works, but it's tedious, and even I get it wrong:
```
$ rg . rust/ | head -n1
rust/bstr/rustfmt.toml:disable_all_formatting = true
$ echo ${pipestatus[1]}
0
```

This, presumably, should be `141`. But at least it doesn't panic.
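A sketch of what that explicit `io::ErrorKind::BrokenPipe` check looks like in practice (the helper name `println_or_die` is invented for illustration; 141 = 128 + SIGPIPE):

```rust
use std::io::{self, Write};

// Treat EPIPE on stdout as the conventional SIGPIPE exit status (141)
// and propagate all other errors to the caller.
fn println_or_die(line: &str) -> io::Result<()> {
    match writeln!(io::stdout(), "{line}") {
        Err(e) if e.kind() == io::ErrorKind::BrokenPipe => std::process::exit(141),
        result => result,
    }
}

fn main() -> io::Result<()> {
    // With `program | head -n1`, this loop exits with status 141
    // once head closes the pipe, instead of panicking.
    for i in 1..=5 {
        println_or_die(&i.to_string())?;
    }
    Ok(())
}
```

Note that, as discussed above, exiting with 141 is still not the same as actually dying from the signal; it only mimics what most shells report.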
So yeah, on the one hand, ignoring SIGPIPE by default means pure Rust networked applications probably have the right behavior by default, but not ignoring SIGPIPE by default means command line applications probably have the wrong behavior by default. Both of these groups seem fairly sizable to me.
I don't mind the idea of some incantation one can put in the source code to fix this. An alternative might be to just provide a function in `std`, if it's possible, that un-ignores SIGPIPE and is intended to run at the beginning of `main`. If that's workable, then that might be preferable since it adds fewer directives to the language itself. The other issue here is documentation and discoverability. i.e., How can we help folks writing CLI tools discover, in the first place, that they need to un-ignore SIGPIPE?
(Although, I guess adding a function to `std` and requiring users to call it doesn't fix @joshtriplett's case, where Rust is a library. Am I understanding that right?)
I personally agree with @sfackler that I don't think we can really change any defaults due to backwards-compatibility reasons. AFAIK this really only affects one thing which is unix-y CLI applications. Any portable CLI application to Windows cannot rely on the existence of SIGPIPE and almost all non-CLI applications I imagine want to receive errors instead of SIGPIPE to handle. That feels, to me, niche enough that we made the defaults the right way and there should be documentation/crates for authors writing unix-y CLI applications (aka ripgrep and such).
@alexcrichton I agree that we have to handle backwards-compatibility, but that wouldn't stop us from having a top-level attribute to opt-out of it.
If the proposal ends up boiling down to adding a new top-level attribute to opt out, then I think a blocker in my mind for the proposal would be to justify why a change to the compiler/standard library is necessary when otherwise a crate on crates.io would only require two lines added to a project (e.g. a dependency directive in `Cargo.toml` and a one-liner in `src/main.rs` to enable).
@alexcrichton Because there's a difference between "do some setup to ignore SIGPIPE, then un-ignore it" and "never ignore SIGPIPE in the first place". Because people want to build smaller, faster executables. Because libraries and applications shouldn't have subtle differences in behavior, especially undocumented differences. Because you can do this in C and Rust makes it difficult. Because not every project should need to have a dependency to do this. Because it's a pain to add a platform-specific dependency and a platform-specific one-liner for an application that could otherwise be portable, and `#![no_ignore_sigpipe]` would be easier. Because it should be possible to write Rust code that doesn't have any startup code at all, and with some care we could potentially eliminate all of the current startup code in the future.
Er, sorry, that sort of feels like a particularly antagonistic response to me. Many of your "Because ..." points can be pretty trivially fixed (the crate doesn't have to be platform specific, difficulty seems like it's a function of the crate, etc.) and seem like they're maybe blowing the problem out of proportion (we're not slowing down executables or bloating them by adding an extra syscall at the beginning).
I think one of the main points you're bringing up is the difference in library/binary behavior today, but from what @sfackler mentioned above and I agree we unfortunately can't change the binary behavior and library authors largely just need to be aware of this if they run into it (which thankfully isn't all the time). We can of course also have crates to smooth this over one way or another.
My point (which I don't think you really addressed?) was that I don't understand why a compiler solution is needed. It seems that a suitable crate can be added to crates.io to solve almost all of these issues.
Is it possible to make a crate that causes `start` to never register a SIGPIPE handler at all? I thought that was pretty baked in?
If this can be reasonably supported as a library function then that seems like the best solution since it is low cost. At any rate if people absolutely don't want to take on that dependency then libstd is still lower cost than a built in attribute. All in all I think @alexcrichton makes fine points.
@alexcrichton My apologies, I certainly didn't intend to come across as antagonistic.
My most specific point is that a library crate can only undo what std does, it can't stop std from doing it in the first place. I'd like a solution that provides more control over what Rust's startup code is doing in the first place.
(Also, as one more reason, every unnecessary syscall also represents an additional syscall needed in seccomp filters to lock down a program.)
@OvermindDL1
It's possible to make a crate that causes start to never register a sigpipe handler at all? I thought that was pretty baked in?
Exactly. That's the main reason I'm asking after having built-in support.
I thought as much; it really should be in the compiler then. An attribute on `main` seems sufficient, or even a compiler flag (perhaps also exposed via something in the `Cargo.toml` file?).
This seems like a problem that could be fixed with the next edition? That would justify the use of an attribute to disable/enable this feature, in that upgrading binary crates to the new edition would need to add the attribute to re-enable this feature.
Changing any signal disposition by default in the startup code is inherently broken, because dispositions are inherited across fork/exec. The same goes for signal masking. IMO this behavior should just be removed.
If you really really don't want `SIGPIPE` to happen, without applications having to take their own measures to stop it, there is a right way that does not clobber any state. For each `write` or equivalent (note: `send` and `sendto` can just use `sendmsg` as their backend with the `MSG_NOSIGNAL` flag), do the following:

1. Block `SIGPIPE`.
2. Perform the `write`.
3. If it failed with `EPIPE`, call `sigwaitinfo` to consume the `SIGPIPE` while it's still blocked.
4. Restore the old mask.

This procedure is entirely portable and thread-safe, because signal masks are thread-local and `SIGPIPE` is generated synchronously for the thread, not for the process.
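For concreteness, the four steps above might look like the following in Rust. This is a Linux/glibc-only sketch using raw FFI declarations (ordinarily the `libc` crate would supply these; the constant values and the `sigset_t` size are Linux-specific assumptions). It uses `sigtimedwait` with a zero timeout in place of `sigwaitinfo` so that step 3 cannot block when the write failed for a reason other than `EPIPE`:

```rust
use std::io;

#[repr(C)]
struct SigsetT([u64; 16]); // 128 bytes, the size of glibc's sigset_t

#[repr(C)]
struct Timespec {
    tv_sec: i64,
    tv_nsec: i64,
}

extern "C" {
    fn sigemptyset(set: *mut SigsetT) -> i32;
    fn sigaddset(set: *mut SigsetT, signum: i32) -> i32;
    fn pthread_sigmask(how: i32, set: *const SigsetT, old: *mut SigsetT) -> i32;
    fn sigtimedwait(set: *const SigsetT, info: *mut u8, timeout: *const Timespec) -> i32;
    fn write(fd: i32, buf: *const u8, count: usize) -> isize;
}

// Linux values; other platforms differ.
const SIG_BLOCK: i32 = 0;
const SIG_SETMASK: i32 = 2;
const SIGPIPE: i32 = 13;

/// Write `buf` to `fd`, turning a would-be SIGPIPE into a plain EPIPE error
/// without disturbing the process-wide signal disposition.
fn write_no_sigpipe(fd: i32, buf: &[u8]) -> io::Result<usize> {
    unsafe {
        let mut set = SigsetT([0; 16]);
        let mut old = SigsetT([0; 16]);
        sigemptyset(&mut set);
        sigaddset(&mut set, SIGPIPE);

        // 1. Block SIGPIPE (signal masks are thread-local, so this is thread-safe).
        pthread_sigmask(SIG_BLOCK, &set, &mut old);

        // 2. Perform the write, capturing errno immediately.
        let n = write(fd, buf.as_ptr(), buf.len());
        let err = io::Error::last_os_error();

        // 3. On failure, consume any SIGPIPE now pending for this thread
        //    (zero timeout: return at once if none is pending).
        if n < 0 {
            let ts = Timespec { tv_sec: 0, tv_nsec: 0 };
            sigtimedwait(&set, std::ptr::null_mut(), &ts);
        }

        // 4. Restore the old mask.
        pthread_sigmask(SIG_SETMASK, &old, std::ptr::null_mut());

        if n < 0 { Err(err) } else { Ok(n as usize) }
    }
}
```

As the later replies point out, the cost is two extra syscalls around every write, which is the main objection to this scheme.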
An alternate solution: `println!` calls `std::process::exit(141)` instead of `panic!()` if it fails on `EPIPE`. That way, both servers and CLI tools work by default.
The issue of ignoring `SIGPIPE` is related to, but separate from, the issue of what `println!` does on write failure. Changing signal dispositions behind the application's back in a way it has no way to fix up before `exec`'ing something else (AFAICT the original disposition is lost) is a much bigger problem than what `println!` does, since applications can just refrain from using `println!`.
With that said, I agree `std::process::exit(141)` would be less bad than `panic!()`. The latter should be for programming errors, not for completely valid runtime conditions.
This came up in a StackOverflow question, and my advice was basically, "either don't use `println!` in non-toy programs, or use this `unsafe` FFI binding via libc to re-enable the default behavior for handling `PIPE` signals."
`println!` calls `std::process::exit(141)` instead of `panic!()` if it fails on `EPIPE`
If you have multiple threads, this will abort the whole process instead of only panicking the current thread.
@jyn514 yes. That's exactly what you want in most CLI programs.
@BurntSushi Programs using `println!` are not necessarily CLI programs, though. They may, for instance, be servers that use stdout/stderr for logging.
Another reason not to use `std::process::exit(141)` is that it is *not* the same as being killed by a SIGPIPE. Standard shells will report "termination by signal *n*" as return value *n* + 128, but this is a property of the shell, not the system (e.g., `man bash`).
You can distinguish the two cases from the output of `waitpid(2)`:

```
"exit 141"      exited with code Some(141); signal: None
"echo sigpipe"  exited with code None; signal: Some(13)
"kill -13 $$"   exited with code None; signal: Some(13)
```

We can see that a bare `exit` terminates with a code and no signal, whereas being killed terminates with a signal and no code.
Further reading: “Proper handling of SIGINT/SIGQUIT” describes how this can be important. Choice quote: “You cannot "fake" the proper exit status by an exit(3) with a special numeric value, even if you look up the numeric value for your system” (emphasis in original).
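On Unix, the same distinction is observable from Rust itself via `std::os::unix::process::ExitStatusExt`. This sketch assumes a POSIX `sh` is on the `PATH`:

```rust
use std::os::unix::process::ExitStatusExt;
use std::process::Command;

fn main() {
    // A plain exit(141): the status carries a code and no signal.
    let exited = Command::new("sh").args(["-c", "exit 141"]).status().unwrap();
    assert_eq!(exited.code(), Some(141));
    assert_eq!(exited.signal(), None);

    // Death by SIGPIPE (13): no code, just the signal. This relies on the
    // Rust runtime resetting SIGPIPE to SIG_DFL in spawned children, so the
    // shell really does die from the signal.
    let killed = Command::new("sh").args(["-c", "kill -13 $$"]).status().unwrap();
    assert_eq!(killed.code(), None);
    assert_eq!(killed.signal(), Some(13));
}
```

A shell observing the second child reports 141 (128 + 13), but that translation happens in the shell, not the kernel.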
This seems to have reached a political impasse but it doesn't seem like the following point was ever addressed? At least I don't see it in this thread.
My most specific point is that a library crate can only undo what std does, it can't stop std from doing it in the first place.
At the same time, I haven't seen strong reasons against including a mechanism for disabling this in the compiler, via a top-level attribute. Can we revisit this perhaps?
In https://github.com/rust-lang/rust/pull/97802 I have implemented one possible way to help solve this problem, by adding a `fn main()` attribute called `#[unix_sigpipe = "..."]` that we can decide a default value for in the future. The attribute is available in nightly-2022-09-04 and later. Any feedback is very welcome, including feedback that disqualifies my approach as a whole.
Tracking issue for `unix_sigpipe`: https://github.com/rust-lang/rust/issues/97889
This issue comes up in every program in the book on Rust CLI programming that I'm writing and takes a lot of explanation. At the moment my "best" solution is to catch the BrokenPipe error and exit with a 141 status but that's still problematic for the reasons that others have listed. I'd prefer to have this as a flag for rustc which can be controlled from cargo. Encoding it into the program doesn't really make sense to me since this is behaviour that the compiler is invisibly adding. Ideally the rustc flag would default to not ignoring SIGPIPE, and then programs intended to be run as daemons etc can update their cargo settings to enable this behaviour. Those programs are much more likely to handle error conditions more comprehensively than simple CLI programs which can often just give up and bail out.
On Mon, Jun 06, 2022 at 19:29:20, Martin Nordholts wrote:
Please make sure to also read https://rust-lang.zulipchat.com/#narrow/stream/219381-t-libs/topic/Proposal.3A.20First.20step.20towards.20solving.20the.20SIGPIPE.20problem for significantly more discussion regarding this problem.
In #97802 (https://github.com/rust-lang/rust/pull/97802) I have implemented one possible way to solve this problem. Any feedback is very welcome, including feedback that disqualifies my approach as a whole.
I'd like to add that it seems this discussion is missing the reason SIGPIPE exists; https://stackoverflow.com/questions/8369506/why-does-sigpipe-exist does a good job of explaining it.
The point is /not/ to save you from having to check the exit status from write(). Rather, the point is to kill the whole pipeline immediately, even if some components are blocking on a read() (or doing other work). This allows work to be short-circuited.
It's not just a convenience for ignoring the return value of write, but rather is a whole different feature.
@rustbot labels +I-lang-nominated
Nominating as it may be worth discussing this apropos of the next edition.
I used to be in the "change default in a new edition" camp. After investigating more in depth, I have changed my mind. I now think we should keep the current default and use `unix_sigpipe` to cover (almost) all use cases.
Roughly sorted by weight, with heaviest argument at the top.
It would make Rust code less portable across platforms by default. If we changed the default, a currently portable test like https://github.com/rust-lang/rust/blob/master/tests/ui/process/sigpipe-should-be-ignored.rs would end up non-portable and would need to have "if unix: // run-fail, if windows: // run-pass".
If `SIG_IGN` is active in a program that needs `SIG_DFL`, there are clear indications (`BrokenPipe`) of what the problem is, which makes it easy to search for solutions. This makes `unix_sigpipe` discoverable. If `SIG_DFL` (or "inherit") is the default but `SIG_IGN` is needed, the symptoms are much more subtle and harder to diagnose: the program just suddenly terminates without any textual output.
According to the 2023 survey, Rust is used more for network/cloud programming than for writing textual command line tools. And networked code wants `SIG_IGN`.
The default has been around since 2014. We should not underestimate the churn a new default would create.
There would no longer be a way to get `SIG_IGN` in `self` and `SIG_DFL` in child processes, which code out there might rely on, unless we add another `#[unix_sigpipe = ...]` variant or come up with some other scheme. According to https://github.com/rust-lang/rust/pull/25784 this is a legitimate use case.
For users writing unix-y command line apps, it is easy to discover and use `#[unix_sigpipe = "sig_dfl"]`.
For users that want minimal seccomp filters, or that want to know the original `SIGPIPE` disposition, or that on principle don't want the Rust runtime to change process-global state, it is easy to use `#[unix_sigpipe = "inherit"]`.
For users that use a non-Rust `main` and link in Rust library code, we don't have an elegant solution, I'm afraid. But since these will typically be C and C++ apps, it will be easy for them to adjust the `SIGPIPE` signal disposition themselves. And by using a non-Rust `main`, those users are already in some sense on their own and have opted out of the nice and cozy pure Rust world.
I propose that we:

- Stabilize `#[unix_sigpipe = "sig_dfl"]`: https://github.com/rust-lang/rust/pull/120832
- Keep `SIG_IGN` as the default. I think https://github.com/rust-lang/reference/pull/1465 is sufficient. Hmm, maybe we should add some text on how it affects networked code though.
- Add `#[unix_sigpipe = "sig_ign"]` that resets `SIGPIPE` before exec:ing child processes, even if `SIG_IGN` is the default already. Edit: I added some open unresolved questions for this attribute in the tracking issue.
- Stabilize `#[unix_sigpipe = "inherit"]` after bike-shedding the name a bit more. Edit: I added that as an open unresolved question for this attribute in the tracking issue.

This dates back to when Rust had a runtime and a green-threads implementation, and more magic handling of I/O.
I don't think the current default was made because of libgreen or magic handling of I/O. It seems to me as if it was made to make Rust code more cross-platform. Making the behavior more in line with libgreen was more of a bonus outcome worth a remark, it seems.
Rust currently resets the signal mask in a child process
The same goes for signal masking. IMO this behavior should just be removed.
The code that messed around with the signal mask was removed 14 months ago, so this is no longer a concern.
As long as Rust is changing signal dispositions inside init code in a way that the application cannot suppress or undo, it is fundamentally unusable to implement standard unix utilities that run child processes or anything that needs to preserve the signal dispositions it was invoked with and pass them on to children. Changing inheritable process state behind the application's back is just unbelievably bad behavior and does not belong in a language runtime for a serious language. If the current behavior is to be kept, there needs to be some way to suppress it or to obtain the saved original disposition at invocation so that the process can undo the change and pass unmodified dispositions on to its children.
As an example, if you implement `find` in Rust, the `-exec` option will invoke its commands with `SIGPIPE` set to `SIG_IGN`, so that they will not properly terminate on broken pipe. But if you just made it set `SIGPIPE` to `SIG_DFL` before invoking the commands, it would then be broken in the case where the invoking user intentionally set `SIGPIPE` to `SIG_IGN` so that the commands would not die on broken pipe.
I just realized - and I might have said this before, but since it looks like there's a risk of nothing being done to fix this, I should probably bring it up again:
You can ignore a signal safely in a way that doesn't break dispositions for children.
On entry, you first get the existing disposition. If it's `SIG_IGN`, there's nothing to do. If it's `SIG_DFL`, then you install a (non-interrupting, i.e. `SA_RESTART`) no-op signal handler, which behaves exactly like `SIG_IGN` except that, on exec, it will revert to `SIG_DFL` instead of `SIG_IGN`.
I still think it would be better not to mess with signal dispositions (or other inheritable process attributes) at all behind the application's back. But if Rust folks are set on doing this, there's a right safe way to be doing it instead of what Rust is doing now.
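A minimal sketch of this check-then-install idea, using glibc's `signal` (which installs BSD, i.e. `SA_RESTART`, semantics) rather than full `sigaction`, to keep the FFI surface small. `SIGPIPE = 13` and `SIG_IGN = 1` are Linux assumptions, and the function name is made up. Since `signal` cannot query a disposition without setting one, this installs the no-op handler first and puts `SIG_IGN` back if that was what it found (fine at startup, before other threads exist):

```rust
// A handler that does nothing. It behaves like SIG_IGN while the process
// runs, but unlike SIG_IGN it reverts to SIG_DFL across exec, so spawned
// children inherit a clean default disposition.
extern "C" fn noop_handler(_sig: i32) {}

fn ignore_sigpipe_exec_safely() {
    extern "C" {
        // C prototype: sighandler_t signal(int signum, sighandler_t handler);
        fn signal(signum: i32, handler: usize) -> usize;
    }
    const SIGPIPE: i32 = 13;
    const SIG_IGN: usize = 1;
    unsafe {
        let prev = signal(SIGPIPE, noop_handler as usize);
        if prev == SIG_IGN {
            // It was already ignored (perhaps intentionally by our parent):
            // restore SIG_IGN rather than downgrading it to our handler.
            signal(SIGPIPE, SIG_IGN);
        }
    }
}
```

The outward behavior inside the process is the same as `SIG_IGN` (writes to broken pipes return `EPIPE`), but there is no longer any exec-time state to clean up.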
I hadn't realized until this very comment that Rust was messing up SIGPIPE for child processes also. That explains a long-running puzzle of why when I spawn a gpg process and pipe a bunch of GB of data through it, but only read part of it, the gpg process continues until it's consumed all of the input.
So there are two separate issues:

1. What happens with SIGPIPE within the Rust program
2. What happens with SIGPIPE in processes spawned by the Rust program

I would argue that the behavior in case 2 doesn't raise the same level of concern as changing SIGPIPE within the Rust program does.
The arguments @Enselic raise above - I guess it boils down to whether you see Rust as something of a successor to C for writing system tools, or to Go or something for writing portable network apps. Of course it is actually both. There are some less-invasive options; for instance, as documented at https://blog.erratasec.com/2018/10/tcpip-sockets-and-sigpipe.html , using MSG_NOSIGNAL with send() and setsockopt(SO_NOSIGPIPE) for sockets. I would prefer the principle of least surprise -- after all, we don't try to make the filesystem on Windows case-sensitive in Rust or the Linux one case-insensitive. But I could live with Enselic's proposal, if we address the issue of SIGPIPE in processes spawned by Rust as discussed above.
As long as Rust is changing signal dispositions inside init code in a way that the application cannot suppress or undo, it is fundamentally unusable to implement standard unix utilities that run child processes
So there are two separate issues:

1. What happens with SIGPIPE within the Rust program
2. What happens with SIGPIPE in processes spawned by the Rust program
@richfelker @jgoerzen I suspect you have missed the existence and functionality of `#[unix_sigpipe = "inherit"]`?
If it is used, the Rust runtime will

- not touch `SIGPIPE` at all before `fn main()` is called
- not touch `SIGPIPE` at all before `exec`ing child processes

If you still have concerns with this in mind, please let me know. I value your (and others') input a lot, because I want us to do the right thing here.
(`#[unix_sigpipe = "inherit"]` might be renamed to make it clearer what it does. Maybe `#[unix_sigpipe = "unchanged"]` is a better name, for example.)
@richfelker @jgoerzen I suspect you have missed the existence and functionality of `#[unix_sigpipe = "inherit"]`?
Yes, I think so. It looks like this makes it possible to write programs that do the right thing entirely, which is good and fixes the show-stopping problem. I still think it's bad to be poking at things like this by default, however.
If there is still a strong desire to keep using this method to suppress `SIGPIPE`, could Rust at least adopt the approach I recommended above, where, rather than switching the disposition to `SIG_IGN`, you switch it to a no-op handler if it's not already `SIG_IGN`?
This avoids any need to handle exec-time specially, but gives the same outward behavior as what you're already doing.
@richfelker @jgoerzen I suspect you have missed the existence and functionality of `#[unix_sigpipe = "inherit"]`?
No (though surfacing that to people is something we need to get better at). I perhaps wasn't very clear. I was trying to say:
I hope that is more clear. Thanks for asking!
If there is still a strong desire to keep using this method to suppress `SIGPIPE`, could Rust at least adopt the approach I recommended above, where, rather than switching the disposition to `SIG_IGN`, you switch it to a no-op handler if it's not already `SIG_IGN`?
I tried it out, which raised some questions. I have added them as unresolved questions for `#[unix_sigpipe = "sig_ign"]` to the tracking issue. Let's continue discussing this over at the implementation PR.
Another argument against changing the default is that according to the 2023 survey Rust is used more for network/cloud programming than for writing textual command line tools. And networked code want SIG_IGN.
(I'm stumbling into this thread because one of my projects uses GLib, which has for many years changed SIGPIPE in its socket code for similar reasons; I also use Rust in others and was comparing what they do as part of a recent bug.)
However, I didn't see anyone comment on this yet: I would probably say that we could require that `MSG_NOSIGNAL` exists and works on the platforms we care about. It's already used by std. But of course, the big open question is how much Rust networking code out there not using std may not be passing that flag to libc/rustix calls. I'm not advocating changing the default (the blast radius seems potentially high), but it does seem worth a push to do an audit to ensure `MSG_NOSIGNAL` is in use in general.
If you really really don't want `SIGPIPE` to happen, without applications having to take their own measures to stop it, there is a right way that does not clobber any state. For each `write` or equivalent (note: `send` and `sendto` can just use `sendmsg` as their backend with the `MSG_NOSIGNAL` flag), do the following:

1. Block `SIGPIPE`.
2. Perform the `write`.
3. If it failed with `EPIPE`, call `sigwaitinfo` to consume the `SIGPIPE` while it's still blocked.
4. Restore the old mask.

This procedure is entirely portable and thread-safe, because signal masks are thread-local and `SIGPIPE` is generated synchronously for the thread, not for the process.
This approach is interesting, but the issue with it is that you must now apply it to every single thing that calls `write()` and might be hitting a pipe, even `File` due to named fifos. And that wouldn't just be `std` code but also libraries that roll their own I/O via libc calls. Which also means everything that does I/O now pays increased syscall costs (which have gone up due to CPU vulnerability mitigations).
All that just to keep intact some global state that maybe ought to be inherited by child processes.
An equivalent to `MSG_NOSIGNAL` for `write()` would be much more palatable, but that does not appear to exist.
Another problem with this is it is expensive; introducing multiple system calls for every call to write() -- which often occur in a tight loop -- is quite undesirable.
Back in 2014, the Rust startup code (which runs before main) started ignoring SIGPIPE by default: https://github.com/rust-lang/rust/pull/13158
This dates back to when Rust had a runtime and a green-threads implementation, and more magic handling of I/O. The pull request mentions that it "brings the behavior inline with libgreen".
This doesn't seem to be documented anywhere, as far as I can tell; at a minimum, this needs documentation.
But I'd like to question whether we should still do this by default. See https://github.com/rust-lang/rust/issues/46016 for some recent discussion of this. This changes the default behavior of pipes from what people might expect when writing UNIX applications. Rust currently resets the signal mask in a child process, but that's only if you use Rust to launch the child process rather than calling a library to do so. And in addition to that, signal handlers are process-wide, and people might not expect Rust to change process-wide state like this. This also means that Rust libraries will not have the same behavior as Rust applications.
Either way, this is something we need to document far better. Some people will want one behavior, and some people will want the other, so whichever we use as the default needs careful documentation and a documented procedure to change it.
EDIT (based on a suggestion from @Centril): Perhaps we could control this with an attribute on `main` or on the top-level module?