Open Enselic opened 2 years ago
I'm worried that this creates a weird coupling between the language and standard library. The #[unix_sigpipe]
attribute is not really a language attribute, it's an attribute to configure the standard library's initialization code, resulting in this weird sigpipe: u8
argument being passed to lang_start
(on all platforms).
I'd really like to see a future where, on stable Rust, lang_start
can be defined by other crates than just std
. But that means we'd have to stabilize its signature, which will now have to include this very specific sigpipe
argument. Also, such a crate would not have the possiblity to expose its configuration in the same way.
Can we think of a better (less 'magical') way to pass configuration to the runtime init function? (The simplest possibility is just a regular function call (like std::env::inherit_sigpipe()
at the start of main()
), but that would only happen too late, after the init code already disabled sigpipe.)
The simplest possibility is just a regular function call (like std::env::inherit_sigpipe() at the start of main()), but that would only happen too late, after the init code already disabled sigpipe.
Does that matter? Except for pre-main code causing a panic (which would write to stderr), the only way to actually trigger SIGPIPE is to read/write a broken pipe yourself, which you would do after calling inherit_sigpipe()
.
I was going to say that inherit_sigpipe
has no way to go back to the inherited state, because that was lost when std
called sigaction
. But actually, if std
saves the original signal state in a global, then I guess inherit_sigpipe
could just restore whatever it was (either SIG_DFL
or SIG_IGN
). That should be equivalent, right?
So it's kind of silly to have to call sigaction
twice to get the same behavior as not calling it as all, but maybe it's an acceptable tradeoff to avoid having this weird attribute and parameter to lang_start
.
So it's kind of silly to have to call
sigaction
twice to get the same behavior as not calling it as all,
Yeah, exactly.
but maybe it's an acceptable tradeoff to avoid having this weird attribute and parameter to lang_start.
Maybe. I really don't like the idea of a minimal Rust program doing more syscalls than necessary, but I also don't like the magical sigpipe attribute.
Does that matter? Except for pre-main code causing a panic (which would write to stderr), the only way to actually trigger SIGPIPE is to read/write a broken pipe yourself, which you would do after calling inherit_sigpipe().
if
std
saves the original signal state in a global, then I guessinherit_sigpipe
could just restore whatever it was (eitherSIG_DFL
orSIG_IGN
). That should be equivalent, right?
Not really. Any other process can send SIGPIPE by calling kill(1) or kill(2) directly at any time. If the target Rust program received the signal between the two syscalls, it would survive the signal which is supposed to kill the process.
Is there any legitimate reason to send SIGPIPE to another process? The use case this attribute was introduced for is killing the process when writing to a broken pipe.
Not sure, but if you implement standard utilities like ls
and cat
and want them to be 100% compliant to POSIX, you're not allowed to change any inherited signal disposition.
Not sure, but if you implement standard utilities like
ls
andcat
and want them to be 100% compliant to POSIX, you're not allowed to change any inherited signal disposition.
That seems like an extreme point of view.
But I suppose you're right.
Default Behavior: When this section is listed as "Default.", or it refers to "the standard action for all other signals; see Utility Description Defaults " it means that the action taken as a result of the signal shall be one of the following:
The action shall be that inherited from the parent according to the rules of inheritance of signal actions defined in the System Interfaces volume of POSIX.1-2017.
When no action has been taken to change the default, the default action shall be that specified by the System Interfaces volume of POSIX.1-2017.
The result of the utility's execution is as if default actions had been taken.
A utility is permitted to catch a signal, perform some additional processing (such as deleting temporary files), restore the default signal action (or action inherited from the parent process), and resignal itself.
Modified idea: instead of specifying SIG_IGN
, std could register a SIGPIPE
handler that does something like received_sigpipe = true
. When inherit_sigpipe
is called, it would switch back to the original disposition. After this, if received_sigpipe
is true, then it would call raise(SIGPIPE)
. That way, the signal is not lost, just delayed, which is allowed by the specification.
This has the added advantage that we no longer need to remember to revert SIGPIPE
to SIG_DFL
when execing a process, since this happens automatically for caught signals. This would be an observable change for programs calling exec without using std::process
, though, since they get SIG_IGN
today.
fn lang_start()
and sigpipe: u8
I fully support the long-term goal of allowing any crate to define fn lang_start()
. And I agree we can't have the sigpipe: u8
argument on fn lang_start()
if we stabilize its signature for that purpose.
As long as there is only one fn lang_start()
to choose from, I think the best approach is to parameterize its SIGPIPE
behavior.
However, if one of many fn lang_start()
can be chosen, we don't need any sigpipe: u8
parameter any longer. Instead, different implementations can be chosen. I imagine it would work similar to how choosing panic strategy works, where we currently have both
libpanic_abort-84c045833028a0b3.rlib
and libpanic_unwind-5c5363659220970d.rlib
in the sysroot, and which one that is chosen is controlled by the codegen flag -Cpanic=...
.
Similarly, I think we could add support for something like -Cruntime=sigpipe_kills
which would link against liblangstart_sigpipe_sig_dfl-625a57f.rlib
. With such a mechanism, we would not need any sigpipe: u8
parameter in the fn lang_start()
signature.
Edit: After thinking some more on the next steps, I think the natural next step is to get rid of the sigpipe: u8
parameter to fn lang_start()
already now. I will look into it.
Separately from above is the question of the UI for this. Should we use an attribute on fn main()
or only a compiler flag?
When this was discussed previously the preference was to have an attribute so that the source code on its own defines the behavior of the program as much as possible.
But if you prefer a compiler flag instead @m-ou-se I would be happy to change to that. Would you prefer me to change it?
Could you have the Unix lang_start
call some function rust_setup_sigpipe
, defined with weak linkage to call signal(SIGPIPE, SIG_IGN)
? And then have the attribute redefine it with strong linkage to be a no-op?
If it works, it would:
lang_start
immediately.(By the way, here I'm assuming that we only stabilize an attribute that skips all SIGPIPE
configuration altogether ("inherit"), rather than stabilize a SIG_DFL
option. Anyone asking for SIG_DFL
really wants "inherit". And if I'm wrong about that, well, you they always get SIG_DFL
on their own if they have "inherit". But the reverse is not true--if Rust doesn't offer "inherit", then there's no way to get it after the fact.)
For anyone else seeing this on nightly:
error: cannot find attribute `unix_sigpipe` in this scope
--> src/main.rs:137:3
|
137 | #[unix_sigpipe = "sig_dfl"]
| ^^^^^^^^^^^^
error[E0635]: unknown feature `unix_sigpipe`
--> src/main.rs:1:12
|
1 | #![feature(unix_sigpipe)]
| ^^^^^^^^^^^^
It seems like this feature is gone and is replaced with a compiler flag: https://github.com/rust-lang/rust/pull/124480
Note that when doing a web search for these language features or compiler flags, it's not obvious which version of the Unstable Book you're looking at.
Nightly: https://doc.rust-lang.org/nightly/unstable-book/compiler-flags/on-broken-pipe.html Beta: https://doc.rust-lang.org/beta/unstable-book/language-features/unix-sigpipe.html
Turns out making this a compiler flag is also not great, at least in the context of build systems like rustc bootstrap itself that build a ton of stuff. It's non-trivial to figure out when to set and when to not set this compiler flag, and a lot of time has been spend tweaking this logic in bootstrap and it's still wrong -- leading to a whole sequence of issues of the form "when I do A and then B, bootstrap rebuilds more than it should", often caused by some difference in RUSTFLAGS, such as this flag. It makes a lot more sense to annotate this in the program that wants the behavior (e.g. an attribute of the main function, or a std API invoked early in main) than in the build system that has to build tons of different programs that may want different values for this flag.
Also conceptually, this is a library thing, not a compiler thing, so having it be a compiler flag seems odd.
See here for more.
I agree this is a library problem and not a compiler or language problem. The current solution I envision involves adding libraries named something like
libsigpipe_error-91d1108bf6191a61.rlib
libsigpipe_kill-4b832a03827ff95e.rlib
to the sysroot. Similar to how we already have
libpanic_abort-91d1108bf6191a61.rlib
libpanic_unwind-4b832a03827ff95e.rlib
So choosing how SIGPIPE will behave means choosing what library to use.
What I can't currently envision is how we could allow choosing what SIGPIPE library to use without introducing a new compiler flag like -Con-broken-pipe=...
, similar to -Cpanic=...
that already exist. But there might be a solution.
Why does it have to be a compile-time decision at all? Why can't we have std::os::unix::set_sigpipe_behavior(...)
to be invoked early in main
?
Because SIGPIPE is setup before main
is invoked. So there'd be no way to change SIGPIPE disposition 0 times. It would be changed two times. Which
Note that that workaround can already be used (and is already quite widely used), except the helper function is not provided by libstd.
Well at some point you pick a tradeoff. Personally, I don't think the complexity of a compiler switch is justified here. Maybe one day we will have a lang/compiler feature that can carry this, rather than needing a bunch of special support just for this feature. For instance, it seems to me this could be done with this "overrideable defauly fn" mechanism that is intended to generalize panic handlers and the global allocator.
But the status quo is actively painful for rustc development, as noted above.
Note that that workaround can already be used (and is already quite widely used), except the helper function is not provided by libstd.
No. There is no stable workaround today. The correct behavior for Unix tools that want to behave like Unix tools is to inherit the signal disposition from the parent process. Rust throws away the original disposition, and it cannot be recovered.
I still think a reasonable tradeoff is to save the original disposition and provide some way of restoring it via a function in early in main(). It's not perfect (there's a race window where an externally-injected SIGPIPE will be ignored), but it's a hell of a lot simpler than the perfect solutions. I don't think the seccomp issue is a real one; it seems perfectly reasonable to say that Rust programs must be able to call sigaction
.
(I also think we should do a better job of steering users toward "inherit" in the docs. Right now, the docs recommend "kill" for programs producing textual output. This is not the default behavior for C (which is "inherit"), it is not a standard convention/recommendation for C programs (programs typically either explicitly ignore SIGPIPE or do nothing and so inherit the disposition), and it is not aligned with POSIX requirements for standard inbox tools (which requires that tools inherit the SIGPIPE disposition). I don't think Rust has any good reason to recommend against existing convention here.)
What about having libstd mask off SIGPIPE by default rather than disabling is entirely and then provide a function to restore the SIGPIPE mask to the original value? If there is a SIGPIPE injected before the restore function is called, it will still trigger once SIGPIPE is unmasked by the restore function. All tools that want to be compatible with the unix behavior have to do then is call this restore function at the top of main
.
What about having libstd mask off SIGPIPE by default rather than disabling is entirely and then provide a function to restore the SIGPIPE mask to the original value? If there is a SIGPIPE injected before the restore function is called, it will still trigger once SIGPIPE is unmasked by the restore function. All tools that want to be compatible with the unix behavior have to do then is call this restore function at the top of
main
.
If the restorer is not called, then SIGPIPE would remain masked indefinitely, which would be strange. The mask would be inherited across fork and exec, thus exporting this strangeness to child processes, even non-Rust ones.
Also, any Rust program currently working around this issue by calling signal(SIGPIPE, SIG_DFL)
would be re-broken by this.
The mask would be inherited across fork and exec, thus exporting this strangeness to child processes, even non-Rust ones.
That also happens with disabling the signal rather than masking it, right?
The mask would be inherited across fork and exec, thus exporting this strangeness to child processes, even non-Rust ones.
That also happens with disabling the signal rather than masking it, right?
Yes, I didn't mean to imply that signal dispositions are not inherited. Rather, by switching to masking SIGPIPE
instead of setting its disposition to SIG_IGN
, the child will still inherit the block, but now in an observably different and more unusual way, which may require the (potentially non-Rust) child to handle the situation differently. For example, using sigaction
to install a custom signal handler or reset it to SIG_DFL
would no longer work properly. And even if the child does go the extra mile and unmask the signal, it now risks getting killed by a SIGPIPE
that was triggered by its parent and has already been duly handled as EPIPE
.
I'd say that @jstarks has already put forward the correct, idiomatically POSIX solution in https://github.com/rust-lang/rust/issues/97889#issuecomment-2010074982
↓↓↓↓ Important ↓↓↓↓
The ui changed from an attribute to a compiler flag: https://github.com/rust-lang/rust/pull/124480, so the below description is out of date. Someone (maybe me) should update the description.
↑↑↑↑ Important ↑↑↑↑
The feature gate for the issue is
#![feature(unix_sigpipe)]
. It enables a newfn main()
attribute#[unix_sigpipe = "..."]
.Usage
Any simple Rust program that writes a sizeable amount of data to stdout will panic if its output is limited via pipes.
To prevent panicking we can use the new attribute:
More Info
Please refer to the unstable book section for more details. In short:
#[unix_sigpipe = "..."]
sig_ign
SIGPIPE
handler toSIG_IGN
before invokingfn main()
. Default behaviour since 2014.sig_dfl
SIGPIPE
handler toSIG_DFL
before invokingfn main()
.inherit
SIGPIPE
handler untounched before enteringfn main()
.The problem with the current
SIGPIPE
code in libstd as well as several other aspects of this problem is discussed extensively at these places:Naming convention
The naming follows the convention used by
#![windows_subsystem = "windows|console"]
where the values"windows"
and"console"
have the same names as the actual linker flags:/SUBSYSTEM:WINDOWS
and/SUBSYSTEM:CONSOLE
.The names
sig_ign
andsig_dfl
comes from the signal handler namesSIG_IGN
andSIG_DFL
.Steps
sigpipe::DEFAULT
distinct)#[unix_sigpipe = "sig_dfl"]
#[unix_sigpipe = "sig_ign"]
#[unix_sigpipe = "inherit"]
rustc_driver::set_sigpipe_handler()
#[unix_sigpipe = "inherit"]
that test that the disposition is actually inherited, rather than assuming SIG_DFL shall always be inherited.Unresolved Questions That Blocks Stabilisation
[ ] How can we make it easy to put
fn lang_start()
in an external crate that can be compiled with stable?[ ] We should rename the attribute and attribute values to things that reflect what they do rather than how they do it.
#[unix_sigpipe = "sig_dfl"]
#[unix_sigpipe = "sig_ign"]
SIG_IGN
, but arguably they should getSIG_DFL
since that is what most programs assume, and we explicitly made it that way before.exec
, see https://github.com/rust-lang/rust/pull/121578 for the trick.#[unix_sigpipe = "inherit"]
unchanged
?Unresolved Questions That Does Not Block Stabilisation
Because these questions can be resolved incrementally after stabilization.
SIGPIPE
, if we want to do it at all?Resolved Questions
-Z unix_sigpipe
flag instead, see https://rust-lang.zulipchat.com/#narrow/stream/219381-t-libs/topic/Proposal.3A.20First.20step.20towards.20solving.20the.20SIGPIPE.20problem/near/285499895, at least not initially.sigpipe: u8
argument tofn lang_start()
on Unix platform viacfg
? Answer: No, this is not allowed, see top level comment in https://github.com/rust-lang/rust/blob/master/src/tools/tidy/src/pal.rssig_dfl
or isinherit
andsig_ign
sufficient? Answer: There are noteworthy examples of real projects that has opted to useSIG_DFL
to solve theBrokenPipe
problem. Notably rustc itself. So if we don't stabilizesig_dfl
, such projects can't make use of our new attribute. Therefore, we also need to stabilizesig_dfl
.Answer: It makes a lot of semantic sense to have the attribute on fn
main()
, because it is a way to configure what the Rust runtime should do beforefn main()
is invoked. For libraries, no entry point code that modifiesSIGPIPE
is generated, so allowing the attribute in these situations does not make much sense. See https://github.com/rust-lang/rust/pull/101077#issuecomment-1242972299 for small-scale discussion.lto
to remove the _signal stub code completely? With abool
it works (see https://github.com/rust-lang/rust/pull/97802#discussion_r893965842), but with the currentu8
we might need to do some tweaks. Answer: There are currently 4 values and I see no feasible way to reduce it to 2.fn lang_start()
is not relevant. And a stable Rust ABI is not even close (see https://github.com/rust-lang/rfcs/issues/600).MSG_NOSIGNAL
withsend()
etc instead of settingSIGPIPE
globally? Answer: No, because there is no equivalent forwrite()
, and it would incur an extra syscall for each write-operation, which is likely to have significant performance drawbacks.Disclaimer: I have taken the liberty to mark some questions resolved that I find unlikely to be controversial. If you would like me to create a proper discussion ticket for any of the resolved or unresolved questions, please let me know!
About tracking issues
Tracking issues are used to record the overall progress of implementation. They are also used as hubs connecting to other relevant issues, e.g., bugs or open design questions. A tracking issue is however not meant for large scale discussion, questions, or bug reports about a feature. Instead, open a dedicated issue for the specific matter and add the relevant feature gate label.
@rustbot label +T-libs