Closed faern closed 7 years ago
Thanks for doing all this work! Some design thoughts. I have a few worries about exposing a terminate
method:
kill
or terminate
in the most common case. (Though there might actually be some debate about what the right answer is here >.< )terminate
is hard to describe to folks not familiar with signals, because there are so many exceptions. In particular, it can't guarantee that the child will actually exit in a reasonable amount of time. I think the correct way to use terminate
in most cases is to call it, give the child a second to quit, and then kill
it if it hasn't quit by then. But I expect the vast majority of callers will actually use terminate
by itself, even when they shouldn't.terminate
is that it gives delicate apps a chance to clean themselves up and save state before exiting. But even that isn't guaranteed, because of Windows. If we expect people to reach for terminate
specifically when they care about cleanup, this might be a portability footgun.So I'm leaning towards only providing the signal
function, and (as you have it now) only on Unix. It covers all the same use cases, but it makes the caveats above much more explicit. What do you think?
Side question, what are the benefits of using an extension trait vs just a #[cfg(not(windows))]
gated method? I know the stdlib only uses traits, but I don't know why.
I agree with your concern about confusion over terminate
. It would make much more sense if there was a cleaner way to terminate processes on Windows as well. From what I have heard, one can send a ctrl+c equivalent event to a process and hope it responds to that. What do you think about such a solution? There is no guarantee that a process will honor that, but the same holds for SIGTERM on unix so...
Yes, depending on the use case, one probably want to call terminate
, have a timeout, then call kill
if the process is still running.
I can remove the terminate
method if we don't feel it's justified at the moment, just adding the possibility to signal on unix gives the same power to the user, but with some extra code on their side.
About Ext
trait vs conditional method: In stdlib all those extension traits have paths like std::something::unix::FooExt
making it clear they are for unix only. Also in the docs the platform specific functions will be grouped together at the bottom under the trait impl section. Personally I feel it would be much more confusing if they suddenly disappeared from the normal list of methods on some platforms. With that solution one would have to clearly write in the docs for the method that it only exists on unix. With this solution you would get an import error when trying to import the trait on Windows.
To enable the terminate -> timeout -> kill workflow it might be useful to expose a wait_timeout
method as well. I'm not sure if there is a system call that allows that to be implemented in a non-polling manner on unix. The subprocess
crate seems to use polling on unix and win32::WaitForSingleObject
on Windows.
A combination of these could be to expose a terminate(Duration)
that does the above, but all built in. The state after calling that would be guaranteed to be the same as after a call to kill
, only difference is that the process would be given a duration for graceful exit first.
We could create another SharedChildExt
in the windows module as well that would expose a generate_console_ctrl_event(CtrlEvent)
that would make use of kernel32::GenerateConsoleCtrlEvent
to send either ctrl+c or ctrl+break.
With these two ext traits we could have the common terminate
(that would use either ctrl+c or SIGTERM). Thoughts on this?
EDIT: Ok, that was not as straight forward as I had hoped. Sending those events is messy.
In stdlib all those extension traits have paths like
std::something::unix::FooExt
making it clear they are for unix only.
Got it, that makes sense.
kernel32::GenerateConsoleCtrlEvent
Very interesting, I didn't know about that one. Do you know whether there are any gotchas? (Like, if I create a simple GUI app with all the defaults in Visual Studio or whatever, is it going to exit when it gets the console event? Or does it have to add special handling or be in console mode or anything like that?)
We could also go with a signals on Unix only + ctrl event on Windows only approach, especially if the behavior of the two was different enough to be confusing. Though I think child handling is a little easier on Windows in general, because you don't have the same race conditions, so we wouldn't necessarily be adding much there besides a safe wrapper for the win32
functions.
To enable the terminate -> timeout -> kill workflow it might be useful to expose a
wait_timeout
method as well.
I looked into this sort of thing in the very beginning, before I knew about libc::waitid
at all, as another way to avoid the PID race. As far as I can tell, there's no native support on Unix for anything in between "block forever" and "don't block at all", and everyone implements this by looping with sleeps. It's kind of sad that SIGCHLD exists and solves this problem, but library code can't use it. (Windows does support this timeout though. Honestly I respect the Windows kernel a lot more now having done this project.)
This sleep loop runs into one of those weird-edge-cases-that-make-me-feel-icky though: You might expect wait_timeout(huge_duration)
to be equivalent to wait()
, but it isn't. They're similar at first, but after time passes and the sleeps get longer, there can be a big delay between when your child process exits and when you actually wake up to check it again. A lot of callers probably don't care about that difference, but if your child process is serving requests or something, it might be really bad to leave it dead for ten seconds before you notice it crashed.
So kind of like with terminate
, I feel a little skittish about exposing wait_timeout
as a simple-and-correct-sounding method. Especially since the most common use case is probably better implemented by what this library already supports: wait
in one thread and then sleep once and kill
in another thread.
No I don't have any more info or experience with GenerateConsoleCtrlEvent
. After reading the different answers on https://stackoverflow.com/questions/813086/can-i-send-a-ctrl-c-sigint-to-an-application-on-windows I feel it's very messy. One has to attach the console of another process to our process and then generate the event etc. I know very little about process management in Windows and I don't have a very good environment for testing it.
How about this PR focuses on stabilizing the unix signal extension only and merge that. Then we can start experimenting on Windows and see what we come up with.
Btw. Any reason the library uses #[cfg(not(windows))]
instead of #[cfg(unix)]
?
#[cfg(unix)]
pub mod unix;
Would feel more natural. But I did not want to introduce that without asking, since you did not use it before.
The only reason is that I didn't know what would happen on weirdo platforms like Redox, and I figured we might as well say "if it isn't Windows, try the Unix thing". Is there any downside to doing it that way?
This looks great. Just tried it out by hand. Merging now.
Closes #12
A proposal for how to solve #12.
terminate
available everywhere