skeeto / w64devkit

Portable C and C++ Development Kit for x64 (and x86) Windows
The Unlicense
2.66k stars 185 forks source link

[busybox] alias should probably disable interrupts #140

Open avih opened 2 weeks ago

avih commented 2 weeks ago

While using my own cmd.exe wrapper to $COMSPEC (as a workaround for the autotools cmd //c ... issue), I noticed that ctrl-c interrupts the wrapper but not the wrapped cmd.exe, which results in the wrapper exiting while $COMSPEC keeps running (at least when launching cmd.exe in interactive mode).

This is bad, because the calling process which invoked the wrapper thinks it's done, while in fact the wrapee keeps running.

A ctrl-c event is dispatched to all the processes attached to the current console, and not only to the current "foreground" process.

I don't have an explicit test case for any of the existing aliases in w64dk (either to busybox.exe or the other utils), but I'm pretty sure it can happen.

One case comes to mind which doesn't necessarily exit immediately on ctrl-c is make (specifically if using mingw32-make.exe which is an alias), at which case the alias can exit while make is still running, which is bad, for instance because its output prints while ash is back at the prompt (if invoked from an interactive sh).

Another such case could be interactive (busybox) sh.

Symptoms of such case is "shell limbo" where, after ctrl-c terminated the alias but not the wrapee, key presses go both to the wrapee and to the shell, but in a weird way and neither is functional until something kills the wrapee manually from elsewhere.

I don't know for a fact that the w64dk [busybox] alias suffer from this, but I think it does because I don't identify at the code any measures to handle this case.

To solve it, in my cmd.exe wrapper I initially used SetConsoleCtrlHandler(NULL, TRUE) which is dcumented as:

If the HandlerRoutine parameter is NULL, a TRUE value causes the calling process to ignore CTRL+C input, and a FALSE value restores normal processing of CTRL+C input. This attribute of ignoring or processing CTRL+C is inherited by child processes.

However, this has two bad issues:

  1. As documented, it's inheritable, which means that the wrapee would not be interruptible with ctrl-c unless tey explicitly set a handler of their own - which many apps don't do (but cmd.exe specifically apparently does).
  2. It only handles ctrl-c, but not other signals, like ctrl-break (on some KB ctrl-pause), so ctrl-break also exits the wrapper but not the wrapee (at least with cmd.exe wrapper).

The solution, apparently is to use an actual handler instead of NULL, similar to this:

BOOL WINAPI sighandler(DWORD dwCtrlType)
{
    return TRUE;
}

In my tests this indeed disables ctrl-c and ctrl-break from killing the wrapper, while not affecting the wrapee. I.e. the wrapee by default is still interruptible normally.

Consequently, the wrapper exits if and only if the wrapee exits, after WaitForSingleObjects and using an explicit exit.

This specific handler ignores all control events, which include, in addition to ctrl-c and ctrl-break, also close-window, logoff (of some unspecified user), and shutdown (supposedly only to services).

I think that's OK, because according to the docs, after CTRL_CLOSE_EVENT, CTRL_LOGOFF_EVENT, or CTRL_SHUTDOWN_EVENT, the process would be killed regardless if the handler returned TRUE or FALSE, so it's only an opportunity for cleanup before returning.

FYI.

skeeto commented 2 weeks ago

Thanks, this is useful information! I haven't had experience dealing with SetConsoleCtrlHandler before. After a few tests I could reliably reproduce a shell limbo situation. I wrote a wrappee that does ignores CTRL-C:

SetConsoleCtrlHandler(0, 1); for (;;) { puts("test"); Sleep(100); }

If I then wrap this using the current alias.c, pressing CTRL+C puts it into shell limbo. Bingo.

The solution, apparently is to use an actual handler instead of NULL

That looks right to me.

It seems there's a race condition inherent in the API. By ignoring CTRL-C before CreateProcess, there's a small window where CTRL-C inputs are lost even if normally it would terminate. If instead we set a null handler just after CreateProcess returns, an early CTRL-C isn't lost, but there's a brief window that may create shell limbo, which is worse.

If the handler, say, sets an atomic so we could detect that a CTRL-C had occurred, we still couldn't determine if it happened before CreateProcess, and whether it was delivered to the child. We would need an API to create a process and change the CTRL handler atomically. (If only CTRL-C delivery could be temporarily masked…)

There's also the matter that the child may be intending to ignore CTRL-C, but hasn't set it up yet, but that's not a big deal. It's not unreasonable that a quick CTRL-C may terminate a process intending to ignore it after initialization.

I have a candidate ready for these changes: c30b715. I want to test it for awhile before merging, both myself and anyone else interested in providing feedback.

w64dk

That's it! I'm officially adopting this as a shorthand.

avih commented 1 week ago

I haven't had experience dealing with SetConsoleCtrlHandler before

Same, though apparently busybox-w64 does use it in some cases, which I noticed only after I already added it in my cmd.exe wrapper.

After a few tests I could reliably reproduce a shell limbo situation. I wrote a wrappee that does ignores CTRL-C:

I used an almost identical test wrapee, including the Sleep(100) ;)

It seems there's a race condition inherent in the API. By ignoring CTRL-C before CreateProcess, there's a small window where CTRL-C inputs are lost even if normally it would terminate. If instead we set a null handler just after CreateProcess returns, an early CTRL-C isn't lost, but there's a brief window that may create shell limbo, which is worse.

Yes, but to be honest, I don't know which should be considered worse.

At least with the limbo you can, with some effort, kill the wrapee manually and you're good to go.

But ctrl-c getting ignored, especially with repeating short-lived processes - where presumably most of the time in alias is spent at the ("protected") CreateProicess, can lead to making it hard to kill with ctrl-c.

But then again, such repeating short-lived processes may be actually worse with the limbo scenario, so maybe yes, the limbo overall is worse (I also initially added the NULL handler after CreateProcess, then moved it to before, fot shis reason),

I have a candidate ready for these changes: c30b715. I want to test it for awhile before merging, both myself and anyone else interested in providing feedback.

I don't mind trying it out, but I didn't setup a build env of w64dk(!) itself, and I prefer not start doing that now.

Maybe you can create a new release and mark it as experimental or some such?

avih commented 1 week ago

I have a candidate ready for these changes: c30b715

I think you want it also in busybox-alias.c ?

avih commented 1 week ago

If the handler, say, sets an atomic so we could detect that a CTRL-C had occurred, we still couldn't determine if it happened before CreateProcess, and whether it was delivered to the child

True, but then we could also deliver it to the child ourselves using GenerateConsoleCtrlEvent regardless, and the child(ren) would do with that whatever it wants, but we'd know for a fact that ctrl-c and ctrl-break are never ignored.

This might have a negative side effect (if it was already delivered to the child) for apps which respond to ctrl-c with "press ctrl-c again to exit", but this would happen so quickly after startup that it's probably valid to say that it could have happened also before the child sets the ctrl-c handler.

Do note that the control handler runs in a different thread, so an atomic var would be fine, but more complex things would require care.

I have a hunch that there might still be a short window where ctrl-c happens before CreateProcess is complete (so the child doesn't get it), but the main thread already tested whether the atomic is set, so it doesn't realize it should re-broadcast it. E,g, maybe due to the handler getting called some time after the ctrl-c event actually happened due to event propagation etc.

But even if that can happen, my hunch is that it's a way smaller window than without doing such re-broadcase.

Maybe add Sleep(0) after CreateProcess (and before checking the atomic var) to ensure the atomic has a chance to change before we start WaitForSIngleObject?

Also, I don't know if that's possible and/or could help, but maybe creating the process suspended and then resuming it can help in making it more tight?

@rmyorston any thoghts if this can be handled hermetically? (on one hand ensure ctrl-c is never lost, but OTOH prefer not to deliver it twice)

avih commented 1 week ago

Hmm.. for a moment I thought CreateProcess with flag CREATE_NEW_PROCESS_GROUP would solve everything atomically without any additional effort, because the control events are dispatched to a process group, so a ctrl-c event would be dispatched either in our group if the child was not yet created (and no need to SetConsoleCtrllHandler), or to the new group if it was created.

However, the last line of the CREATE_NEW_PROCESS_GROUP docs state:

If this flag is specified, CTRL+C signals will be disabled for all processes within the new process group.

So unless we can re-enable that, or re-dispatch control events to the new group somehow so that they're not ignored, then it does exactly the thing we don't want it to do...

avih commented 1 week ago

Here's some empiric info:

Using this ctrl handler:

atomic_uint ctrl_events = 0;

BOOL WINAPI ctrl_handler(DWORD dwCtrlType)
{
    // ctrl-c is 0, ctrl-break is 1
    if (dwCtrlType < 2)
        ctrl_events |= 1 << dwCtrlType;
    return TRUE;
}

and this process creation:

    SetConsoleCtrlHandler(ctrl_handler, TRUE);
    Sleep(1000);
    if (CreateProcess(cmd, line, 0,0,0,0,0,0, &si, &pi)) {
        Sleep(1);
        if (ctrl_events & (1 << 0)) GenerateConsoleCtrlEvent(0, 0);
        if (ctrl_events & (1 << 1)) GenerateConsoleCtrlEvent(1, 0);
        WaitForSingleObject(pi.hProcess, INFINITE);
        ...

The Sleep(1000) is to allow a simulation of ctrl-c after we disabled it but before the child is created.

Pressing ctrl-c right after running the wrapper (inside the 1000ms sleep) indeed kills the child right after it's created.

Interestingly, removing the Sleep(1) (still with early ctrl-c inside the 1000ms sleep) doesn't kill the child (additional ctrl-c once the child is already runnign does kill it).

GenerateConsoleCtrlEvent does get called in that case (without sleep 1), but presumably it's too early and the child doesn't get killed - which I find weird because CreateProcess already returned, but the behavior seems to be consistent.

Also interesting, replacing Sleep(1) with Sleep(0) also doesn't kill the child on early ctrl-c...

I don't know whether the value 1 is always enough, but I'm guessing yes, because unlike Sleep(0), I think 1 is relinquishing the slice even if there are no other thread is pending.