Unfortunately, this requires foreknowledge that the process will manually enable Ctrl+C -- or trickery (e.g. injecting a DLL).
It's probably not worth going to all that trouble for this use case - it means shipping a separate DLL and doing some slightly tricky stuff to inject it, and this is intended to be a simple launcher, after all!
It's probably not worth going to all that trouble for this use case - it means shipping a separate DLL and doing some slightly tricky stuff to inject it, and this is intended to be a simple launcher, after all!
I was just discussing the functionality in general. There's no reason for the launcher to create a new process group and proxy CTRL_C_EVENT and CTRL_BREAK_EVENT via GenerateConsoleCtrlEvent(). The launcher and the child should be in the same process group. Creating a new process group is up to the launcher's parent.
The launcher should return TRUE for the cancel and break events. For the close, logoff, and shutdown events, I like the idea of waiting on the process handle of the child, without a timeout. Either the child will exit gracefully, or the system will forcefully terminate both the launcher and the child after the relevant timeout (i.e. "HungAppTimeout", "WaitToKillTimeout", or "WaitToKillServiceTimeout").
The same control handler should be set for both the console and GUI launchers. The launcher should be designed such that it will always receive CTRL_LOGOFF_EVENT and CTRL_SHUTDOWN_EVENT in an interactive session. This means avoiding shell APIs that indirectly load "user32.dll". The simple launchers in distlib have no problem here. The full py launcher, however, calls SHGetFolderPathW() to get the user's local application data directory, which links to "shell32.dll", which loads "user32.dll". It's reasonable to simply use GetEnvironmentVariableW(L"LOCALAPPDATA", ...). If the environment variable isn't defined, fall back on RegGetValueW(HKEY_CURRENT_USER, L"Software\\Microsoft\\Windows\\CurrentVersion\\Explorer\\User Shell Folders", L"Local AppData", RRF_RT_REG_SZ | RRF_RT_REG_EXPAND_SZ, ...).
I was just discussing the functionality in general.
The background info you provide is interesting, thanks for sharing.
The launcher should return TRUE for the cancel and break events. For the close, logoff, and shutdown events, I like the idea of waiting on the process handle of the child, without a timeout.
Do you mean without waiting on the child in the Ctrl-C/Ctrl-Break case? Why would they be treated differently from the other cases?
This stuff seems like it should be simple, but isn't, to me at least. For example, this control key handler:
static PROCESS_INFORMATION child_process_info;
#define DELAY_FOR_CHILD_EXIT 5000

/* Console control handlers use the WINAPI calling convention. */
static BOOL WINAPI
control_key_handler(DWORD type)
{
    if (type == CTRL_C_EVENT) {
        GenerateConsoleCtrlEvent(CTRL_C_EVENT, 0);
    }
    /*
     * See https://github.com/pypa/pip/issues/10444
     */
    WaitForSingleObject(child_process_info.hProcess, DELAY_FOR_CHILD_EXIT);
    return TRUE;
}
behaves differently if you omit the GenerateConsoleCtrlEvent call. With the call included, you get this:
C:\Users\Vinay\Projects\simple_launcher>test\test 10
3.8.10 (tags/v3.8.10:3d8993a, May 3 2021, 11:48:03) [MSC v.1928 64 bit (AMD64)]
['C:\\Users\\Vinay\\Projects\\simple_launcher\\test\\test.exe', '10']
c:\python38\python.exe
Press Ctrl-C to exit:
Ctrl-C seen, cleaning up (should take 10 secs) ...
10 steps to go ...
9 steps to go ...
8 steps to go ...
7 steps to go ...
6 steps to go ...
Traceback (most recent call last):
File "C:\Users\Vinay\Projects\simple_launcher\test\test.exe\__main__.py", line
16, in <module>
KeyboardInterrupt
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "c:\python38\lib\runpy.py", line 194, in _run_module_as_main
return _run_code(code, main_globals, None,
File "c:\python38\lib\runpy.py", line 87, in _run_code
exec(code, run_globals)
File "C:\Users\Vinay\Projects\simple_launcher\test\test.exe\__main__.py", line
21, in <module>
KeyboardInterrupt
^C
C:\Users\Vinay\Projects\simple_launcher>
So the cleanup takes 10 seconds but the launcher returns TRUE after 5 seconds, causing the child to terminate after 5. So far, so good. If you comment out the if statement with the GenerateConsoleCtrlEvent call, however, you get this behaviour:
C:\Users\Vinay\Projects\simple_launcher>test\test 10
3.8.10 (tags/v3.8.10:3d8993a, May 3 2021, 11:48:03) [MSC v.1928 64 bit (AMD64)]
['C:\\Users\\Vinay\\Projects\\simple_launcher\\test\\test.exe', '10']
c:\python38\python.exe
Press Ctrl-C to exit:
Ctrl-C seen, cleaning up (should take 10 secs) ...
10 steps to go ...
9 steps to go ...
8 steps to go ...
7 steps to go ...
6 steps to go ...
5 steps to go ...
4 steps to go ...
3 steps to go ...
2 steps to go ...
1 steps to go ...
Cleanup done.
C:\Users\Vinay\Projects\simple_launcher>
So in both cases, the Ctrl-C is seen by the child and launcher, and in both cases the launcher would have returned TRUE after 5 seconds. However, in the second case the child is allowed to run to completion, and in the first it's killed after the TRUE is returned. Why the difference in behaviour? Since the Ctrl-C event is seen by both launcher and child regardless of the GenerateConsoleCtrlEvent call, the call appears not to be needed purely to send the event to the child - so what's really going on? :confused:
Do you mean without waiting on the child in the Ctrl-C/Ctrl-Break case? Why would they be treated differently from the other cases?
There is no need to wait in the cancel and break events. If the control handler returns TRUE for these events, the control thread exits immediately without consequence. The process does not get terminated.
if (type == CTRL_C_EVENT) { GenerateConsoleCtrlEvent(CTRL_C_EVENT, 0); }
This sends a cancel event to all processes in the console session -- including all ancestors, descendants, and the current process. That it gets sent to the current process leads to an endless loop of new control threads started in all processes in the console session, unless the current process happens to get killed by an ancestor due to the cancel event.
OK, I'll go with
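static BOOL WINAPI
control_key_handler(DWORD type)
{
    /* Cancel and break: nothing to wait for, the child handles them itself. */
    if ((type == CTRL_C_EVENT) || (type == CTRL_BREAK_EVENT)) {
        return TRUE;
    }
    /* Close, logoff, shutdown: wait for the child without a timeout. */
    WaitForSingleObject(child_process_info.hProcess, INFINITE);
    return TRUE;
}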
Which allows the child to complete cleanup after a Ctrl-C.
A cleaner solution than waiting would be to simply modify the job object. We know that the child process must either exit or get terminated after a close, logoff, or shutdown event, so the kill-on-close flag no longer matters. For example:
static BOOL WINAPI
control_handler(DWORD type)
{
    switch (type) {
    case CTRL_CLOSE_EVENT:
    case CTRL_LOGOFF_EVENT:
    case CTRL_SHUTDOWN_EVENT: {
        // Allow the child to outlive the launcher, to carry out any
        // cleanup for a graceful exit. It will either exit or get
        // terminated by the session server.
        // (The braces give the declaration its own scope; C does not
        // allow a declaration directly after a case label.)
        JOBOBJECT_EXTENDED_LIMIT_INFORMATION info;
        if (job_handle && QueryInformationJobObject(
                job_handle, JobObjectExtendedLimitInformation,
                &info, sizeof(info), NULL)) {
            info.BasicLimitInformation.LimitFlags &=
                ~JOB_OBJECT_LIMIT_KILL_ON_JOB_CLOSE;
            SetInformationJobObject(
                job_handle, JobObjectExtendedLimitInformation,
                &info, sizeof(info));
        }
        break;
    }
    }
    return TRUE;
}
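The query-then-set sequence in that handler matters because LimitFlags is a bit mask: only the kill-on-close bit should change, and any other limit already configured on the job should survive. A minimal sketch of that bit arithmetic in Python follows. It is a model, not launcher code; clear_kill_on_close is a hypothetical helper name, and the constants are the values from winnt.h as far as I know.

```python
# Job object limit flags (values as defined in winnt.h).
JOB_OBJECT_LIMIT_BREAKAWAY_OK = 0x0800
JOB_OBJECT_LIMIT_KILL_ON_JOB_CLOSE = 0x2000


def clear_kill_on_close(limit_flags):
    """Hypothetical helper: drop only the kill-on-close bit.

    Mirrors `LimitFlags &= ~JOB_OBJECT_LIMIT_KILL_ON_JOB_CLOSE`; every other
    limit that was queried from the job object stays in place.
    """
    return limit_flags & ~JOB_OBJECT_LIMIT_KILL_ON_JOB_CLOSE


# A job configured with kill-on-close plus one unrelated limit.
queried = JOB_OBJECT_LIMIT_KILL_ON_JOB_CLOSE | JOB_OBJECT_LIMIT_BREAKAWAY_OK
updated = clear_kill_on_close(queried)
print(hex(queried), "->", hex(updated))
```

Writing a fresh structure with LimitFlags set to zero would also remove kill-on-close, but it would silently drop the unrelated limits as well.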
Also, consider my suggestion to proxy the child's WaitForInputIdle() event. This event allows a parent that's looking for a particular window (for messaging or UI automation) to wait until the child is ready. If the launcher never creates a window, its input idle event is never set, and the parent might wait indefinitely. Implementing this with proper close and shutdown/logoff support requires an invisible top-level window with a window procedure and message loop. The window procedure needs to handle WM_QUERYENDSESSION, WM_ENDSESSION (logoff/shutdown), and WM_CLOSE. For WM_CLOSE, it should post WM_CLOSE to each of the child's top-level windows and message-only windows, which is the way to request a graceful exit. For example, it's what taskkill.exe /pid <pid> does. For WM_ENDSESSION, just remove the job's kill-on-close flag to let the child exit on its own.
Also, consider my suggestion to proxy the child's WaitForInputIdle() event.
I've already done that, I think, as per your earlier suggestion.
static void
clear_app_starting_state(PROCESS_INFORMATION* child_process_info) {
    MSG msg;
    HWND hwnd;

    PostMessageW(0, 0, 0, 0);
    GetMessageW(&msg, 0, 0, 0);
    /* Proxy the child's input idle event. */
    WaitForInputIdle(child_process_info->hProcess, INFINITE);
    /*
     * Signal the process input idle event by creating a window and pumping
     * sent messages. The window class isn't important, so just use the
     * system "STATIC" class.
     */
    hwnd = CreateWindowExW(0, L"STATIC", L"PyLauncher", 0, 0, 0, 0, 0,
                           HWND_MESSAGE, NULL, NULL, NULL);
    /* Process all sent messages and signal input idle. */
    PeekMessageW(&msg, hwnd, 0, 0, 0);
    DestroyWindow(hwnd);
}
I call this just after calling CreateProcessW and SetConsoleCtrlHandler.
Keep in mind the differences in behavior pointed out if user32 is loaded. It sounds like you'd need to keep a window around to receive messages that won't come via the console ctrl handler anymore, and possibly proxy them to child windows.
A cleaner solution than waiting would be to simply modify the job object. We know that the child process must either exit or get terminated after a close, logoff, or shutdown event, so the kill-on-close flag no longer matters.
I am kind of concerned about some program running the launcher and waiting on its process, not realizing it is a launcher, and expecting that when it exits it is "done". Would that impact this use-case?
Keep in mind the differences in behavior pointed out if user32 is loaded. It sounds like you'd need to keep a window around to receive messages that won't come via the console ctrl handler anymore, and possibly proxy them to child windows.
That's what I was discussing in my previous message. My initial approach to the WaitForInputIdle() problem was too simple-minded. I hadn't considered supporting logoff and shutdown cleanup in an interactive session. If the GUI launcher is connected to a window station, then it needs to create an invisible top-level window (not a message-only window) with a window procedure that handles WM_ENDSESSION. If it's going that far already, it would be nice to also support proxying WM_CLOSE to the child via EnumWindows() (the child's top-level windows) and FindWindowExW() (the child's message-only windows). This will enable graceful termination of both the launcher and the child via taskkill.exe /pid <launcher pid>.
I am kind of concerned about some program running the launcher and waiting on its process, not realizing it is a launcher, and expecting that when it exits it is "done". Would that impact this use-case?
The wait can be retained for the console's CTRL_CLOSE_EVENT. Windows appears to call CtrlRoutine() in each process sequentially and exclusively in reverse attachment order. Thus the child's handler executes to completion before the launcher's handler is called. That said, if the child breaks free of the console session via FreeConsole(), it would be nice for it to be really free instead of still linked to the lifetime of the original console session via the launcher's job object. Maybe for this case the launcher should remove the kill-on-close flag and try to wait for the child until the launcher gets terminated by Windows.
The larger problem that I wish to avoid is that Windows shuts down applications sequentially during session logoff and system shutdown. A parent process might get CTRL_LOGOFF_EVENT or WM_ENDSESSION before its child does depending on their relative shutdown priority. In this case, if the parent is waiting on the child, Windows will view the parent as a hung application and terminate it. I tested this shutdown order problem by increasing the parent's shutdown priority via SetProcessShutdownParameters(0x3FF, 0). In practice it could be that the child lowers its priority to delay shutdown, or that they have the same priority but are called out of order depending on when they attach to a window station. For the GUI test, I made the parent process handle WM_ENDSESSION by waiting for the child (notepad.exe) and logging the child's exit status, but the child doesn't exit before the parent because it hasn't been sent the logoff message yet. The console test is similar except the parent waits for the child (timeout.exe) in its control handler function instead of a window procedure. IMO, since the session is ending, it's simpler for the launcher to just detach itself from the child by removing the job's kill-on-close flag and skip the wait.
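The ordering problem can be modelled in a few lines. This is only a sketch: it assumes the documented rule for SetProcessShutdownParameters() that processes with a higher level are shut down first, with 0x280 as the default level. The function name and the tie-breaking rule (ties keep their input order) are assumptions of the model, since the real order among equal levels is exactly what the paragraph above says cannot be relied on.

```python
DEFAULT_SHUTDOWN_LEVEL = 0x280  # default level when a process never sets one


def notification_order(processes):
    """Order (name, level) pairs the way this model walks them at logoff.

    Higher levels are notified first. Equal levels keep their input order,
    which is an assumption of the sketch and not a Windows guarantee.
    """
    ranked = sorted(processes, key=lambda proc: proc[1], reverse=True)
    return [name for name, _level in ranked]


# The test setup described above: the parent raised its level to 0x3FF.
order = notification_order([
    ("child", DEFAULT_SHUTDOWN_LEVEL),
    ("launcher", 0x3FF),
])
print(order)
```

With the launcher notified first, a handler that blocks on the child's process handle is waiting on a process that has not been asked to exit yet, so the launcher is the one that looks hung.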
I think it would be useful to separate out scenarios that need to be considered by the GUI launcher from those that need to be considered by the console launcher. I'm wary of making things in the launcher too complex for the intended use case - which ISTM are nearly always console scripts. Even when the GUI launcher is used, I doubt whether the child application will consider all of these issues around shutdown as thoroughly as all this, if at all!
Even when the GUI launcher is used, I doubt whether the child application will consider all of these issues around shutdown as thoroughly as all this, if at all!
GUI toolkits should make it easy. For example, Qt (i.e. PyQt) handles WM_ENDSESSION as an aboutToQuit signal that can be connected to a cleanup function. See qwindowscontext.cpp.
I think it would be useful to separate out scenarios that need to be considered by the GUI launcher from those that need to be considered by the console launcher. I'm wary of making things in the launcher too complex for the intended use case - which ISTM are nearly always console scripts.
I think the launcher should try its best to let the child handle cleanup in all cases. That means not letting the launcher's job object forcefully terminate the child. If the launcher's control handler can be depended on to always get called after the child's handler has executed, then the launcher's CTRL_CLOSE_EVENT can just wait on the process, get its exit code, and call exit(rc). I would not depend on a particular order for the logoff and shutdown events, however. In this case, it's reasonable to remove the job's kill-on-close flag and exit with exit(0). The session is ending, so exiting with the child's exit code doesn't matter.
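The per-event behaviour proposed so far can be summarised as a small decision table. The sketch below is a Python model rather than launcher code: the event codes are the ones from the Windows SDK, while handler_plan, the tuple layout, and the "child"/"zero" labels are hypothetical names chosen for illustration.

```python
# Console control event codes (wincon.h).
CTRL_C_EVENT, CTRL_BREAK_EVENT, CTRL_CLOSE_EVENT = 0, 1, 2
CTRL_LOGOFF_EVENT, CTRL_SHUTDOWN_EVENT = 5, 6


def handler_plan(event):
    """Return (detach_job, wait_for_child, exit_code_source) for an event.

    detach_job:       remove the job's kill-on-close flag
    wait_for_child:   block on the child's process handle
    exit_code_source: "child" -> exit(rc), "zero" -> exit(0), None -> keep running
    """
    if event in (CTRL_C_EVENT, CTRL_BREAK_EVENT):
        # Handled by returning TRUE; the launcher is not terminated.
        return (False, False, None)
    if event == CTRL_CLOSE_EVENT:
        # The child's handler is expected to have run to completion already.
        return (False, True, "child")
    if event in (CTRL_LOGOFF_EVENT, CTRL_SHUTDOWN_EVENT):
        # Order is not dependable here, so detach and leave.
        return (True, False, "zero")
    raise ValueError("unknown control event: %r" % (event,))


for name, code in (("cancel", CTRL_C_EVENT), ("close", CTRL_CLOSE_EVENT),
                   ("logoff", CTRL_LOGOFF_EVENT)):
    print(name, handler_plan(code))
```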
If "user32.dll" is loaded, the launcher won't get CTRL_LOGOFF_EVENT or CTRL_SHUTDOWN_EVENT in an interactive session. For example, "user32.dll" is loaded if the launcher proxies the child's WaitForInputIdle(). If it doesn't have a top-level window, the launcher will just get terminated without notice, and the job object will in turn terminate the child. In this case a hidden window, message loop, and window procedure are needed in order to handle WM_ENDSESSION. A side benefit, however, is that once we already have a window procedure, we have the basis to support a graceful exit via WM_CLOSE, proxied to the child. As is, without a top-level window to close, taskkill.exe and the GUI task manager only support forcefully terminating the launcher, which in turn forcefully terminates the child via the job object.
Is there anything I can help you with, e.g. test a proposed solution?
@cbrnr Can you try with the latest launchers from the distlib repo to see if they resolve the issue? Thanks.
Will do! Just to make sure I do it correctly, do I install the latest distlib master and then build with pip wheel? Or is there anything else I need to update?
I'm not sure that'll work, because pip vendors distlib rather than having it as a normal dependency. I would clone the distlib repo, then copy the .exe files into the corresponding place in a pip installation. For example, if you create a venv at c:\Temp\foo then that would be the directory c:\Temp\foo\Lib\site-packages\pip\_vendor\distlib - you can overwrite the .exe files in there. Then you can try installing the above test application and invoking hello.exe, or other windowed application of your choice.
I can confirm that with the latest distlib master my GUI program starts normally 🚀! I did try your test application but no window appeared (but also no error message), but it did work with MNELAB!
When will these distlib changes get vendored into pip?
The latest release of distlib on PyPI (0.3.4, released on 8 December 2021) should have these executables. Please double check with the executables from that release and let me know if there are any issues. I don't know if pip has already vendored this version of distlib - I expect a pip maintainer will chime in to answer your question.
The test application would need pywin32 installed in the venv so that it can display a window - I'm not sure if you'd done that.
@vsajip the latest release v0.3.4 also works, so I guess the vendored distlib could be updated to this version.
The test application would need pywin32 installed in the venv so that it can display a window - I'm not sure if you'd done that.
Yes, this was probably the issue (I didn't install pywin32).
Unfortunately, distlib 0.3.4 (introduced in pip 22) has a severe bug and creates faulty launchers for Windows. Now sys.stderr is None if you start launchers created by pip 22. See:
see also
To make meson work again on windows:
python -m pip install -U pip==21.3.1
python -m pip install meson
@vsajip, wild guess: maybe it happened here?
https://bitbucket.org/vinay.sajip/simple_launcher/commits/4c263ce78a4b6b30274d877780e9525ac2f5c15e
+#if defined(CLEANUP_LAUNCHER_HANDLES)
+ CloseHandle(hOut);
+ /* We might need stderr late, so don't close it but mark as non-inheritable */
+ SetHandleInformation(hErr, HANDLE_FLAG_INHERIT, 0);
+#endif
+ ok = safe_duplicate_handle(hErr, &si.hStdError);
+ assert(ok, "stderr duplication failed");
+ si.dwFlags |= STARTF_USESTDHANDLES;
I think you probably cannot generalize to all launchers for Windows. I just tried pip install mnelab and the launcher works (it didn't prior to v22).
https://github.com/mesonbuild/meson/issues/9955#issuecomment-1030843844
So the best mitigation right now is to use the pip version according to your needs: pip 22 for mne and pip 21 for meson.
@vsajip, wild guess: maybe it happened here?
https://bitbucket.org/vinay.sajip/simple_launcher/commits/4c263ce78a4b6b30274d877780e9525ac2f5c15e
+#if defined(CLEANUP_LAUNCHER_HANDLES) + CloseHandle(hOut); + /* We might need stderr late, so don't close it but mark as non-inheritable */ + SetHandleInformation(hErr, HANDLE_FLAG_INHERIT, 0); +#endif + ok = safe_duplicate_handle(hErr, &si.hStdError); + assert(ok, "stderr duplication failed"); + si.dwFlags |= STARTF_USESTDHANDLES;
I could see that. If stdout and stderr were set to the same handle, then closing hOut could result in hErr now being an invalid (closed) handle. It might be as simple as
if (hOut != hErr)
    CloseHandle(hOut);
It might be as simple as
if (hOut != hErr) CloseHandle(hOut);
Indeed, that's worth trying. I'm out of the country for a little while - will get to it when I get back.
Hmmm. See detailed analysis of this and related problems by Michael Bikovitsky here.
I don't follow why it is that the launcher needs to duplicate new inheritable handles, with new handle values, instead of just ensuring that the current handles are inheritable. After spawning the child, close any handle that isn't needed.
The process standard handles have to be made inheritable because they aren't necessarily so at startup. It can happen that a process is started with standard handles that are valid but not inheritable. There's a particular case for which the system manually duplicates (not inherits) the parent's standard handles to the child process, with new handle values but with the same access and attributes as in the parent, such as whether each handle is inheritable. This occurs when all of the following criteria are met: the spawned executable is a console application (i.e. IMAGE_SUBSYSTEM_WINDOWS_CUI); handle inheritance is disabled (i.e. bInheritHandles is false); the standard handle values are not set explicitly (i.e. no STARTF_USESTDHANDLES); and the child is not detached from the current console session (i.e. no DETACHED_PROCESS, CREATE_NEW_CONSOLE, or CREATE_NO_WINDOW). For example:
>>> import os, subprocess, ctypes
>>> kernel32 = ctypes.WinDLL('kernel32', use_last_error=True)
>>> kernel32.GetStdHandle(-10)
560
>>> os.set_handle_inheritable(560, False)
>>> subprocess.call('python -q')
>>> # child process
>>> import os, ctypes
>>> kernel32 = ctypes.WinDLL('kernel32', use_last_error=True)
>>> kernel32.GetStdHandle(-10)
8
>>> os.get_handle_inheritable(8)
False
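The four criteria listed before the session above can be written down as a single predicate. This is a sketch of the rule as described in this thread, not a Windows API: system_duplicates_std_handles is a hypothetical name, and the constants are the SDK values as far as I know.

```python
IMAGE_SUBSYSTEM_WINDOWS_CUI = 3      # console application (PE optional header)
STARTF_USESTDHANDLES = 0x00000100    # STARTUPINFO.dwFlags
DETACHED_PROCESS = 0x00000008        # CreateProcess creation flags
CREATE_NEW_CONSOLE = 0x00000010
CREATE_NO_WINDOW = 0x08000000


def system_duplicates_std_handles(subsystem, inherit_handles,
                                  startup_flags, creation_flags):
    """True if the described 'duplicate, don't inherit' case applies."""
    detached = creation_flags & (
        DETACHED_PROCESS | CREATE_NEW_CONSOLE | CREATE_NO_WINDOW)
    return (subsystem == IMAGE_SUBSYSTEM_WINDOWS_CUI
            and not inherit_handles
            and not (startup_flags & STARTF_USESTDHANDLES)
            and not detached)


# A console child started with bInheritHandles false and default flags.
print(system_duplicates_std_handles(IMAGE_SUBSYSTEM_WINDOWS_CUI, False, 0, 0))
```

When the predicate holds, the child gets new handle values that carry over the parent's attributes, including a cleared inherit flag, which is what the session above observes.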
Thus the launcher needs to ensure that each standard file handle is inheritable. For example:
static BOOL
make_handle_inheritable(HANDLE handle)
{
    DWORD file_type = GetFileType(handle);
    // Ignore an invalid handle, non-file object type, unsupported file type,
    // or a console file prior to Windows 8.
    // HANDLE is a pointer type, so cast it before testing the low tag bits.
    if (file_type == FILE_TYPE_UNKNOWN ||
        (file_type == FILE_TYPE_CHAR && ((UINT_PTR)handle & 3))) {
        return TRUE;
    }
    return SetHandleInformation(handle, HANDLE_FLAG_INHERIT,
                                HANDLE_FLAG_INHERIT);
}
Since the above requires GetFileType() to succeed with a supported file type, the SetHandleInformation() call shouldn't fail unless something seriously fails in the OS. An invalid handle or a handle for an unsupported object type or unsupported file type is passed silently without an error.
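The skip condition in make_handle_inheritable() can be exercised on plain integers. The sketch below is a Python model of that test only. It assumes, as the comment in the C code implies, that console handles before Windows 8 are pseudo-handles with the low two bits of the value set, while kernel handles leave those bits clear; skip_inheritable_update is a hypothetical name.

```python
# GetFileType() results (winbase.h).
FILE_TYPE_UNKNOWN, FILE_TYPE_DISK, FILE_TYPE_CHAR, FILE_TYPE_PIPE = 0, 1, 2, 3


def skip_inheritable_update(file_type, handle_value):
    """True if the handle should be passed over without SetHandleInformation()."""
    if file_type == FILE_TYPE_UNKNOWN:
        # Invalid handle, non-file object, or unsupported file type.
        return True
    # Character device whose value carries tag bits: treated as an old-style
    # console pseudo-handle, which is not a kernel handle.
    return file_type == FILE_TYPE_CHAR and (handle_value & 3) != 0


for file_type, value in ((FILE_TYPE_CHAR, 7), (FILE_TYPE_CHAR, 0x54),
                         (FILE_TYPE_PIPE, 0x5C), (FILE_TYPE_UNKNOWN, 0)):
    print(file_type, hex(value), skip_inheritable_update(file_type, value))
```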
Then in run_child() just before calling CreateProcessW():
HANDLE hStdInput = GetStdHandle(STD_INPUT_HANDLE);
HANDLE hStdOutput = GetStdHandle(STD_OUTPUT_HANDLE);
HANDLE hStdError = GetStdHandle(STD_ERROR_HANDLE);
// Parenthesize the mask test: in C, == binds more tightly than &.
if ((si.dwFlags & (STARTF_USEHOTKEY | STARTF_UNDOC_MONITOR)) == 0) {
    ok = make_handle_inheritable(hStdInput);
    assert(ok, "making stdin inheritable failed");
    ok = make_handle_inheritable(hStdOutput);
    assert(ok, "making stdout inheritable failed");
    ok = make_handle_inheritable(hStdError);
    assert(ok, "making stderr inheritable failed");
    si.hStdInput = hStdInput;
    si.hStdOutput = hStdOutput;
    si.hStdError = hStdError;
    si.dwFlags |= STARTF_USESTDHANDLES;
}
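The flag test guarding that block exists because the standard handle fields of STARTUPINFO can be repurposed. STARTF_USEHOTKEY is documented as incompatible with STARTF_USESTDHANDLES, since hStdInput then carries a hotkey. STARTF_UNDOC_MONITOR is not a documented flag; the value 0x400 used below, and the idea that it repurposes hStdOutput, are assumptions. A minimal Python model of the intended predicate, with may_set_std_handles as a hypothetical name:

```python
STARTF_USESHOWWINDOW = 0x001
STARTF_USESTDHANDLES = 0x100
STARTF_USEHOTKEY = 0x200
STARTF_UNDOC_MONITOR = 0x400  # undocumented; value assumed for this sketch


def may_set_std_handles(dw_flags):
    """True if neither flag that repurposes the hStd* fields is present."""
    return (dw_flags & (STARTF_USEHOTKEY | STARTF_UNDOC_MONITOR)) == 0


for flags in (0, STARTF_USESHOWWINDOW, STARTF_USEHOTKEY, STARTF_UNDOC_MONITOR):
    print(hex(flags), may_set_std_handles(flags))
```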
After CreateProcessW(), close any handle that isn't required. For example:
// The launcher doesn't require stdin and stdout, so close them.
CloseHandle(hStdInput);
CloseHandle(hStdOutput);
@eryksun Shouldn't the launcher also close whatever handles there are in lpReserved2 after CreateProcessW?
@mbikovitsky, a process inherits an arbitrary set of kernel-object handles from its parent. The handle values can be passed to the child in various ways, which have to be known ahead of time, such as command-line arguments or in environment variables. The C runtime library uses the PEB's ProcessParameters->RuntimeData (i.e. lpReserved2 in the WinAPI STARTUPINFO record). This isn't an officially documented structure, though it's well known.
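For readers who haven't seen that structure, here is a sketch of a parser in Python. Everything about the layout is an assumption taken from the commonly described CRT format, not from official documentation: a 4-byte count, then one flag byte per fd, then one handle value per fd. The flag values, the -1/-2 "no handle" markers, and the name parse_crt_runtime_data are likewise assumptions for illustration.

```python
import struct

FOPEN = 0x01  # assumed CRT flag: fd slot is in use
FPIPE = 0x08  # assumed CRT flag: handle refers to a pipe


def parse_crt_runtime_data(buf, handle_size=8):
    """Return [(fd, flags, handle)] for open slots in an lpReserved2-style buffer.

    Assumed layout: int32 count | count flag bytes | count handle values.
    handle_size is 8 for a 64-bit process and 4 for a 32-bit one.
    """
    if len(buf) < 4:
        return []
    (count,) = struct.unpack_from("<i", buf, 0)
    if count < 0 or len(buf) < 4 + count * (1 + handle_size):
        return []  # truncated or malformed: ignore it
    flags = buf[4:4 + count]
    fmt = "<%d%s" % (count, "q" if handle_size == 8 else "i")
    handles = struct.unpack_from(fmt, buf, 4 + count)
    return [(fd, flag, handle)
            for fd, (flag, handle) in enumerate(zip(flags, handles))
            if flag & FOPEN and handle not in (-1, -2)]


# Three slots: fd 0 and fd 1 are open (fd 1 is a pipe), fd 2 is unused.
sample = (struct.pack("<i", 3) + bytes([0x41, 0x09, 0x00])
          + struct.pack("<3q", 0x54, 0x58, -1))
print(parse_crt_runtime_data(sample))
```

A launcher that wanted to close inherited fds would walk such a list after CreateProcessW(); the sketch only shows why the data has to be parsed, not how the real C runtime validates it.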
I'd prefer a more generic approach to the problem of identifying handles to close. The most direct solution is for Windows 8.1+, which provides PssCaptureSnapshot() with PSS_CAPTURE_HANDLES and PssWalkSnapshot() with PSS_WALK_HANDLES. The launcher can call GetFileType() for each handle. If it's a pipe file handle, close it, with the exception of hStdErr. I'm only concerned about pipes, for which a leaked handle will keep the pipe open.
How safe is it to close all pipes in the process? For a given pipe we can't know whether it was inherited from the parent - maybe it's being used by some internal OS stuff. If we close it and another handle gets opened in its place we run the risk of data corruption. At the very least, whatever code was using that pipe in the first place will fail by working with a nonexistent handle, and might even crash the process. It's only the launcher, sure, but it's still bad.
Even if we assume that closing all pipes is safe, we still have to parse lpReserved2, since the fds there might refer to the pipes we're closing.
I guess what I'm proposing is this:
1. Parse the lpReserved2 array, close all fds there.
2. Close hStdInput and hStdOutput, if we haven't yet closed fds that map to these handle values.
Sure, this won't work for handles the parent process is passing on the command-line or by some other means, but it's safer. And hey, the current code doesn't deal with handles passed from the parent in a "non-standard" way, and it seems to work fine, so why change that? We can always fix it later :)
How safe is it to close all pipes in the process? For a given pipe we can't know whether it was inherited from the parent - maybe it's being used by some internal OS stuff.
I don't know of any pipes for "internal OS stuff" that would be relevant to the launcher. The base Windows API has extensive IPC to the session server process (csrss.exe), but that uses an ALPC port.
Even if we assume that closing all pipes is safe, we still have to parse lpReserved2, since the fds there might refer to the pipes we're closing.
There's a small chance of a problem. The console launcher uses stderr, which uses standard file descriptor 2. The system C runtime library implements _dup2(), _open_osfhandle(), and close() to keep the standard file descriptors in sync with the process standard handle values, but this is only implemented for a console application. If a GUI process dupes a file descriptor to fd 2, the C runtime won't automatically update the standard-error handle value. This poses a problem if the process subsequently uses spawn*() to execute a console launcher.
Alternatively the console launcher could be modified to use the process standard error handle via WriteFile() instead of calling fprintf(stderr, ...) or fwprintf(stderr, ...). For a wide-character format string, use WriteConsoleW() if it's a console file, else use WriteFile() and encode text using CP_ACP, CP_THREAD_ACP (initially the ANSI code page of the current user's default locale, not the system locale), or CP_UTF8.
This poses a problem if the process subsequently uses spawn*() to execute a console launcher.
As far as I can see, spawn* passes the standard fds to the child process, so using stderr is still fine as long as we don't close it.
So something like this:
Parse the lpReserved2 array, close all fds there except for fd 2.
This is simpler than using WriteFile and WriteConsole.
My main concern is that closing all pipe handles directly, as you suggest, might leave some fds with dangling handles. So when these fds are closed you might get an exception (see here). That's why I keep coming back to parsing lpReserved2.
What about patching distutils as found here https://github.com/pypa/pip/issues/10444#issuecomment-1030812695, with a proposed patch here https://github.com/pypa/pip/issues/10875#issuecomment-1032312251, which was successfully (locally) tested with GUI as well as with CLI scripts here https://github.com/pypa/pip/issues/10875#issuecomment-1032516439? Most likely this patch needs more extensive testing; however, no one has looked at it for 4 weeks.
As far as I can see, spawn* passes the standard fds to the child process, so using stderr is still fine as long as we don't close it.
I was referring to the speculative suggestion to close all inherited pipe handles except for the process standard error handle. In the scenario I outlined, with a GUI parent process, fd 2 and the process standard error may refer to different handles. I suggested ignoring that problem, and just using the process standard error handle with WriteFile() or WriteConsoleW(). Another approach would be to call _get_osfhandle(_fileno(stderr)) to get the handle that needs to be kept open. Then the launcher could continue to use fprintf(stderr, ...) and fwprintf(stderr, ...).
might leave some fds with dangling handles.
C runtime file descriptors don't get closed at exit. Any associated file handles are closed when the process object is rundown in the kernel.
@carlkl If I understand correctly, this patch simply adds
if (hOut != hErr)
    CloseHandle(hOut);
If so, then it still suffers from the bug described in microsoft/vscode-python#18561.
@eryksun
I was referring to the speculative suggestion to close all inherited pipe handles except for the process standard error handle. In the scenario I outlined, with a GUI parent process, fd 2 and the process standard error may refer to different handles.
Thanks for clarifying that.
C runtime file descriptors don't get closed at exit. Any associated file handles are closed when the process object is rundown in the kernel.
Yes, I see that now (here, here), but I'd still feel a little safer if we were to close the fds explicitly and not rely on this behaviour. Unless the fds being silently dropped on process termination is documented somewhere that I missed?
If I understand correctly, this patch simply adds
if (hOut != hErr) CloseHandle(hOut);
If so, then it still suffers from the bug described in https://github.com/microsoft/vscode-python/issues/18561.
@mbikovitsky, yes - that's all. I would suggest testing this with black (https://github.com/microsoft/vscode-python/issues/18561).
This patch definitely helped with meson (CLI) as well as with GUI scripts (tested with mne, mnelab). There are now so many issues related to https://github.com/pypa/pip/issues/10444 as well as https://github.com/pypa/pip/issues/10875 that I can only hope someone takes this patch (proposed by @jeremyd2019, by the way) for further testing.
Unless the fds being silently dropped on process termination is documented somewhere that I missed?
I expect it's assumed. There's nothing that needs to be done to close out a low I/O fd at exit, unlike a FILE stream, which may need to be flushed to disk. On platforms for which kernel file descriptors are returned by open(), I don't think it's expected that the runtime will track them all and close them at exit.
A different issue with the launcher's inherited file handles is the share mode and unlinking behavior of regular files and directories. For every open, a filesystem tracks whether the open has read/execute, write/append, and delete/rename access and whether it shares each type of access. A new open has to request compatible access and sharing with existing opens. If existing opens of a file don't share delete access, then DeleteFileW() and MoveFileExW() will fail with a sharing violation. Even if DeleteFileW() successfully 'deletes' an open file, it remains linked, in a state that disallows new opens, until all existing opens have closed. (Note that Windows 10 supports a POSIX-like delete for NTFS filesystems, which is implemented by immediately moving a deleted file to a hidden system directory if the file is open. Delete access still needs to be shared, however, so true POSIX-like behavior depends on the cooperation of applications.)
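The compatibility rule in that paragraph is symmetric, which is easy to miss. Below is a simplified Python model of it. The FILE_SHARE_* values are the real SDK constants, but the ACCESS_* classes are hypothetical stand-ins that reuse the same bit positions (real access masks are more fine-grained), and open_allowed is a hypothetical name.

```python
FILE_SHARE_READ, FILE_SHARE_WRITE, FILE_SHARE_DELETE = 0x1, 0x2, 0x4

# Hypothetical access classes, deliberately using the same bit positions as
# the share flags so the two masks can be compared directly.
ACCESS_READ, ACCESS_WRITE, ACCESS_DELETE = 0x1, 0x2, 0x4


def open_allowed(existing_opens, access, share):
    """Model of the sharing check for a new open of the same file.

    existing_opens is a list of (access, share) pairs already granted.
    """
    for held_access, held_share in existing_opens:
        if access & ~held_share:
            return False  # we ask for access an existing open doesn't share
        if held_access & ~share:
            return False  # an existing open holds access we refuse to share
    return True


share_all = FILE_SHARE_READ | FILE_SHARE_WRITE | FILE_SHARE_DELETE
# An inherited handle opened for reading without delete sharing blocks deletion.
leaked = [(ACCESS_READ, FILE_SHARE_READ | FILE_SHARE_WRITE)]
print(open_allowed(leaked, ACCESS_DELETE, share_all))
```

In this model a handle that the launcher keeps open without FILE_SHARE_DELETE is enough to make a delete or rename of that file fail for every other process until the launcher exits.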
As far as I know, no one has reported an issue with the py launcher or simple launcher regarding sharing violations or problems with deleting files. Like the case of a pipe not closing, this is speculation about long-running, complex interaction with a Python-based tool. Such scenarios are not the normal use case for the launcher. The normal case is a command-line tool that uses standard I/O, or launching a GUI app, not to launch a tool that's tightly integrated in a long-running, multi-process workflow.
On platforms for which kernel file descriptors are returned by open(), I don't think it's expected that the runtime will track them all and close them at exit.
Sure, just like the Microsoft CRT doesn't track every open handle. It's the kernel's job to do that. But then, on Windows the fd to handle mapping is an internal CRT thing, so I'd expect them to clean it up on process termination. They do free the memory for this mapping, but not the handles themselves, for some reason. Even then, they only do so in debug builds. So I don't know what to expect anymore :)
@carlkl As I'm saying, the proposed patch doesn't solve the issue with VSCode and Black. VSCode passes 3 different handles for the standard I/O streams, so the added condition in the patch doesn't do anything new.
The normal case is a command-line tool that uses standard I/O, or launching a GUI app, not to launch a tool that's tightly integrated in a long-running, multi-process workflow.
Indeed, as a user of pip I would expect that problems with these usual use cases should be handled with priority. And the problem with https://github.com/microsoft/vscode-python/issues/18561 doesn't fit into the problems originally described in https://github.com/pypa/pip/issues/10444 or https://github.com/pypa/pip/issues/10875. This is starting to get pretty confusing - at least in my opinion.
Closing all inherited file handles or parsing the C runtime data to close all inherited file descriptors isn't an urgent problem to solve. However, the launcher should continue to close the standard files, because skipping that would probably cause problems, especially when stdin and stdout are pipes. Closing the standard files shouldn't interfere with the parent's use of the C `[_w]spawn()` functions. The simple way to avoid problems is to get rid of `safe_duplicate_handle()`, so as to keep the "standard file descriptor" <-> "standard handle" relationship exactly as the launcher inherited it, and also delay closing the standard files until after `CreateProcessW()` is called.
The launcher could also close the standard error file. Its debug messages shouldn't contaminate the error stream after spawning the child. Error messages can instead be written directly to the console. For example, initialize `FILE *ferror = stderr`, and change `assert()` and `wassert()` to use `ferror`. After spawning the child, open console output as a non-inheritable file without buffering: `ferror = fopen("CONOUT$", "rt+N"); setvbuf(ferror, NULL, _IONBF, 0)`. Currently this only matters if `SetConsoleCtrlHandler()` or `GetExitCodeProcess()` fails. Actually, I don't think it should matter for `SetConsoleCtrlHandler()`, which can be called before spawning the child. (The `child_process_info` record is statically initialized to zero values, so the control handler can skip the wait when `hProcess` is `NULL`.)
Could anyone test a fixed version of the launchers? https://bitbucket.org/mbikovitsky/simple_launcher/downloads/handle-dup-fix.zip
Copy the `.exe` files into the corresponding place in a `pip` installation. For example, if you create a venv at `C:\Temp\foo` then that would be the directory `C:\Temp\foo\Lib\site-packages\pip\_vendor\distlib`.
The updated code from which these were built is here.
@mbikovitsky, regarding the use of undocumented and unsupported `System[Extended]HandleInformation`, even if Vinay allows that, I'm certain it won't be allowed in the CPython py launcher. I think if it's implemented at all, closing inherited pipe handles should be supported only for Windows 8.1+, via process snapshotting. This API also captures the handle table in a more efficient and direct way: `NtQueryInformationProcess()` to get the `ProcessHandleCount` and the `ProcessHandleTable` (Windows 8.1+).
For closing fds, I'm concerned that `lpReserved2` could be bad data. The invalid parameter handler should be set in order to prevent aborting the process if it's an invalid fd. CPython uses a macro that enables a silent handler for the current thread.
regarding the use of undocumented and unsupported `System[Extended]HandleInformation` ... I'm certain it won't be allowed in the CPython py launcher. I think if it's implemented at all, closing inherited pipe handles should be supported only for Windows 8.1+, via process snapshotting.
Fair enough. I removed the whole thing, as the `Pss*` API is a lot more complex, and I don't think it's worth the effort at this point. The code now closes only `stdin`, `stdout`, and the Windows standard I/O handles.
For closing fds, I'm concerned that `lpReserved2` could be bad data. The invalid parameter handler should be set in order to prevent aborting the process if it's an invalid fd.
Good idea. I implemented this now.
I suggest we move this discussion to the pull request, so as not to spam here.
I'm grateful that Michael Bikovitsky has proposed a patch for the launchers and the corresponding built launchers are here. Can interested parties please confirm whether these modified launchers solve the problems reported here and in #10875, with both CLI and GUI launchers?
I can confirm that the GUI launcher works. I tested with `pip install mnelab` followed by `mnelab` (which gives an error with the current `pip` release).
I could also test the CLI launcher if you tell me which program did not work before.
FWIW, I'm testing in a Git for Windows Bash shell.
Can you please quickly reiterate how I can test the patched launcher? Where do I need to copy the .exe files?
The linked zip file contains some .exe files - just copy them over the files in `<venv>\Lib\site-packages\pip\_vendor\distlib` for whichever pip you're testing with. That directory should already have the .exe files released with whichever version of `distlib` is vendored into that specific `pip` installation.
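The copy step described above could be scripted roughly as follows. This is only a sketch, not part of pip or distlib: the helper name `replace_launchers`, its parameters, and the `.orig` backup suffix are all hypothetical, and it assumes the patched zip has already been extracted to a local directory. It only overwrites launchers that already exist in the vendored `distlib` directory, and it keeps a copy of each released launcher so the change can be undone.

```python
import shutil
from pathlib import Path


def replace_launchers(patched_dir, venv_dir, backup_suffix=".orig"):
    """Copy patched .exe launchers over those vendored in a venv's pip.

    Hypothetical helper: only files that already exist in the target
    directory are replaced, and each original is backed up first.
    Returns the names of the launchers that were replaced.
    """
    # Directory layout taken from the instructions in this thread.
    target = Path(venv_dir) / "Lib" / "site-packages" / "pip" / "_vendor" / "distlib"
    if not target.is_dir():
        raise FileNotFoundError(f"no vendored distlib directory at {target}")

    replaced = []
    for exe in sorted(Path(patched_dir).glob("*.exe")):
        existing = target / exe.name
        if not existing.is_file():
            continue  # skip launchers this distlib version does not ship
        backup = existing.with_name(existing.name + backup_suffix)
        if not backup.exists():
            shutil.copy2(existing, backup)  # keep the released launcher
        shutil.copy2(exe, existing)
        replaced.append(exe.name)
    return replaced
```

For the venv mentioned earlier, a call might look like `replace_launchers(r"C:\Temp\handle-dup-fix", r"C:\Temp\foo")`, where the first path is an assumed extraction location. Scripts that were installed before the swap probably still carry the old launcher, which is presumably why one tester in this thread reinstalled the affected packages afterwards.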
I can confirm that the GUI launcher works
Great, thanks for the feedback.
I could also test the CLI launcher if you tell me which program did not work before.
Examples are (a) meson and (b) Black with VSCode, mentioned above in this thread.
FWIW, I'm testing in a Git for Windows Bash shell.
I hope that people can also try in PowerShell, `cmd.exe`, etc. in case the shells differ in their behaviour regarding stdio handles.
I will test the launchers with scipy-meson build. Just give me some time today as I can't test it right now.
I tested `meson` and it seems to work. I did:
~ pip install meson
~ meson
ERROR: Must specify at least one directory name.
I never had any problems launching apps in PowerShell or `cmd.exe`, but I tested the patched launchers anyway. Both `meson` and `mnelab` launch without any errors in both shells.
I can confirm that with the proposed launchers from https://github.com/pypa/pip/issues/10444#issuecomment-1088356138 copied on top of the newest deployed pip, I can successfully run `meson build --prefix=$PWD\build` on scipy-meson. This is on Windows 10 with the msys2 environment. (The msys2 environment is needed for this task.)
EDIT: I also want to mention that I uninstalled meson and tqdm beforehand and reinstalled them with the patched pip launchers.
setuptools version
setuptools==56.0.0
Python version
Python 3.9
OS
Windows
Additional environment information
No response
Description
I use `gui_scripts` with a blank function. If I use `pip install -e .` or `py setup.py install`, nothing happens, as expected. If I do `pip install .` or use setup.py to make a binary dist and then install it, I get `stderr duplication failed` in a prompt. This happens before any code gets run.
Expected behavior
Nothing
How to Reproduce
Run above commands
Output