python / cpython

The Python programming language
https://www.python.org
Other
63.42k stars 30.37k forks source link

asyncio `proc.kill()` and `proc.wait()` are counter intuitive #119710

Open piwicode opened 5 months ago

piwicode commented 5 months ago

Bug report

Bug description:

On Linux proc.wait() does not return after proc.kill() when there as sub/sub processes. This only happens when the standard streams are piped, it looks like wait() blocks until the pipes are closed.

import asyncio
import os
import time
async def main():
    proc = await asyncio.create_subprocess_exec(
        "/usr/bin/python3", "-c", "import os;os.system('sleep 20')",
        stdout=asyncio.subprocess.PIPE,
        stderr=asyncio.subprocess.PIPE)

    time.sleep(1)
    os.system("ps -f --forest --format pid,ppid,pgid,sid,comm")

    print(f">> This python proces pid={os.getpid()} has a `python3` child process, which has a `sleep` child process.")
    print(f">> Kill the `python3` child process pid={proc.pid}")
    proc.kill()

    os.system("ps -f --forest --format pid,ppid,pgid,sid,comm")
    print(f">> The `python3` child process was killed, the `sleep` process gets orfaned.")
    print(f">> Calling `proc.wait()` or `proc.communicate()` does not return until `sleep` process exits.")

    await proc.wait()

asyncio.run(main())

I don't know if it's a documentation issue or a bug.

CPython versions tested on:

3.11

Operating systems tested on:

Linux

vstinner commented 5 months ago

It's a surprising feature. communicate() waits until stdout (and stderr) is closed. If a child process spawns "sleep 20", the "sleep 20" inherits stdout and keeps it open for 20 seconds.

Maybe you should create a new process group and kill the whole process group, rather than only killing the "parent" process (parent of "sleep 20"). See the start_new_session parameter of subprocess.Popen and then use os.killpg().

That's what I did in test.libregrtest to make sure that CTRL+C kills immediately all processes, and not only direct child processes.

piwicode commented 5 months ago

Indeed the process groups can help, if the child process does not create another process group. But that is another story.

Here my point is that we should clarify The documentation ton avoid any surprise. The same behavior applies to proc.wait().

For the record, the crurent documentation says "Wait for child process to terminate" which looks missleading to me.

vstinner commented 5 months ago

wait() and communicate() are very different.

piwicode commented 5 months ago

Indeed. This issue is about proc.wait() documentation. I updated the code snippet to male it clearer.

benzea commented 3 weeks ago

I really consider this a pretty bad bug. It is absolutely impossible to wait for the executed process. I specifically wanted to wait for the launched process and not the pipe to be closed. Unfortunately, with the current API that is pretty much impossible.

There is not even a timeout= parameter for wait(), so I am now resorting to having a second task that specifically closes the pipe using proc.stdout._transport._protocol.pipe.close() (after killing the process just in case). I consider this a really nasty workaround, as I just need to wait for the process to quit. However, adding a second child watch is also not really possible …

Considering we have the mess, maybe one could:

  1. Properly document the wait() behaviour
  2. Fix Process.wait() to not immediately return if the process already quit (BaseSubprocessTransport._wait needs to test self._finished not self._returncode to be consistent)
  3. Add a Process.waitpid() function with the expected behaviour that does not wait for the pipes to be closed
  4. Maybe add a close() function that can be used to explicitly close all pipes. Though I suppose that will happen anyway when the object is deleted.