python / cpython

The Python programming language
https://www.python.org
Other
61.97k stars 29.8k forks source link

`KeyboardInterrupt` in `TaskGroup` task propagates to event loop scheduler #102572

Open mandolaerik opened 1 year ago

mandolaerik commented 1 year ago

Bug report

If raising KeyboardInterrupt in a task created within a TaskGroup, the exception seems to propagate to the asyncio scheduler:

import asyncio
async def runner():
    loop = asyncio.get_running_loop()
    fut = loop.create_future()
    async def coro():
        try:
            fut.set_exception(TypeError())
            await loop.create_future()
        except asyncio.CancelledError:
            raise KeyboardInterrupt()
    try:
        async with asyncio.TaskGroup() as tg:
            tg.create_task(coro())
            await fut
    except* KeyboardInterrupt as e:
        print('Got KeyboardInterrupt', repr(e))
    except* TypeError as e:
        print('Got TypeError', repr(e))
asyncio.run(runner())

Gives me:

Got KeyboardInterrupt BaseExceptionGroup('', (KeyboardInterrupt(),))
Traceback (most recent call last):
  File "/tmp/foo.py", line 14, in runner
    await fut
TypeError

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/tmp/foo.py", line 19, in <module>
    asyncio.run(runner())
  File "/tmp/foo.py", line 12, in runner
    async with asyncio.TaskGroup() as tg:
  File "/usr/lib64/python3.11/asyncio/runners.py", line 190, in run
    return runner.run(main)
           ^^^^^^^^^^^^^^^^
  File "/usr/lib64/python3.11/asyncio/runners.py", line 118, in run
    return self._loop.run_until_complete(task)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib64/python3.11/asyncio/base_events.py", line 640, in run_until_complete
    self.run_forever()
  File "/usr/lib64/python3.11/asyncio/base_events.py", line 607, in run_forever
    self._run_once()
  File "/usr/lib64/python3.11/asyncio/base_events.py", line 1919, in _run_once
    handle._run()
  File "/usr/lib64/python3.11/asyncio/events.py", line 80, in _run
    self._context.run(self._callback, *self._args)
  File "/tmp/foo.py", line 10, in coro
    raise KeyboardInterrupt()
KeyboardInterrupt

One may argue that it makes a bit of sense that a KeyboardInterrupt propagates to the user, but if I swap the raising of KeyboardInterrupt and TypeError in my code so the KeyboardInterrupt comes from the main task, then the KeyboardInterrupt is captured cleanly. I find this behaviour inconsistent with the documentation that says "The same special case is made for KeyboardInterrupt and SystemExit as in the previous paragraph".

Your environment

CPython 3.11.1 on fedora 37 x86-64:

$ python --version
Python 3.11.1
$ uname -a
Linux ecarsten-mobl1 6.1.9-200.fc37.x86_64 #1 SMP PREEMPT_DYNAMIC Thu Feb  2 00:21:48 UTC 2023 x86_64 x86_64 x86_64 GNU/Linux
arhadthedev commented 1 year ago

@1st1, @asvetlov, @gvanrossum, @graingert, @kumaraditya303 (as asyncio experts)

gvanrossum commented 1 year ago

Let me see if I can understand what's going on.

In the code as written, you create a task which marks the shared future with TypeError, and then waits forever by creating and waiting for a future that nobody ever marks as done.

The main thread, in the body of the async with TaskGroup() block, then awaits the shared future. This will raise the TypeError, so then we enter the task group's __aexit__ with the exception set to TypeError.

This will then cancel the task. The task catches the CancelledError, raising KeyboardInterrupt instead. (I'm assuming this simulates a genuine KeyboardInterrupt generated by ^C or SIGINT.)

The task is now done, and the task's _on_task_done callback is called. This finds that the exception is one of the "special exceptions" (KeyboardInterrupt and SystemExit) and sets the task group's _base_error instance variable.

Now __aexit__ finds that the task is done. Pretty much the first thing it does at this point (after exiting the while self._tasks loop) is to check self._base_error, and if it's set (which it is), raise it, without wrapping it in a BaseExceptionGroup.

Finally, and here I'm going to ask @iritkatriel for help, the except* handlers happen. For some reason it triggers the except* KeyboardInterrupt handler but when that exits it propagates the bare KeyboardInterrupt?!

I cannot explain this. It feels like a bug in the exception handling. I tried to reproduce it with a simpler program but it doesn't happen there.

```py import asyncio class C: async def __aenter__(self): return self async def __aexit__(self, *args): print("AEXIT", args) await asyncio.sleep(0.1) raise KeyboardInterrupt("AEXIT") async def main(): try: fut = asyncio.Future() async with C(): fut.set_exception(RuntimeError("BODY")) await fut except* KeyboardInterrupt as err: print(f"Caught {err!r}") asyncio.run(main()) ``` But this just prints ``` AEXIT (, RuntimeError('BODY'), ) Caught BaseExceptionGroup('', (KeyboardInterrupt('AEXIT'),)) ``` without also printing a traceback.
iritkatriel commented 1 year ago

I put a breakpoint on PyErr_Print and I see that it's called from _PyRun_SimpleFileObject:439, during cleanup because pyrun_file returned NULL.

iritkatriel commented 1 year ago

Ah, that doesn't help, it's because the KeyboardInterrupt escaped, but it doesn't tell us why.

iritkatriel commented 1 year ago

Finally, and here I'm going to ask @iritkatriel for help, the except* handlers happen. For some reason it triggers the except* KeyboardInterrupt handler but when that exits it propagates the bare KeyboardInterrupt?!

I stepped through the except* and it's not propagating anything.

Also, I seem to get the same result with this (no except*). Could it be that something in asyncio is caching the exception and reraising it?

import asyncio
async def runner():
    loop = asyncio.get_running_loop()
    fut = loop.create_future()
    async def coro():
        try:
            fut.set_exception(TypeError())
            await loop.create_future()
        except asyncio.CancelledError:
            raise KeyboardInterrupt()
    try:
        async with asyncio.TaskGroup() as tg:
            tg.create_task(coro())
            await fut
    except BaseException as e:
        print('Got BaseException', repr(e))
asyncio.run(runner())
gvanrossum commented 1 year ago

Yeah, that's a good call. It seems to be coming from the asyncio event loop, but I haven't figured out where yet.

gvanrossum commented 1 year ago

The plot thickens. When I replace asyncio.run(runner()) with

asyncio.new_event_loop().run_until_complete(runner())

I get the same traceback followed by the dreaded

Task exception was never retrieved
future: <Task finished name='Task-2' coro=<runner.<locals>.coro() done, defined at /Users/guido/cpython/foo.py:5> exception=KeyboardInterrupt()>

and then repeating the same traceback (omitted because it's the same as reported by the OP).

This means we can rule out a bug in asyncio.run().

Disabling the _asyncio extension, the traceback looks a bit different, and appears to be happening in __step (in 3.11; or in __step_run_and_handle_result in 3.12 and main), in the line

    result = coro.throw(exc)

This is where the CancelledError is being thrown into the coroutine, and KeyboardInterrupt comes out.

What happens then is interesting. There's a try block with several except clauses, and we take this one:

        except (KeyboardInterrupt, SystemExit) as exc:
            super().set_exception(exc)
            raise

This does two things:

The interesting thing, I believe, is that the bare raise breaks straight out of the event loop, which is careful not to catch KeyboardInterrupt. We therefore get this very KeyboardInterrupt back at the top level, where it is reported.

Somehow it never goes through the __aexit__!

When using asyncio.run(), there is exception handling that ends up calling __aexit__ (that's a bit hard to find, I think it goes through Runner.__exit__, which cancels all tasks, which revives our runner() task long enough to call the pending TaskGroup.__aexit__ method and run the except* KeyboardInterrupt block, which "suppresses" the exception. But when Runner.__exit__ returns, the exception is re-raised.

I think from this it should be possible to create a simpler repro (which ought to lead to a fix), but I've spent a couple hours on this already and I need a break.

gvanrossum commented 1 year ago

I'm going to take a break from this issue. It's definitely caused by the special casing of KeyboardError and SystemExit in Task.__step, but I'm having a hard time coming up with a simpler repro.

kyuupichan commented 6 months ago

What is the reason for the special-casing in Task._step?

gvanrossum commented 6 months ago

@kyuupichan I believe it's to ensure that you can always interrupt a runaway loop. It's a historic compromise, we may be able to do better with modern conventions for running event loops.