python / cpython

The Python programming language
https://www.python.org
Other
62.72k stars 30.07k forks source link

Impossible to identify asyncio.TimeoutError from other TimeoutError #124308

Open ArtsiomAntropau opened 1 week ago

ArtsiomAntropau commented 1 week ago

Feature or enhancement

Proposal:

Abstract

Sometimes there might be a case when a block inside async with asyncio.timeout(10) can raise TimeoutError and there are no proper way to identify one TimeoutError from another.

async with asyncio.timeout(10):
    if ...:
        raise TimeoutError

then here engineers will not be able to identify from where TimeoutError was received

Simplest example

import asyncio

async def main():
    try:
        async with asyncio.timeout(1):
            raise TimeoutError
    except asyncio.TimeoutError:  # will catch both, but should only exception from asyncio.timeout
        print('catched asyncio.TimeoutError')

asyncio.run(main())

More complex example

try:
    async with asyncio.timeout(10):
        if ...:
            raise TimeoutError
except TimeoutError:  # will catch both
    pass
except asyncio.TimeoutError:  # will catch both, but should only exception from asyncio.timeout
    pass

Proposal

asyncio should raise asyncio.TimeoutError from asyncio.exceptions and use some kind of inheritance, e.g.

class TimeoutError(builtins.TimeoutError):
    pass

then engineers will be able to use native TimeoutError from builtins as well as asyncio.TimeoutError for their try/except or isinstance checks.

Has this already been discussed elsewhere?

This is a minor feature, which does not need previous discussion elsewhere

Links to previous discussion of this feature:

No response

Linked PRs

rruuaanng commented 1 week ago

Feature or enhancement

Proposal:

Abstract

Sometimes there might be a case when a block inside async with asyncio.timeout(10) can raise TimeoutError and there are no proper way to identify one TimeoutError from another.

async with asyncio.timeout(10):
    if ...:
        raise TimeoutError

then here engineers will not be able to identify from where TimeoutError was received

Simplest example

import asyncio

async def main():
    try:
        async with asyncio.timeout(1):
            raise TimeoutError
    except asyncio.TimeoutError:  # will catch both, but should only exception from asyncio.timeout
        print('catched asyncio.TimeoutError')

asyncio.run(main())

More complex example

try:
    async with asyncio.timeout(10):
        if ...:
            raise TimeoutError
except TimeoutError:  # will catch both
    pass
except asyncio.TimeoutError:  # will catch both, but should only exception from asyncio.timeout
    pass

Proposal

asyncio should raise asyncio.TimeoutError from asyncio.exceptions and use some kind of inheritance, e.g.

class TimeoutError(builtins.TimeoutError):
    pass

then engineers will be able to use native TimeoutError from builtins as well as asyncio.TimeoutError for their try/except or isinstance checks.

Has this already been discussed elsewhere?

This is a minor feature, which does not need previous discussion elsewhere

Links to previous discussion of this feature:

No response

The asyncio.TimeoutError you mentioned inherits exception, and TimeoutError inherits OSError. Maybe you can refer to this code

import asyncio

try:
    # raise asyncio.TimeoutError
    raise TimeoutError
except (asyncio.TimeoutError, OSError) as e:
    if isinstance(e, asyncio.TimeoutError):
        print('asyncio.TimeoutError')
    elif isinstance(e, OSError):
        print('TimeoutError')

It can distinguish asyncio.TimeoutError from the built-in TimeoutError.

ZeroIntensity commented 1 week ago

@rruuaanng, that workaround is wrong :(

Running that on my end results in:

>>> import asyncio
>>> 
>>> try:
...     # raise asyncio.TimeoutError
...     raise TimeoutError
... except (asyncio.TimeoutError, OSError) as e:
...     if isinstance(e, asyncio.TimeoutError):
...         print('asyncio.TimeoutError')
...     elif isinstance(e, OSError):
...         print('TimeoutError')
... 
asyncio.TimeoutError

This is because asyncio.TimeoutError is exactly the same as TimeoutError (quite literally asyncio.TimeoutError is TimeoutError is true), there's no way to distinguish between the two in an except clause. As far as I can see, this was originally intentional. See this issue (cc @kumaraditya303).

The only workaround that I could think of would be to check if the traceback frame originated from asyncio, but that's less than ideal. So, I think this is a reasonable feature; there's no good workaround, and the implementation is tiny (and has basically no maintenance cost).

rruuaanng commented 1 week ago

@rruuaanng, that workaround is wrong :(

Running that on my end results in:

>>> import asyncio
>>> 
>>> try:
...     # raise asyncio.TimeoutError
...     raise TimeoutError
... except (asyncio.TimeoutError, OSError) as e:
...     if isinstance(e, asyncio.TimeoutError):
...         print('asyncio.TimeoutError')
...     elif isinstance(e, OSError):
...         print('TimeoutError')
... 
asyncio.TimeoutError

This is because asyncio.TimeoutError is exactly the same as TimeoutError (quite literally asyncio.TimeoutError is TimeoutError is true), there's no way to distinguish between the two in an except clause. As far as I can see, this was originally intentional. See this issue (cc @kumaraditya303).

The only workaround that I could think of would be to check if the traceback frame originated from asyncio, but that's less than ideal. So, I think this is a reasonable feature; there's no good workaround, and the implementation is tiny (and has basically no maintenance cost).

Hmm, Maybe I wrote it wrong. I thought maybe this was for compatibility reasons, since the behavior of TimeError is the same, there is really no difference.

ArtsiomAntropau commented 1 week ago

Previously, we’ve had

class TimeoutError(Exception):
    pass

so thats understandable why it was bad (no any connections with builtins.TimeoutError), but the way how it was refactored (TimeoutError = builtins.TimeoutError) caused another problem asyncio.TimeoutError is builtins.TimeoutError which causes inability to support TimeoutError inside async with asyncio.timeout(): block if we need differentiate them.

I’ve caught that in our production application and that really got me into a dead end. For sure option with checking context/traceback in that case … is not how engineers want to work with that.

rruuaanng commented 1 week ago

Previously, we’ve had

class TimeoutError(Exception):
    pass

so thats understandable why it was bad (no any connections with builtins.TimeoutError), but the way how it was refactored (TimeoutError = builtins.TimeoutError) caused another problem asyncio.TimeoutError is builtins.TimeoutError which causes inability to support TimeoutError inside async with asyncio.timeout(): block if we need differentiate them.

I’ve caught that in our production application and that really got me into a dead end. For sure option with checking context/traceback in that case … is not how engineers want to work with that.

Perhaps you can modify the interpreter exception to meet your application needs instead of using the official compiler implementation.

kumaraditya303 commented 1 week ago

FWIW, it was a deliberate change not just for asyncio but other modules too, you can create your own custom exception for your application

ArtsiomAntropau commented 1 week ago

@kumaraditya303 if here (in asyncio) we will have inherited TimeoutError, then engineers will be able to handle both builtins.TimeoutError or any custom timeout error inside the block as well as asyncio.TimeoutError.

Code inside this block might use other third-party libs that can raise both their custom exceptions or builtins.TimeoutError, so exactly asyncio is responsible for this “split” and only asyncio prevents us here.

We should not and can not restrict usage of some exceptions inside asyncio.timeout block imho …

ZeroIntensity commented 1 week ago

TimeoutError is technically deprecated, and has been since 3.11. Unfortunately, I think we've done a pretty bad job of actually enforcing that deprecation. A quick GitHub search shows a bit over 60k people still using asyncio.TimeoutError. Similarly, typeshed does not use @deprecated on TimeoutError, so I think removing it would be the wrong approach. (Overall, I've noticed that we don't have a good way to emit deprecation warnings for exceptions, but that's something for the future.) So, if we want to do anything here, we need to un-deprecate asyncio.TimeoutError. Same goes for concurrent.futures.TimeoutError.

While we're here, aliasing to the global TimeoutError was not done for multiprocessing, so bpo-42413 is still a problem there (issubclass(multiprocessing.TimeoutError, TimeoutError) is still False on the newer versions).

I'll put together a PR addressing each of these. I think there are certainly some use cases where being able to differentiate between a timeout error raised by asyncio or something else is useful (especially considering we already have the existing asyncio.TimeoutError).

ZeroIntensity commented 1 week ago

I've created #124320 to make standard library TimeoutErrors unique.

gvanrossum commented 1 week ago

Honestly this was designed like this on purpose. I’m not sure I see the use case. I might be convinced that asyncio should add the Timeout instance as an attribute to the exception though?

ZeroIntensity commented 1 week ago

I think the "optimal" solution here would have been to remove TimeoutError from asyncio altogether, but that would break too much to be feasible.

ZeroIntensity commented 1 week ago

Ah, so there's a much bigger problem at hand with implementing asyncio.TimeoutError as such:

class TimeoutError(TimeoutError):
    pass

As shown in my PR, changing this for concurrent.futures and asyncio breaks a number of tests, because many tests are erroneously relying on things like asyncio.TimeoutError to catch not just asyncio timeout errors, but ones that occur in things that asyncio uses, namely concurrent.futures. In the asyncio and multiprocessing test suite, concurrent.futures.TimeoutError is raised rather than asyncio.TimeoutError, so all the tests that try and catch that get broken -- I have no doubt in my mind that users are doing this too. Example:

loop = asyncio.get_running_loop()

try:
    asyncio.run_coroutine_threadsafe(asyncio.sleep(5), loop).result(timeout=1)
except asyncio.TimeoutError:  # Would now raise a concurrent.futures.TimeoutError, so this gets broken
    print("Timed out!")

The above would get broken and raise a concurrent.futures.TimeoutError if we were to make them separate. So, making the two different is a breaking change. I guess the only thing we can do is try and document the deprecation better, but that will require every test to get migrated over to TimeoutError rather than one from a specific module.

What do you suggest we do at this point, @gvanrossum and @kumaraditya303?

ArtsiomAntropau commented 1 week ago

@ZeroIntensity (I'm really not very good at navigating cpython source code, but) what if we will cover not all places, but limited amount of them ... internally, it can continue use TimeoutError, but externally what if

go from

            if self._task.uncancel() <= self._cancelling and exc_type is exceptions.CancelledError:
                # Since there are no new cancel requests, we're
                # handling this.
                raise TimeoutError from exc_val

to

            if self._task.uncancel() <= self._cancelling and exc_type is exceptions.CancelledError:
                # Since there are no new cancel requests, we're
                # handling this.
                raise exceptions.TimeoutError from exc_val
ZeroIntensity commented 1 week ago

You can take a look at my closed PR for how I implemented it. If we only switch it to raise the module-specific TimeoutError in some places, then that's even more of a bug: a user catching asyncio.TimeoutError won't get fulfilled in the cases where we raise TimeoutError.

It looks like frame introspection is your only shot here, sorry :(

ArtsiomAntropau commented 1 week ago

I'm seeing some workaround but when you'll see it ...

import asyncio

class _TimeoutError(Exception):
    pass

async def some_code():
    raise TimeoutError

async def main():
    try:
        async with asyncio.timeout(1):
            try:
                await some_code()
            except TimeoutError as exc:
                print('catched TimeoutError, but can not propagate it, so wrapping into custom _TimeoutError')
                raise _TimeoutError from exc
    except asyncio.TimeoutError:
        print('catched asyncio.TimeoutError')
    except _TimeoutError as exc:
        print('catched _TimeoutError, reraising it as general TimeoutError')
        raise TimeoutError from exc

asyncio.run(main())
gvanrossum commented 1 week ago

I think the "optimal" solution here would have been to remove TimeoutError from asyncio altogether, but that would break too much to be feasible.

Why do that? It’s a one line alias and avoids breaking code.

ZeroIntensity commented 1 week ago

I think the "optimal" solution here would have been to remove TimeoutError from asyncio altogether, but that would break too much to be feasible.

Why do that? It’s a one line alias and avoids breaking code.

Because we get issues like this :) It's also a bit of a gotcha for users, it isn't clear that asyncio.TimeoutError catches TimeoutError as well.

@ArtsiomAntropau, that workaround seems incorrect. That would still catch a TimeoutError raised by someone else. As I mentioned, you should check the traceback to figure out where it actually originated from. Here's what I came up with:

import asyncio
from pathlib import Path

ASYNCIO_PATH = Path(asyncio.__file__).parent

def is_error_from_asyncio(error: BaseException) -> bool:
    traceback = error.__traceback__
    if traceback is None:
        return False

    traceback = traceback.tb_next
    if traceback is None:
        return False

    frame = traceback.tb_frame
    if frame.f_back is None:
        return False

    frame_parent_directory = Path(frame.f_code.co_filename).parent
    return frame_parent_directory == ASYNCIO_PATH
ZeroIntensity commented 1 week ago

Unless Kumar or Guido come up with a better solution here, this is a wontfix. I guess I'll submit a PR to typeshed later to add the missing @deprecated to asyncio.TimeoutError.

gvanrossum commented 1 week ago

Unless Kumar or Guido come up with a better solution here, this is a wontfix. I guess I'll submit a PR to typeshed later to add the missing @deprecated to asyncio.TimeoutError.

Sadly I don’t think that works for an alias, and we can’t make it a subclass (it’s documented as an alias).

gvanrossum commented 1 week ago

Sorry.

ZeroIntensity commented 1 week ago

Sadly I don’t think that works for an alias, and we can’t make it a subclass (it’s documented as an alias).

Oh, right. Sorry for all the hassle here!

graingert commented 4 days ago

You can create another context manager like this:

@contextlib.asynccontextmanager
async def move_on_after(timeout):
    timed_out = False
    try:
        async with timeout:
            try:
                yield
            except TimeoutError:
                timed_out = True
                raise
    except TimeoutError as e:
        if timed_out:
            raise

Then you can do:

async with move_on_after(asyncio.timeout(5)):
    ...
graingert commented 4 days ago

I'd like to see an exc flag to asyncio.timeout so you can do:

async with asyncio.timeout(1, exc=SystemFailed("timeout occurred")):
    await etc()

Or

async with asyncio.timeout(1, exc=None):
    await etc()