python / cpython

The Python programming language
https://www.python.org

ThreadPoolExecutor loses exceptions. #120508

Open tvvister opened 2 months ago

tvvister commented 2 months ago

Bug report

Bug description:

This code just silently loses an exception, without any notification.

from concurrent.futures import ThreadPoolExecutor

def change_data(x: int) -> None:
    # raises ZeroDivisionError when x == 5
    res = 1.0/(x - 5)
    print(f'{res}')

if __name__ == "__main__":
    with ThreadPoolExecutor() as pool:
        futs = [
            pool.submit(change_data, x) for x in range(0, 10)
        ]

    print('finished successfully')

The reason is that ThreadPoolExecutor swallows exceptions. Proposals:

  1. log a critical message if something similar happens
  2. make re-raising at least the first exception the default behavior of ThreadPoolExecutor.

The problem may seem exaggerated, but if you have a years-old project with a bunch of places like this, it probably means you have a lot of hidden errors and need to go through all the code, fixing it with try ... except inside each task or by calling result() on every single future returned by ThreadPoolExecutor.submit. That sounds old-fashioned and smells like bad design.

By the way, if we could at least catch all these kinds of unhandled exceptions, that would be a good enough solution. Unfortunately, neither sys.excepthook nor threading.excepthook catches anything interesting.
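
For illustration, a minimal sketch (assuming Python 3.8+ for threading.excepthook); as far as I can tell the hook never fires because the pool's worker thread catches the task's exception and stores it on the Future, so nothing ever escapes the thread:

import threading
from concurrent.futures import ThreadPoolExecutor

def hook(args):
    # args carries thread, exc_type, exc_value, exc_traceback
    print(f'threading.excepthook caught {args.exc_value!r} in {args.thread}')

threading.excepthook = hook

with ThreadPoolExecutor() as pool:
    pool.submit(lambda: 1 / 0)  # ZeroDivisionError is stored on the Future

# Nothing was printed: the exception never escaped the worker thread,
# so the hook was not invoked.
print('done, hook stayed silent')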

All the same is valid for ProcessPoolExecutor as well.

CPython versions tested on:

3.11

Operating systems tested on:

Linux

serhiy-storchaka commented 2 months ago

This is expected behavior. The exceptions are not swallowed: they are saved and available to you if you call future.result() or future.exception(). If you want to log errors even if they are ignored by your code, you should wrap the body of your tasks with try/except. You can do this with a wrapper function or a decorator.
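
For example, a minimal sketch of retrieving the stored exceptions from the futures in the original snippet:

from concurrent.futures import ThreadPoolExecutor

def change_data(x: int) -> None:
    print(1.0/(x - 5))

with ThreadPoolExecutor() as pool:
    futs = [pool.submit(change_data, x) for x in range(0, 10)]

# Each Future keeps the exception raised by its task.
for fut in futs:
    exc = fut.exception()   # returns None if the task succeeded
    if exc is not None:
        print(f'task failed with {exc!r}')
    # fut.result() would re-raise the stored exception instead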

Changing this is a breaking change. Logging every raised exception will break programs which correctly handle them.

tvvister commented 2 months ago

I suggest logging only raised exceptions that are not handled at all. By that I mean a failed future (or task) that was never awaited and for which neither exception() nor result() was called on the corresponding future, awaitable, or task (or something else along these lines).
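
For illustration, the closest I can get to this today is a done-callback attached to every future by hand (just a sketch, not what I propose as the default):

from concurrent.futures import ThreadPoolExecutor

def change_data(x: int) -> None:
    print(1.0/(x - 5))

def log_if_failed(fut):
    exc = fut.exception()   # returns the stored exception without re-raising it
    if exc is not None:
        print(f'task raised {exc!r}')

with ThreadPoolExecutor() as pool:
    for x in range(0, 10):
        pool.submit(change_data, x).add_done_callback(log_if_failed)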

serhiy-storchaka commented 2 months ago

This would be devastating too. You may stop at the first error and not check the other futures. Ignoring some exceptions can be normal behavior, for example when you finish the work by closing a file and leave the remaining tasks to die silently from OSError.

Write a simple wrapper:

def print_error(func):
    def wrapper(*args, **kwargs):
        try:
            return func(*args, **kwargs)
        except BaseException as e:
            print(f'Error {e!r} in {func}')
            raise
    return wrapper

and wrap every function passed to pool.submit().
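
For instance, with change_data from the original report it could be applied like this (a sketch, using the print_error decorator above):

from concurrent.futures import ThreadPoolExecutor

def change_data(x: int) -> None:
    print(1.0/(x - 5))

with ThreadPoolExecutor() as pool:
    futs = [pool.submit(print_error(change_data), x) for x in range(0, 10)]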

tvvister commented 2 months ago

Sure, I will use the idea you expressed in the previous message.

But I still cannot internally accept this design. I assume it will take a little bit more time. 🙂 I am used to the design applied in .NET; maybe that is the reason for my slowness. In .NET you can set one global handler for every exception in a separate thread.

Here it is described in much more detail:

https://stackoverflow.com/questions/32261014/catching-unhandled-exceptions-with-asp-net-webapi-and-tpl