python / cpython

The Python programming language
https://www.python.org

create_task() recommendation seems like bad practice #104091

Open Dreamsorcerer opened 1 year ago

Dreamsorcerer commented 1 year ago

Documentation

My understanding of asyncio is that all tasks etc. should be awaited. If this is not done, then exceptions in tasks will never be seen and asyncio will give a "Task exception was never retrieved" warning.

The documentation currently recommends this code for a fire-and-forget task creation:

background_tasks = set()
for i in range(10):
    task = asyncio.create_task(some_coro(param=i))
    background_tasks.add(task)
    task.add_done_callback(background_tasks.discard)

https://docs.python.org/3/library/asyncio-task.html#asyncio.create_task

This seems like bad practice, as the tasks are never awaited (another potential problem: if this is copied inside a function, won't the set get garbage collected after the function ends?). To my knowledge, there is no trivial way to do fire-and-forget tasks using the stdlib, and I have always recommended that users try the aiojobs library to handle this, e.g. in the aiohttp documentation: https://docs.aiohttp.org/en/latest/web_advanced.html#web-handler-cancellation (scroll down to the second warning and the paragraph preceding it).

I feel this example should probably be scrapped; a recommendation to use aiojobs for such tasks might be better.

I'd also suggest including the text of both error messages ("Task exception was never retrieved" and "Task was destroyed but it is pending!"), so that people putting either error into a search engine are likely to end up on this page.

carljm commented 1 year ago

I think there's a misunderstanding here about what create_task does. Which is understandable: the name create_task is arguably misleading by omission. It isn't just a pure function that creates and returns a Task instance; it creates a task and schedules it to the event loop. (See the documentation.)

This means the coroutine doesn't need to be explicitly awaited, because the event loop will take care of ensuring that it gets waited on. ~The event loop also keeps a reference to the Task object, which keeps a reference to the coroutine, so nothing will be garbage collected until it's done running.~ (Edit: this is wrong.)

So this documented recommendation for fire-and-forget task creation does work fine, and is not bad practice (except inasmuch as TaskGroups are probably a better option in most cases.)
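
For illustration, a minimal sketch of the TaskGroup alternative (Python 3.11+; some_coro stands in for the docs example's placeholder):

import asyncio

async def some_coro(param):
    ...  # stand-in for real work

async def main():
    # Every task is awaited when the block exits, so a failure
    # surfaces as an ExceptionGroup instead of being dropped.
    async with asyncio.TaskGroup() as tg:
        for i in range(10):
            tg.create_task(some_coro(param=i))

asyncio.run(main())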

Dreamsorcerer commented 1 year ago

This means the coroutine doesn't need to be explicitly awaited, because the event loop will take care of ensuring that it gets waited on.

No, it ensures that it executes, not that it gets awaited. You can't get an "exception not retrieved" warning if the code hasn't run to create an exception...

Simple reproducer:

>>> import asyncio
>>> async def foo():
...  raise RuntimeError()
... 
>>> async def main():
...  asyncio.create_task(foo())
...  await asyncio.sleep(1)
... 
>>> asyncio.run(main())
Task exception was never retrieved
 ...

Well written code should never see "Task exception was never retrieved".

So this documented recommendation for fire-and-forget task creation does work fine, and is not bad practice (except inasmuch as TaskGroups are probably a better option in most cases.)

TaskGroups can only be used for this if you create some kind of global TaskGroup for your application that you keep around till the application ends. This can work, but in something like a server application, I think you'd only get the exceptions when the server shuts down, which could be weeks after the exception happened. So, for server applications, it'd still be best to use aiojobs or similar.
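
Roughly, the aiojobs pattern looks like this (a sketch based on its documented Scheduler API; details vary between versions, and older releases used await aiojobs.create_scheduler() instead):

import aiojobs

async def some_coro(param):
    ...  # stand-in for real work

async def main():
    # aiojobs reports job failures as they happen (via the event
    # loop's exception handler by default), not at shutdown.
    scheduler = aiojobs.Scheduler()
    await scheduler.spawn(some_coro(param=1))  # fire-and-forget
    ...
    await scheduler.close()  # cancels/awaits outstanding jobs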

carljm commented 1 year ago

Sorry, I did misunderstand your report.

The name "fire-and-forget" specifically suggests that you don't care if it succeeds or fails. If you are going to check the result/exception, then it isn't really fire and forget.

I'm not sure if we should be recommending a specific third-party library here, but it may make sense to add a note in the documentation that fire-and-forget means you won't know if it succeeded or failed, and you may get unretrieved exception warnings in case of failure.

I'll reopen this so asyncio experts can weigh in on what, if anything, should be improved here.

Dreamsorcerer commented 1 year ago

Yeah, I'm not too sure about linking to a third-party library either, I just don't have a better suggestion (although TaskGroup can certainly be recommended in some circumstances).

Good engineering practice is not to ignore all exceptions though, so I'm sceptical that this was the intended design rather than something that was overlooked. Typically, a user wants to run tasks in the background without waiting for them to complete or needing to retrieve the results. That doesn't mean they don't want to know when they've broken their code and it's not running anything...

gvanrossum commented 1 year ago

The coroutine running in the background task could contain a try/except clause that catches all exceptions and logs them. You could easily make a decorator that does this for your application so you don't have to repeat the try/except/log boilerplate in every background task you write.
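
Such a decorator might look something like this (a sketch; the names are illustrative):

import functools
import logging

def log_exceptions(coro_func):
    # Wrap a coroutine function so any exception is logged instead
    # of becoming an unretrieved task exception.
    @functools.wraps(coro_func)
    async def wrapper(*args, **kwargs):
        try:
            return await coro_func(*args, **kwargs)
        except Exception:
            logging.exception("Background task %s failed", coro_func.__name__)
    return wrapper

@log_exceptions
async def my_background_job():
    ...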

We're working on additional flexibility for TaskGroup that will make this pattern simpler. (See e.g. https://discuss.python.org/t/revisiting-persistenttaskgroup-with-kotlins-supervisorscope/18417)

Dreamsorcerer commented 1 year ago

We're working on additional flexibility for TaskGroup that will make this pattern simpler. (See e.g. https://discuss.python.org/t/revisiting-persistenttaskgroup-with-kotlins-supervisorscope/18417)

Right, I think if TaskGroup starts supporting logging exceptions as they occur, then it should be able to replace aiojobs.Scheduler.

So, I think we should be recommending a long-lived TaskGroup for fire-and-forget tasks then, instead of the currently documented workaround.

It might also be worth having a TaskGroup.close() (and .start()?) to support this use case too. e.g. In startup/shutdown hooks:

async def on_startup(app):
    app["tg"] = asyncio.TaskGroup()
async def on_shutdown(app):
    await app["tg"].close()

In aiohttp, we have cleanup_ctx, where we'd use async with, but I'm sure there are frameworks without that kind of hook.

(Edit: I see discussion of a .shutdown() in that conversation, so maybe that's already coming).

gvanrossum commented 1 year ago

TaskGroup itself won't be logging exceptions, you will have to write your own exception handler or done-callback for that.

For TaskGroup.start() and .close() you can call __aenter__() and __aexit__() directly (these are public APIs).
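
Applied to the startup/shutdown hooks sketched earlier (app being the hypothetical application object from that snippet), that would look something like:

import asyncio

async def on_startup(app):
    tg = asyncio.TaskGroup()
    await tg.__aenter__()  # same effect as entering "async with tg:"
    app["tg"] = tg

async def on_shutdown(app):
    # Waits for the remaining tasks, like leaving the async with block.
    await app["tg"].__aexit__(None, None, None)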

I don't see an action item here.

Dreamsorcerer commented 1 year ago

TaskGroup itself won't be logging exceptions, you will have to write your own exception handler or done-callback for that.

But, it will default to asyncio.get_running_loop().call_exception_handler(), right? That's the same as aiojobs, so it should still work as a simple replacement.

For TaskGroup.start() and .close() you can call __aenter__() and __aexit__() directly (these are public APIs).

Fair enough. I think people generally discourage calling magic methods directly, so I was thinking users would be more comfortable with a public (I mean, no _ prefix) method. I think that'd also be consistent with other objects (e.g. file-like objects have .close() in addition to __exit__()). Either way, it'll work though.

gvanrossum commented 1 year ago

~> But, it will default to asyncio.get_running_loop().call_exception_handler(), right? That's the same as aiojobs, so it should still work as a simple replacement.~

~I would like to push back on this a little. Why do you want it to call call_exception_handler()? Looking over the asyncio source code, that function is generally only called when there is no alternative, e.g. when an exception is caught in a finalizer.~

~The regTaskGroup doesn't call it when a task crashes~

[Submitted by mistake, see full comment below.]

gvanrossum commented 1 year ago

Whoops, closed by mistake. Will resume in a bit.

gvanrossum commented 1 year ago

But, it will default to asyncio.get_running_loop().call_exception_handler(), right? That's the same as aiojobs, so it should still work as a simple replacement.

I would like to push back on this a little. Why do you want it to call call_exception_handler()? Looking over the asyncio source code, that function is generally only called when there is no alternative, e.g. when an exception is caught in a finalizer.

The regular TaskGroup doesn't call it when a task fails, so why should the variant for background tasks call it? I still think that there are already plenty of other ways for the user to arrange for task failures to be logged, e.g. by wrapping the task's coroutine in a handler that logs the exception, or by adding a done-callback to do the same. (@achimnol seems to agree, his latest Supervisor design does not do this.)

(Though TBH I'm not 100% sold on my own argument here, so maybe you have a good argument?)
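
For reference, the done-callback variant might be sketched like this (helper names are hypothetical):

import asyncio
import logging

def _log_task_failure(task):
    # Calling task.exception() retrieves the exception, so asyncio
    # will not warn "Task exception was never retrieved".
    if not task.cancelled() and task.exception() is not None:
        logging.error("Task %r failed", task, exc_info=task.exception())

async def spawn_logged(coro):
    task = asyncio.create_task(coro)
    task.add_done_callback(_log_task_failure)
    return task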

Fair enough. I think people generally discourage calling magic methods directly, so I was thinking users would be more comfortable with a public (I mean, no _ prefix) method. I think that'd also be consistent with other objects (e.g. file-like objects have .close() in addition to __exit__()). Either way, it'll work though.

Technically, a dunder is not "private" or "protected" in the sense that a single or double underscore conveys. For files, close() long predated __exit__(). I'm not too keen on violating TOOWTDI here. But maybe the Supervisor class shouldn't be an async context manager, solving that problem.

Regarding the original doc example, I don't necessarily like making all examples industrial-strength, because (to me) the purpose of an example is understanding, not a copy-pastable solution. (You end up with very verbose examples if you strive for the latter.) I'd be okay with a small update to that particular example though that reminds the reader that if the task fails, no exception will be printed anywhere. The user can then decide what to do about that.
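
One possible shape for such a reminder, added to the existing docs snippet as a comment:

background_tasks = set()
for i in range(10):
    task = asyncio.create_task(some_coro(param=i))
    background_tasks.add(task)
    task.add_done_callback(background_tasks.discard)
    # Note: if some_coro fails, nothing here retrieves the exception;
    # asyncio will emit "Task exception was never retrieved" when the
    # task is garbage-collected.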

Dreamsorcerer commented 1 year ago

I would like to push back on this a little. Why do you want it to call call_exception_handler()?

It was mentioned at https://discuss.python.org/t/revisiting-persistenttaskgroup-with-kotlins-supervisorscope/18417/12, so I thought it was already part of the plan.

I was thinking that aiohttp installs its own default handler, but now that I look, I'm not so sure. I'll give it a try later, but basically I'd expect to at least have exceptions appear on stdout if no log handler is provided.

The regular TaskGroup doesn't call it when a task fails, so why should the variant for background tasks call it?

In that case the exception is reraised and can then be handled by other code, right? Whereas a background task could only be reraised when the application ends. So, in this scenario, I'd expect the exception to be handled immediately and not raise anything on shutdown.

I still think that there are already plenty of other ways for the user to arrange for task failures to be logged, e.g. by wrapping the task's coroutine in a handler that logs the exception, or by adding a done-callback to do the same. (@achimnol seems to agree, his latest Supervisor design does not do this.)

(Though TBH I'm not 100% sold on my own argument here, so maybe you have a good argument?)

My argument is really just to do the expected thing by default, i.e. a user who creates this Supervisor without much knowledge of best practices should not end up spending hours or days trying to figure out why their code is not behaving correctly, simply because they never even knew an exception was happening in their code.

It would also just be convenient, not having to add (and remember to add) a custom decorator or whatever every time you quickly create a task to throw into the Supervisor.

But maybe the Supervisor class shouldn't be an async context manager, solving that problem.

I'm not too bothered either way, as I say, we have cleanup_ctx in aiohttp, so it'll either look like:

async with Supervisor() as s:
    app["supervisor"] = s
    yield

or

app["supervisor"] = Supervisor()
yield
await app["supervisor"].close()

gvanrossum commented 1 year ago

I would like to push back on this a little. Why do you want it to call call_exception_handler()?

It was mentioned at https://discuss.python.org/t/revisiting-persistenttaskgroup-with-kotlins-supervisorscope/18417/12, so I thought it was already part of the plan.

I have to apologize, my Discourse mailing list mode was off for a while and I missed that post (and many others).

But, in https://discuss.python.org/t/asyncio-tasks-and-exception-handling-recommended-idioms/23806/11 it seems @achimnol is reversing course; I found no trace of call_exception_handler() in the code linked from there. (Yes, it's called, but only in a branch that the original author of TaskGroup wasn't even sure could be reached, according to the comment.)

I was thinking that aiohttp installs its own default handler, but now that I look, I'm not so sure. I'll give it a try later, but basically I'd expect to at least have exceptions appear on stdout if no log handler is provided.

The regular TaskGroup doesn't call it when a task fails, so why should the variant for background tasks call it?

In that case the exception is reraised and can then be handled by other code, right? Whereas a background task could only be reraised when the application ends. So, in this scenario, I'd expect the exception to be handled immediately and not raise anything on shutdown.

I still think that there are already plenty of other ways for the user to arrange for task failures to be logged, e.g. by wrapping the task's coroutine in a handler that logs the exception, or by adding a done-callback to do the same. (@achimnol seems to agree, his latest Supervisor design does not do this.) (Though TBH I'm not 100% sold on my own argument here, so maybe you have a good argument?)

My argument is really just to do the expected thing by default, i.e. a user who creates this Supervisor without much knowledge of best practices should not end up spending hours or days trying to figure out why their code is not behaving correctly, simply because they never even knew an exception was happening in their code.

It would also just be convenient, not having to add (and remember to add) a custom decorator or whatever every time you quickly create a task to throw into the Supervisor.

Okay, you have convinced me. I'd like @achimnol's response.

Dreamsorcerer commented 1 year ago

One last point I'd say on the logging front is that for smaller personal projects it's pretty common to not configure logging at all. In that case, it's expected to see exceptions/warnings when you run the application from a terminal or by looking at the web server logs, as they would be sent to stdout/stderr by default.

achimnol commented 1 year ago

I've been busy closing my company's series A funding along with several big customer follow-ups. 😞 Recently, I began to look into this issue again, rewriting my aiotools experiments to make the key concepts in TaskGroup/Supervisor reusable for writing various coroutine aggregation patterns.

https://github.com/achimnol/aiotools/pull/58

I agree that "a user should not end up spending hours or days trying to figure out why their code is not behaving correctly, simply because they never even knew an exception was happening in their code", because I have also experienced this many times. 😭

willingc commented 3 months ago

Moving this issue status back to To Do as I do not believe that anyone is currently documenting this. But it may be helpful for someone interested in Docs to do so.