groove-x / trio-util

Utility library for the Python Trio async/await framework
https://trio-util.readthedocs.io/
MIT License

Enhancement request: Task class #20

Closed arthur-tacca closed 2 years ago

arthur-tacca commented 2 years ago

Update

I have now put together my idea of a task-like class into its own package: aioresult. It includes a ResultCapture class that runs a function and stores the result (like the Task class below) and also a Future class for when you want to set the value manually. It has functions for waiting (though normally just using a nursery would do); its wait_any() and wait_all() are slightly different from those in trio-util, but the trio-util ones would also work if you passed ResultCapture.run() to them.

Original post

It would be great to have a Trio Task class, a bit like asyncio.Task or any number of other frameworks' task classes.

This could be a core part of Trio itself and I think I've seen it requested before, but I've realised it could actually be a fairly simple external class so could be ideal for trio-util.

The key idea is that, in this class, the task gets run in a user-supplied nursery. There are a couple of ways to do this, but I think the most natural is to make the user run the task separately from creating it: the class has an async run() method that actually runs the task, and you can pass that method straight to Nursery.start_soon().

Here's a simple example that shows how you'd use this hypothetical Task object to run a few coroutines in a nursery, pretty similarly to usual, but then inspect their results later:

tasks = {url: Task(my_async_fetch_fn, url) for url in url_list}
async with trio.open_nursery() as nursery:
    for t in tasks.values():
        nursery.start_soon(t.run)
for url, t in tasks.items():
    print(f"At {url} got: {t.result}")

This would satisfy a few common requests made for Trio, most notably for an equivalent of asyncio's gather() function or for Nursery.start_soon() to give a way to get the return value of the async function. Actually, this is better than gather() because you're not forced to use a list, as you can see from the example above. It also nicely complements the wait_any() and wait_all() functions in trio-util (although it still would mainly be used with nurseries as in the above example).

Here's a really minimal implementation (it wouldn't handle enough cases for real use but does illustrate the idea):

class Task:
    def __init__(self, routine, *args):
        self.routine = routine
        self.args = args
        self.result = None
    async def run(self):
        self.result = await self.routine(*self.args)

Extra features

As I said, the above implementation is absolutely minimal. There are quite a few extra features I think could be useful:

Here are a couple more for completeness, even though personally I don't like them:

Relevant past issues

In trio-util:

In trio:

In other libraries:

Elsewhere:

arthur-tacca commented 2 years ago

I've put together an implementation:

https://gist.github.com/arthur-tacca/32c9b5fa81294850cabc890f4a898a4e

I've renamed it ResultCapture based on feedback in Trio issues, which I think nicely stresses that it's about getting the coroutine's result rather than complex machinery for interdependent tasks.

Is there any interest here? Would it be worth me putting together a pull request?

Edit: Now in its own library: https://github.com/arthur-tacca/aioresult

belm0 commented 2 years ago

Hi-- sorry, I had mistakenly dropped this from my TODO list.

Our application has pushed Trio fairly hard for 3 years (now 100k lines of code), and I haven't come across this kind of case enough to encapsulate it.

I was focusing on your original use case a little:

The reason I wanted something like this in the first place was so I could wait for completion of a task being spawned in a different nursery (I don't even want the result!) – see this gitter thread.

task = handler_nursery.start_soon(myhandler)
await task

I think it could be covered by just a utility function and the task_status protocol:

async def done_wrapper(f, *args, task_status=trio.TASK_STATUS_IGNORED):
    event = trio.Event()
    task_status.started(event)
    try:
        await f(*args)
    finally:
        event.set()

done = await handler_nursery.start(done_wrapper, myhandler)
await done.wait()

arthur-tacca commented 2 years ago

Thanks for looking at this @belm0

I was focusing on your original use case a little: ...

You're totally right that in my original use case I don't need 90% of what I'm suggesting. A wrapper around the handler that sets an Event is all that's needed. As you can see from that gitter thread, currently I'm just doing that as part of the wider function that uses the handler, but it felt a bit messy to mix that up with its core logic. (In truth, there's so little code that doing any refactoring at all has debatable value.)

Thanks for your done_wrapper() idea, I like it a lot. It avoids dumping all my code into one function, while being laser focused on actually solving my problem, rather than coming up with some super general API. Using the task_status protocol is a really clever way of achieving it.

There's certainly some interest in a general task / result capture class (as all the links in my post show, and actually it came up again today on another Trio gitter thread). But it's clear you're not interested in it in your library, which is totally fair enough, especially since there's a lot of debate about what the design would be (also clear from that thread). The gist I posted earlier exists if anyone wants to use it, and I might try to publish my code as a standalone package on PyPI (if I magically find some free time). So I'll close this issue.