python-trio / trio-asyncio

a re-implementation of the asyncio mainloop on top of Trio
Other
189 stars 38 forks source link

API: highlevel strategy discussion #42

Open njsmith opened 6 years ago

njsmith commented 6 years ago

I feel like trio-asyncio's core functionality is been pretty solid for a while, but we're still flailing a bit trying to find the most usable way to expose it to users. I guess right now it has 3 different APIs, we're discussing a 4th, and it's not clear whether any of them is actually what we want? And I'm frustrated that I don't feel like I understand what the pitfalls are or what features users actually need. And my main coping strategy for that kind of frustration is to open an issue and write a bunch of text to organize my (our?) thoughts, so here we go. Hopefully laying out all the information in one place will give us a clearer picture of what the trade-offs actually are, and help us find the best way forward. (4 overlapping APIs and constant churn cannot be the solution!)

What I think I know

Strategies we've tried:

Some tentative conclusions?

So....... maybe I've argued myself into thinking that async with aio_mode / async with trio_mode really are something we have to dig into more, and might even be The Solution To All Our Problems? There are still a bunch of details to work out first to figure out how these can work, but maybe we should do that.

They even make a reasonable substitute for @aio2trio/@trio2aio, e.g. instead of

@aio2trio
async def this_is_called_by_homeassistant():
    # And I write Trio code inside it

You write

async def this_is_called_by_homeassistant():
    async with trio_mode:
        # And I write Trio code inside it

One extra level of indentation, but the same number of lines, and no need for anything really annoying like writing a trampoline function.

What am I missing?

Probably a bunch of stuff, but hopefully laying out my thinking will make it obvious to someone what terrible mistakes I'm making? Please help me be smarter :-)

njsmith commented 6 years ago

A few more questions that didn't fit into the above:

pquentin commented 6 years ago

Thanks for the write up!

I don't know what you're missing and never used trio-asyncio, but if your premises are correct, I do agree with your conclusions: async with aio_mode / async with trio_mode should be the primary API and run_asyncio / run_future / run_trio should be put in a hazmat/core module.

Fuyukai commented 6 years ago

In my opinion, the API should be simply the async with ... style, maybe with the allow_asyncio helper included optionally.

The basic primitives should be exposed, but like, not the way to do anything unless you need them. This would be, in my opinion, the simplest and easiest way to do everything.

smurfix commented 6 years ago

@njsmith You're right in that aio_as_trio etc. are a bit cumbersome (though I got quite used to it). The problem I have with run_asyncio etc. (and I've gotten some feedback saying that I'm not the only one) is that it's not quite as obvious whether the wrapped code is asyncio code, or the caller is asyncio code …

The async with trio_mode: idea would be cool to pull off / have as the main interface, but I'm afraid that somebody else will have to implement it – I've been laid up for a month+ and the Real Work I've accumulated feels like it'll take a couple of lifetimes to clean up. :-/

njsmith commented 6 years ago

I've been laid up for a month+ and the Real Work I've accumulated feels like it'll take a couple of lifetimes to clean up. :-/

@smurfix Oof, I hear you on that :-(. Hope you're feeling better, and good luck.

njsmith commented 6 years ago

It looks like the three primitives we need to implement coroutine runner switching are:

On the Trio side, we control everything, so these are all pretty straightforward to implement. On the asyncio side I think we can implement these as:

A first attempt would be to (a) yield a Future that will never be resumed, (b) call asyncio.tasks._unregister_task. This is a bit gross (uses internal APIs), and has a flaw:

https://github.com/python/cpython/blob/fc439d20de32b0ebccca79a96e31f83b85ec4eaf/Lib/asyncio/tasks.py#L137-L146

self._state == futures._PENDING will be true whenever the task hasn't completed, so in general this logic might fire after we abandon a task to the abyss. If we only had to care about the Python implementation of Task shown above, this wouldn't be an issue, because we could just mutate task._state ourselves to pretend that it had finished. But in the C implementation that's used by default, _state is immutable. (Try it: f = asyncio.Future(); f._state = "CANCELLED" → exception.) We could mutate it with ctypes, but that's pretty ugly, so let's see if there are any other options. I looked for methods that mutate _state, and found Future.cancel. (Note: not Task.cancel. Task inherits from Future, but overrides the cancel method to do something completely different, so normally Future.cancel would never run on a Task object. But, Python allows us to do it explicitly if we want, by writing Future.cancel(task_obj).) This could work, but it also schedules all registered callbacks to run, which is... maybe not what we want? Or is it? If we're killing a task like this, maybe we should run any registered callbacks? For trio-asyncio's purposes, where the task represents a single async with aio_mode embedded inside a larger function, it would be pretty perverse to have any callbacks registered for the task. You'd have to do something like asyncio.current_task().add_done_callback(...), which makes no sense. Anyway, those seem to be the options for manipulating task._state.

Alternatively we could try mutating task._log_destroy_pending. If that's False, then it defangs the Task.__del__ code shown above. And... oh nice, it looks like _log_destroy_pending actually is mutable from the Python level. I guess because asyncio.gather mutates it for some reason.

We should also check Future.__del__, since Task.__del__ invokes it:

https://github.com/python/cpython/blob/fc439d20de32b0ebccca79a96e31f83b85ec4eaf/Lib/asyncio/futures.py#L90-L104

So we might want to make sure __log_traceback is set to False. Fortunately it looks like we can do that by simply writing task._log_traceback = False.

These options are all pretty terrible; if we go ahead with this we should talk to the asyncio folks about making sure that in 3.8 there's a nicer API for this, or at least that our terrible hacks don't break, to avoid a repeat of #23. BUT, it does look like there aren't any show-stoppers currently, so we should probably prototype this out and decide whether we're actually going to commit to it before we talk to the asyncio folks about it.

njsmith commented 5 years ago

You know what else might help makes things less confusing?

We should make it so that if you try to enter trio-mode when you're already in trio-mode, or try to enter aio-mode when you're already in aio-mode, then that's fine and just a no-op.

That switches the mindset from "mark the places where you transition", to a mindset of "mark what you need in each place". And in particular if you write something like

@trio_mode
async def blah(...):
    ...

then blah is now ambidextrous: you can call it freely from either aio-mode or from trio-mode.

(It's actually possible this is already how it works? But the aio_as_trio names certainly make it sound like they require one mode on one side and the other mode on the other side.)

smurfix commented 5 years ago

Yeah, I had that idea originally, but postponed it – as it turns out it's 3.7+ only, because it needs context management from asyncio.

njsmith commented 5 years ago

But, if we're using our implementation of the loop, surely we can make sure to set mode = "aio" in the context whenever we're invoking an aio callback? (In fact we don't even need to... we can just set it once in the management task that the loop uses to invoke callbacks.)

smurfix commented 5 years ago

Tried that. Ran into a couple of interesting cases where this is not as easy as it seems. But I'll re-check soon(ish).

graingert commented 2 years ago

I made this https://gist.github.com/graingert/d20fdaa41511c4cccb756259ee477444#file-trio_in_asyncio-py-L149-L158 does that help?

graingert commented 2 years ago

Kill a task permanently, as if the coroutine object had completed (with some value or exception), even though it hasn't. Used when exiting a task that we don't plan to re-use (e.g., when exiting an async with aio_mode block, we need to free the asyncio backing task that we created, because it will never run again).

To do this you only need to raise StopIteration out of the Coroutine you pass to asyncio.create_task