python-trio / trio-asyncio

a re-implementation of the asyncio mainloop on top of Trio
Other
189 stars 38 forks source link

Improve the documentation: asyncio ⇒ trio migration #11

Open miracle2k opened 6 years ago

miracle2k commented 6 years ago

Even the second time I tried to install trio_asyncio, after getting it running for the first time, I had trouble making it work. I don't think the docs explain it well, and I think what I am missing is really just a full example. I will try to submit a merge request with some improvements, let's see.

Just as an example, the docs at the very beginning say I should write:

import trio
import trio_asyncio

trio_asyncio.run(async_main, *args)

However, for me this leads to an exception: RuntimeError: You cannot replace an event loop.. I need to do trio.run at the root level (and then later the call to open_loop) to make it work.

Not sure if this is a bug in the code, a bug in the documentation, or just something I am misreading.

miracle2k commented 6 years ago

As an example, here is a couple things I tried after reading the docs:

async def async_main(*args):
    async with trio_asyncio.open_loop() as loop:  # <--- RuntimeError: You cannot replace an event loop.
        await asyncio.sleep(1) 

trio_asyncio.run(async_main)
async def async_main(*args):
    async with trio_asyncio.open_loop() as loop:   
        await asyncio.sleep(1)  # <-- TypeError: trio.run received unrecognized yield message <Future pending>

trio_asyncio.run(async_main)
async def async_main(*args):
    async with trio_asyncio.open_loop() as loop:
        await loop.run_asyncio(asyncio.sleep, 1)

trio.run(async_main)

This finally works. I think it is basically the first section Trio main loop that is confusing. If I only saw the section Asyncio main loop I could figure it out.

smurfix commented 6 years ago

Umm, why do you try to open a loop from within trio_asyncio.run()?

Anyway, you're right that this is confusing. I'm fixing that part of the documentation.

njsmith commented 6 years ago

Honestly I think we should remove trio_asyncio.run entirely, it just encourages bad habits...

smurfix commented 6 years ago

@miracle2k Is the current version better? ;-)

@njsmith On the contrary. The good habit is to establish an asyncio context as soon as possible and keep it running. The bad habit (which I've already seen a question about) is why restarting the asyncio loop doesn't work – after all, you can easily write something like

async def trio_main():
    loop = trio_asyncio.open_loop()
    async with loop:
        await some_asyncio_thing()
    … non-asyncio-using code here …
    async with loop:
        await another_asyncio_thing()

(or worse). This kind of code can't be made to work. (I tried.)

njsmith commented 6 years ago

Ah, sure, restarting an event loop is definitely a bad habit (and I agree that trio-asyncio shouldn't try to support it). Even in pure asyncio programs, the new recommendation (in 3.7+) is to have a single call to asyncio.run that sets up the event loop, runs your code, and then closes it again.

However, a core decision in trio is to get rid of having a global event loop, because of all the problems that it can cause, especially with background tasks and callback registrations "leaking out" of their original context. Now, of course trio-asyncio can't be too dogmatic about this; if we want to let people run asyncio code, we have to hold our noses and do things the asyncio way, to some extent. But, we should still try to contain the problems as much as possible, and encourage people to move incrementally towards doing things in the trio way.

So for example, if you're writing a trio program and decide that for some particular operation you really need to use an asyncio library. You can write:

async def frob_the_whatsit():
    async with open_loop():
        response1 = await run_asyncio(aio_function, ...)
        response2 = await run_asyncio(aio_function, ...)
        return ...

Now it's possible that aio_function will accidentally leave behind some background tasks running in the loop. However, even if it does, the damage is contained to just the frob_the_whatsit function: when that function returns, they'll automatically be cleaned up. The rest of the program continues to follow trio's normal rules.

Of course, it's also perfectly reasonable – especially when trying to transition from asyncio to trio – to create a single asyncio loop at the top of your program and use it throughout:

async def trio_main():
    async with open_loop():
        await run_asyncio(asyncio_main)

trio.run(trio_main)

But ideally you'll eventually move from this style, to the style I showed above where the use of asyncio is contained within particular parts of the task tree, until eventually you remove asyncio entirely, and at each step your program will become more "trionic" and you'll get the benefits of that.

So the reason I say trio_asyncio.run encourages bad habits, is that it encourages you to use a single global loop and to stick with it. The explicit async with open_loop() style is a little bit more verbose in the case where you do need an global loop, but I think we want to encourage people to move away from that, and by making the async with open_loop() explicit then it makes it easy for people to see that they could move that around. Removing trio_asyncio.run would also mean that applications and libraries would use trio-asyncio in the same way, which always makes things easier to understand and document.

...I just realized that my last comment on all three open issues (https://github.com/python-trio/trio-asyncio/issues/9#issuecomment-374509091, https://github.com/python-trio/trio-asyncio/issues/13#issuecomment-376368823, this one) are basically all about the same thing, so I guess there might be some deeper disagreement or misunderstanding here. Does this make any sense? Would it help to talk about it on chat/voice at some point to try to get more onto the same page?

smurfix commented 6 years ago

In principle I'm with you, and I agree that this should be the end result. However, another core tenet of Trio is to not require stupid loop= arguments which get carried around everywhere.

Nested or parallel asyncio loops are not a problem if you either do the loop=loop dance or if you have support for contextvars. I do not want to encourage people to do the former and we don't yet have the latter, so the "global loop" thing is the only viable option. trio_asyncio.run makes sense in that context.

As soon as we do support contextvars I plan to add testcases for parallel and nested trio+asyncio loops. Then I'll write a "how to incrementally convert your asyncio program to Trio" tutorial which starts with that global loop, proceeds to propagate Trio context into the program, adds localized loops to encapsulate access to those asyncio libraries or contexts you want or need to keep around, and then replaces trio_asyncio.run with trio.run. At that point I can reasonably add a deprecation warning to trio_asyncio.run.

njsmith commented 6 years ago

So I'm confused, because over in #9 you say that you're already using TaskLocal – which is exactly like contextvars – and that you want to switch away from it?

smurfix commented 6 years ago

Nah, I was confused, about TaskLocal. I seem to have had a brain fart or whatever.

Anyway I'll start with monkeypatching _get_event_loop(), then add a couple of test cases.

njsmith commented 6 years ago

Ah-hah! I'm glad we were able to figure it out then :-) There's a lot of moving parts here...

smurfix commented 6 years ago

Commit f865772 should do the "parallel and/or nesting loops" part.

Onwards to improving the documentation …