python-trio / trio-asyncio

a re-implementation of the asyncio mainloop on top of Trio
Other
189 stars 38 forks source link

Can we not call asyncio.set_event_loop_policy on import? #13

Closed miracle2k closed 6 years ago

miracle2k commented 6 years ago

I am using Trio and trio-asyncio as part of module structure, where some modules expect to be run in pure asyncio. Imagine we start a pure asyncio-thread, and a trio-thread, separately. They are like agents that can optionally run on separate hosts and communicate over the network.

Any global import of trio_asycio messes with the asyncio-only parts, even due trio is not really in use there. I then basically have to always import trio_asycio in a local function to make sure I don't change the event loop of the main thread.

smurfix commented 6 years ago

There is a problem here in that the asyncio event loop policy is global, not per-thread.

If we keep the default policy (i.e. use it with a trio-asyncio loop), at minimum we'd need a couple of test cases which demonstrate that this doesn't introduce any bugs.

miracle2k commented 6 years ago

I debugged this further. I have an error with the following code specifically:

import asyncio
import uvloop
import trio_asyncio

async def main():
    asyncio.set_event_loop_policy(uvloop.EventLoopPolicy())
    await asyncio.sleep(1)

loop = asyncio.get_event_loop()
loop.run_until_complete(main())

Now I am thinking that I messed this up already by myself, since I am trying to set an event loop policy while already having an active event loop, which probably makes this ineffectual.

Still, the fact that trio_asyncio implicitly changes the event loop on import seems irksome if I want to control it myself instead, I never know when an import may be triggered to change the policy.

Not sure what the right way to deal with this is from an asyncio perspective. Maybe I should just talk to my preferred policy directly when creating a loop?

miracle2k commented 6 years ago

The automatic import still causes me problems, even if I ignore any custom event policy I want to install. Assuming I just want to run a regular asyncio app, but I do have an (unused) trio_asyncio import anywhere in any modules, for example using one of the decorators):

import asyncio
import trio_asyncio
from concurrent.futures import ThreadPoolExecutor

async def main():
    executor = ThreadPoolExecutor(max_workers=1)
    await asyncio.get_event_loop().run_in_executor(executor, lambda: None)

loop = asyncio.get_event_loop()
loop.run_until_complete(main())
  File "/Users/michael/.local/share/virtualenvs/foobar/lib/python3.6/site-packages/trio_asyncio/base.py", line 408, in run_in_executor
    assert isinstance(executor, TrioExecutor)
AssertionError
smurfix commented 6 years ago

You should not change the loop policy while asyncio is running, that's pretty much independent of trio-asyncio.

The event loop policy is global. That's a design limitation of asyncio. Multiple policies are not supported; if that works anyway, you're lucky. Unfortunately, it seems that if you're adding trio_asyncio to the mix, your luck runs out.

asyncio.get_event_loop() is documented to ask the current event loop policy for the current thread's event loop (possibly). This no longer works when you switch policies. Your first code cannot work. At minimum you'll need to pass the current loop into every call to asyncio.whatever(), i.e. you need to call await asyncio.sleep(1, loop=loop).

Trio doesn't need a loop= argument because the active loop is obvious from the call's context. No loop=whatever argument anywhere. I'd like to support this behavior in trio-asyncio, because otherwise you'd need to add a loop= argument to every asyncio-trio-asyncio call chain. I can't do that, for obvious reasons.

As to your ThreadPoolExecutor example: yes, trio_asyncio's event loop requires you to use a TrioExecutor. I remember to have documented this. Unfortunately I don't think there's much I can do about that.

I might need to add a "why does «code pattern» not work with trio_asyncio" section to the documentation.

smurfix commented 6 years ago

That being said, I'll disable the unconditional loop policy change soon, but I'll need to fix a few test cases for that to continue to work.

njsmith commented 6 years ago

cough

"We should make sure that the loop is treated as the "current" event loop for asyncio. In practice, I think this means storing it in a trio.TaskLocal so we can find it again later, and then monkeypatching asyncio._get_running_loop and asyncio.events._get_running_loop [...]" – https://github.com/python-trio/trio-asyncio/issues/2#issue-290270468

"It took me a bit to remember why I had decided we needed to monkey-patch here... I think there are two reasons. [...] The second reason is more real, but only matters in a pretty obscure case. In theory, someone might want to run trio in on thread and, I dunno, uvloop or something in another thread. So ideally, we'd only be taking over the event loop policy in the trio thread, not globally across the process. But there is no way to do that using the usual tools [...] We could potentially wait until someone complains about this to worry about it." – https://github.com/python-trio/trio-asyncio/issues/2#issuecomment-359349845

Looks like someone has complained about it :-). My vote is we drop the policy entirely and switch to monkeypatching – it's gross but in practice I think it will cause less problems.

(Despite what I said in the quoted text above, I'm not sure if it'd be better to monkeypatch _get_running_loop or get_event_loop without looking at the code again more closely. It may not matter either way.)

smurfix commented 6 years ago

I have removed the call that's setting the TrioLoopPolicy. No monkeypatching required. You'll still need to set the policy in compatibility mode, but then you shouldn't use that anyway. :-P

Code is in the merge branch, awaiting Travis checks.

njsmith commented 6 years ago

How do we keep asyncio.get_event_loop() working if we don't set the policy or monkeypatch?

smurfix commented 6 years ago

That's easy. There are two thread-local storages for the current event loop. One is global, accessed via asyncio.event._get_event_loop, and only applies to the currently-running loop. The other is local to the event loop policy and remembers which loop is in use while outside of loop.run_forever().

This means that the first thing any event loop policy is required to do is to check asyncio.event._get_event_loop and return that if it's not None. Thus the trio-asyncio loop implementation simply calls asyncio.event._set_event_loop(self) when it starts, and we're done.

Things will get interesting when we support multiple loops per thread, via contextvars, because then I'll have to look up the actual loop in the current Trio thread, i.e. either monkeypatch or insert a shim loop object. Both have compatibility issues.

njsmith commented 6 years ago

This means that the first thing any event loop policy is required to do is to check asyncio.event._get_event_loop and return that if it's not None

FYI, this is not quite right – asyncio.get_event_loop checks this before it touches the policy: https://github.com/python/cpython/blob/da1734c58d2f97387ccc9676074717d38b044128/Lib/asyncio/events.py#L750-L753

Not that it helps us, we probably need to monkeypatch :-). But I guess it might be useful to know...

smurfix commented 6 years ago

Resolved in commit f865772 by way of monkey-patching asyncio.