agronholm / anyio

High level asynchronous concurrency and networking framework that works on top of either trio or asyncio
MIT License

Thread race condition in _eventloop.get_asynclib() (with fix) #425

Closed: danielrobbins closed this issue 3 months ago

danielrobbins commented 2 years ago

I have a threaded web spider using httpx that is triggering a race condition in the get_asynclib() function in anyio/_core/_eventloop.py. The race can be triggered when many threads initialize their own asyncio event loop and initialize httpx all at once. It happens intermittently -- sometimes we trigger the race, sometimes we don't. When we do, a thread catches get_asynclib() mid-initialization, and certain attributes that should be present are not yet available. I am using anyio 3.5.0, httpx 0.22.0 and Python 3.7.

httpx-exception.txt
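For illustration, here is a minimal, deterministic toy version of the race described above. The names (`cache`, `Backend`, `initializer`) are hypothetical, not anyio's actual code: one thread publishes an object into a shared cache before finishing its initialization, the way a module appears in sys.modules before its body has finished executing, and a second caller observes it mid-initialization.

```python
import threading

# Toy stand-in for sys.modules: a shared cache that is populated
# *before* the cached object is fully initialized.
cache = {}
mid_init = threading.Event()   # set once the entry exists but is incomplete
finish = threading.Event()     # released to let initialization complete

class Backend:
    pass

def initializer():
    backend = Backend()
    cache["backend"] = backend       # already visible to other threads
    mid_init.set()
    finish.wait()                    # simulate a slow module body
    backend.backend_class = "ready"  # attribute only set at the very end

t = threading.Thread(target=initializer)
t.start()
mid_init.wait()

# A second caller arriving now finds the cached entry, but not its attributes:
partially_initialized = not hasattr(cache["backend"], "backend_class")

finish.set()
t.join()
print(partially_initialized)  # True: the entry was observed mid-initialization
```

The events only make the interleaving reproducible; in the real bug the same window is opened by a slow module body.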

An ugly but functioning patch which I can confirm resolves the issue is below:

get_asynclib_thread_fix.patch.txt

In my test patch, I made the lock cover essentially the entire get_asynclib() call -- it could potentially be made tighter, though with questionable benefit.
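The attached patch isn't reproduced inline, but a minimal sketch of that kind of locked lookup might look like the following (a generic stand-in, not anyio's actual code; `locked_import` is a hypothetical name):

```python
import sys
import threading
from importlib import import_module

_import_lock = threading.Lock()

def locked_import(modulename):
    # Serialize the check-then-import so that no thread can observe a
    # module that another thread is still in the middle of initializing.
    with _import_lock:
        try:
            return sys.modules[modulename]
        except KeyError:
            return import_module(modulename)
```

The cost is that every call takes the lock, even after the module is fully imported -- which is exactly the "hot path" concern raised below.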

agronholm commented 2 years ago

I'm very hesitant to add locking to get_asynclib() which is "hot" code. I'll take a look at this and see if I can find a better solution.

agronholm commented 2 years ago

I do question the use case though – why would you run multiple event loops?

danielrobbins commented 2 years ago

If you are using threads, each thread needs its own event loop. So if you ever use ThreadPoolExecutor and want to run async code inside it, each worker needs its own event loop. That is the use case in my code: each thread performs network operations, so to use httpx I need an event loop to exist for the lifetime of the thread.
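That pattern can be sketched like this (simplified: asyncio.run() creates and closes a fresh loop per call rather than keeping one alive for the thread's lifetime as described above; `fetch` and `worker` are illustrative names):

```python
import asyncio
from concurrent.futures import ThreadPoolExecutor

async def fetch(n):
    # stand-in for an async httpx request
    await asyncio.sleep(0.01)
    return n * 2

def worker(n):
    # asyncio.run() gives this worker thread its own event loop
    return asyncio.run(fetch(n))

with ThreadPoolExecutor(max_workers=4) as pool:
    results = list(pool.map(worker, range(4)))
print(results)  # [0, 2, 4, 6]
```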

laker-93 commented 1 year ago

I wonder if this problem would still occur if you used ProcessPoolExecutor with the 'spawn' context rather than threads. That way each event loop would run in a separate memory space and your event loops would be isolated, so there would be no chance of threads racing against each other over shared state. I think separate processes are the correct solution here in any case, as they give you true parallelism.
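A sketch of that suggestion, assuming the work can be moved into top-level picklable functions (`square` and `work` are illustrative names, not from the thread above):

```python
import multiprocessing
from concurrent.futures import ProcessPoolExecutor

async def work(n):
    # stand-in for async work done inside the worker process
    return n * n

def square(n):
    import asyncio
    # Each process runs its own interpreter, so this event loop shares
    # no state with the loops in the other workers.
    return asyncio.run(work(n))

if __name__ == "__main__":
    ctx = multiprocessing.get_context("spawn")
    with ProcessPoolExecutor(max_workers=2, mp_context=ctx) as pool:
        results = list(pool.map(square, range(4)))
    print(results)  # [0, 1, 4, 9]
```

The __main__ guard is required with 'spawn', since child processes re-import the main module.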

graingert commented 1 year ago

Rather than using sys.modules, you can import into a global and return that.
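A sketch of that suggestion, using a stdlib module as a stand-in for the backend module (`get_backend` and `_backend` are hypothetical names): cache the module object in a global so that after the first call the hot path is a plain comparison and return; import_module() itself holds a per-module import lock, so unlike a raw sys.modules lookup it does not hand back a half-initialized module.

```python
from importlib import import_module

_backend = None  # module-level cache

def get_backend():
    global _backend
    if _backend is None:
        # Benign race: two threads may both reach this import, but
        # import_module() is idempotent and blocks on the per-module
        # import lock until the module body has fully executed.
        _backend = import_module("json")  # stand-in for a real backend
    return _backend
```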

DavidJiricek commented 7 months ago

This issue is still present in the get_async_backend() function. The error message is:

partially initialized module 'anyio._backends._asyncio' has no attribute 'backend_class' (most likely due to a circular import)

Steps to reproduce the error:

1. Add time.sleep(5) at https://github.com/agronholm/anyio/blob/master/src/anyio/_backends/_asyncio.py#L2481
2. Run the following script:

```python
import time
from concurrent.futures import ThreadPoolExecutor

from anyio._core._eventloop import get_async_backend

def w():
    return get_async_backend('asyncio')

with ThreadPoolExecutor(max_workers=2) as executor:
    f1 = executor.submit(w)
    time.sleep(1)   # let the first import get stuck in the sleep added above
    f2 = executor.submit(w)

print(f2.result())  # raises the AttributeError above
```

I made a PR to fix it. https://github.com/agronholm/anyio/pull/714

gpsa commented 3 months ago

Would it be possible to have this same fix added to anyio 3.x?

agronholm commented 3 months ago

I don't plan on updating v3.x anymore. Why are you still using it?

gpsa commented 3 months ago

We received a Snyk alert about it, and upon trying to upgrade, I noticed both FastAPI and httpx depend on it and don't allow upgrading to 4.x. I'd say that's just because the impact on big libraries is huge.

agronholm commented 3 months ago

Both FastAPI and httpx work with AnyIO 4.x. What was the problem?

gpsa commented 3 months ago

I figured it out. An httpx version constraint was preventing the use of anyio 4.x. After updating httpx to the latest version, the cascade of updates happened on both FastAPI and (especially) Starlette, which then upgraded anyio to 4.x.

agronholm commented 3 months ago

Well, good that you figured it out. I'm closing this as it was resolved by the linked PR.