Open adriangb opened 2 months ago
This should release the GIL and allow use in multiple threads.
I tested with this script:
    from contextlib import contextmanager
    import math
    from time import time
    from typing import Iterator

    import anyio
    import anyio.to_process
    import anyio.to_thread
    import object_store


    @contextmanager
    def timeit(name: str) -> Iterator[None]:
        start = time()
        yield
        print(f'{name} took {time() - start:.2f} seconds', flush=True)


    def work() -> None:
        object_store.ObjectStore('gs://adriangb-public-bucket').get('yellow_tripdata_2024-01 (1).parquet')


    async def awork(limiter: anyio.CapacityLimiter) -> None:
        await anyio.to_thread.run_sync(work, limiter=limiter)


    async def main() -> None:
        limiter = anyio.CapacityLimiter(math.inf)
        with timeit('main'):
            async with anyio.create_task_group() as tg:
                for _ in range(32):
                    tg.start_soon(awork, limiter)


    if __name__ == '__main__':
        anyio.run(main)
Locally there isn't much difference because I'm IO bound, but on GCP compute the script goes from ~15s to ~3s for me.
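For anyone who wants to see the effect without a bucket, here is a minimal sketch of why releasing the GIL matters for threaded callers. It does not use `object_store` at all: `blocking_call` is a hypothetical stand-in that uses `time.sleep`, which releases the GIL while waiting, so the numbers only illustrate the general pattern and are not a measurement of this change.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def blocking_call(delay: float) -> None:
    # Hypothetical stand-in for a native call that releases the GIL.
    # time.sleep releases the GIL while it waits, so other threads can run.
    time.sleep(delay)


def run_sequential(n: int, delay: float) -> float:
    # Run the calls one after another and return the elapsed wall time.
    start = time.perf_counter()
    for _ in range(n):
        blocking_call(delay)
    return time.perf_counter() - start


def run_threaded(n: int, delay: float) -> float:
    # Run the calls in n worker threads and return the elapsed wall time.
    start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=n) as pool:
        # list() drains the iterator so any worker exception is raised here.
        list(pool.map(blocking_call, [delay] * n))
    return time.perf_counter() - start


if __name__ == "__main__":
    print(f"sequential: {run_sequential(8, 0.1):.2f}s")
    print(f"threaded:   {run_threaded(8, 0.1):.2f}s")
```

If the stand-in held the GIL for its whole duration instead, the threaded run would take roughly as long as the sequential one, which is the kind of behaviour I would expect a call that holds the GIL to show.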
@roeap quick ping on this
@adriangb - sorry for being MIA for so long. Do you mind rebasing? I'm happy to review and merge after that.
@roeap will you push a 0.2.0 release after this one?