zarr-developers / zarr-python

An implementation of chunked, compressed, N-dimensional arrays for Python.
http://zarr.readthedocs.io/
MIT License
1.44k stars 272 forks source link

[v3] `StoreTests[RemoteStore]` #1956

Closed d-v-b closed 2 months ago

d-v-b commented 2 months ago

Attempts to specialize StoreTests with RemoteStore. Depends on #1785.

This is currently broken because I can't figure out the correct way to link up the s3 fixture with the StoreTests instance. Help would be appreciated!

TODO:

d-v-b commented 2 months ago

We are getting failures like this:

FAILED tests/v3/test_store/test_remote.py::TestRemoteStoreS3::test_get_partial_values[key_ranges3] - botocore.exceptions.HTTPClientError: An HTTP Client raised an unhandled exception: Task <Task pending name='Task-643293' coro=<_run_coros_in_chunks.<locals>._run_coro() running at /home/runner/.local/share/hatch/env/virtual/zarr/nOVTQ5-s/test.py3.11-1.24-optional/lib/python3.11/site-packages/fsspec/asyn.py:245> cb=[_wait.<locals>._on_completion() at /opt/hostedtoolcache/Python/3.11.9/x64/lib/python3.11/asyncio/tasks.py:519]> got Future <Future pending cb=[BaseSelectorEventLoop._sock_write_done(17, handle=<Handle BaseS....0.1', 5555))>)()]> attached to a different loop

cc @martindurant in case you have any insights here. This builds off #1785, but since I'm touching a lot of code here, I spun it out into a separate PR (and I opened a corresponding PR against your branch, which you can ignore if you would rather look here).

jhamman commented 2 months ago

Thanks @d-v-b for advancing this!

joshmoore commented 1 day ago

A heads up that I am likely in the throws of the same or at least similar issues here:

https://github.com/ome/ome2024-ngff-challenge/pull/37

The mock tests failed but also a production run against a production but non-standard --endpoint-url is failing with:

botocore.exceptions.HTTPClientError: An HTTP Client raised an unhandled exception: Task <Task pending name='Task-13' coro=<_runner() running at /opt/homebrew/Caskroom/mambaforge/base/envs/challenge4/lib/python3.12/site-packages/zarr/sync.py:51> cb=[_chain_future.<locals>._call_set_state() at /opt/homebrew/Caskroom/mambaforge/base/envs/challenge4/lib/python3.12/asyncio/futures.py:394]> got Future <Future pending cb=[BaseSelectorEventLoop._sock_write_done(10, handle=<Handle BaseS...96.70', 443))>)()]> attached to a different loop

I'll read through @d-v-b's history and see if it's the same problem or not.

Edit: yes, from https://github.com/zarr-developers/zarr-python/pull/1956#issuecomment-2158372784 it's the same. So a production bug and not just a testing issue. Note: I'm using the sync API.

cc: @martindurant

martindurant commented 1 day ago

If the s3fs instance is to be used in async mode (which is the case), then it must be created within a coroutine running on the same event loop and thread as when it will be called. You can assign the loop= kwargs and defer creation of the session object, but it's better to follow the above rule.

but non-standard --endpoint-url

This will make a new instance, which is probably the difference. fsspec will reuse instances having exactly the same set of init arguments unless otherwise directed.

joshmoore commented 1 day ago

Thanks, @martindurant! That makes sense conceptually, but can you give me a pointer on how to do that reliably for the new RemoteStore?

martindurant commented 1 day ago

I suppose RemoteStore(...) must also be done in a coroutine and either pass asynchronous=True explicitly or ensure that no other s3fs instance was created beforehand. Although I helped make the class, it was far from clear to me how and when it was to be used...