Closed: elyall closed this issue 12 months ago
Thanks for the detailed report. Unfortunately I don't know how to reproduce this, as all the remote S3 buckets I have access to are via https. E.g. this works OK:

```shell
napari --plugin napari-ome-zarr https://uk1s3.embassy.ebi.ac.uk/idr/zarr/v0.4/idr0001A/2551.zarr
```
We have access to a minio server, but I don't know how to replicate your workflow there.
I see you're using terraform, but I guess that needs more than a local install on my Mac?
A local install of terraform paired with an AWS account would work fine. But to make life easier I've temporarily made a publicly readable bucket and saved an HCS dataset to an ome-zarr file there. Please try using this file for testing: `s3://test-bucket-xbahruvc/test.ome.zarr`.
To generate this file I saved `skimage.data.cells3d` to path `A/2/0`. No other wells exist in the dataset, which is defined as a 96-well plate.

- `viewer.open(path, plugin="napari-ome-zarr")` produces the error I mentioned.
- `viewer.open(path + "/A/2/0", plugin="napari-ome-zarr")` works fine.

The issue is in `ome_zarr.io.ZarrLocation.subpath`, which can't handle `FSStore.fs.protocol == ["s3", "s3a"]`. It treats the S3 path as a URL by passing it to `urllib.parse.urljoin`, when it should be treated like a file path. This also means that Google Cloud Storage paths and Azure Blob Storage paths would fail to be handled correctly. There are a couple of simple ways to deal with this:
1. Use `cloudpathlib.AnyPath`, a class that wraps both `pathlib.Path` and `cloudpathlib.CloudPath` (which is itself a wrapper for `cloudpathlib`'s `S3Path`, `GSPath`, and `AzureBlobPath`). Then update the conditional so that `"file"`, `["s3", "s3a"]`, and the equivalents for Google Cloud Storage and Azure Blob Storage are treated the same and not passed to `urljoin`.
2. Handle `s3://`, `gs://`, and `az://` paths specifically.
3. Only treat the path as a URL `if self.__store.fs.protocol in ["http", "https"]`, and treat everything else as a file path (using string handling, since `Path` doesn't generalize to cloud paths, or adding in `cloudpathlib.AnyPath`).

Thanks for the proposed solutions... If options 2 or 3 are sufficient then it would be nicer not to add a dependency on cloudpathlib.
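For context, the `urljoin` behaviour behind this bug can be demonstrated directly: for schemes Python does not list as supporting relative joining (such as `s3`), `urljoin` discards the base entirely (the URLs below are illustrative):

```python
from urllib.parse import urljoin

# https is in urllib's list of schemes that support relative joining,
# so the subpath is appended to the base as expected
print(urljoin("https://example.com/test.ome.zarr/", "A/2/0"))
# -> https://example.com/test.ome.zarr/A/2/0

# s3 is not a recognised scheme, so urljoin returns the relative
# part unchanged and the bucket/base path is silently lost
print(urljoin("s3://test-bucket/test.ome.zarr/", "A/2/0"))
# -> A/2/0
```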
I was just trying to view your data with:

```python
viewer.open("s3://test-bucket-xbahruvc/test.ome.zarr/A/2/0", plugin="napari-ome-zarr")
```

but even this isn't working for me. I get `ValueError: Invalid endpoint: https://s3..amazonaws.com` and I don't understand where that is coming from!
Traceback (most recent call last):
File "/Users/wmoore/Desktop/ZARR/ome-zarr-py/ome_zarr/io.py", line 150, in get_json
data = self.__store.get(subpath)
File "/Users/wmoore/opt/anaconda3/envs/omeroweb2/lib/python3.9/_collections_abc.py", line 763, in get
return self[key]
File "/Users/wmoore/opt/anaconda3/envs/omeroweb2/lib/python3.9/site-packages/zarr/storage.py", line 1393, in __getitem__
return self.map[key]
File "/Users/wmoore/opt/anaconda3/envs/omeroweb2/lib/python3.9/site-packages/fsspec/mapping.py", line 143, in __getitem__
result = self.fs.cat(k)
File "/Users/wmoore/opt/anaconda3/envs/omeroweb2/lib/python3.9/site-packages/fsspec/asyn.py", line 115, in wrapper
return sync(self.loop, func, *args, **kwargs)
File "/Users/wmoore/opt/anaconda3/envs/omeroweb2/lib/python3.9/site-packages/fsspec/asyn.py", line 100, in sync
raise return_result
File "/Users/wmoore/opt/anaconda3/envs/omeroweb2/lib/python3.9/site-packages/fsspec/asyn.py", line 55, in _runner
result[0] = await coro
File "/Users/wmoore/opt/anaconda3/envs/omeroweb2/lib/python3.9/site-packages/fsspec/asyn.py", line 414, in _cat
raise ex
File "/Users/wmoore/opt/anaconda3/envs/omeroweb2/lib/python3.9/asyncio/tasks.py", line 442, in wait_for
return await fut
File "/Users/wmoore/opt/anaconda3/envs/omeroweb2/lib/python3.9/site-packages/s3fs/core.py", line 1050, in _cat_file
return await _error_wrapper(_call_and_read, retries=self.retries)
File "/Users/wmoore/opt/anaconda3/envs/omeroweb2/lib/python3.9/site-packages/s3fs/core.py", line 139, in _error_wrapper
raise err
File "/Users/wmoore/opt/anaconda3/envs/omeroweb2/lib/python3.9/site-packages/s3fs/core.py", line 112, in _error_wrapper
return await func(*args, **kwargs)
File "/Users/wmoore/opt/anaconda3/envs/omeroweb2/lib/python3.9/site-packages/s3fs/core.py", line 1037, in _call_and_read
resp = await self._call_s3(
File "/Users/wmoore/opt/anaconda3/envs/omeroweb2/lib/python3.9/site-packages/s3fs/core.py", line 340, in _call_s3
await self.set_session()
File "/Users/wmoore/opt/anaconda3/envs/omeroweb2/lib/python3.9/site-packages/s3fs/core.py", line 526, in set_session
self._s3 = await s3creator.__aenter__()
File "/Users/wmoore/opt/anaconda3/envs/omeroweb2/lib/python3.9/site-packages/aiobotocore/session.py", line 26, in __aenter__
self._client = await self._coro
File "/Users/wmoore/opt/anaconda3/envs/omeroweb2/lib/python3.9/site-packages/aiobotocore/session.py", line 193, in _create_client
client = await client_creator.create_client(
File "/Users/wmoore/opt/anaconda3/envs/omeroweb2/lib/python3.9/site-packages/aiobotocore/client.py", line 59, in create_client
client_args = self._get_client_args(
File "/Users/wmoore/opt/anaconda3/envs/omeroweb2/lib/python3.9/site-packages/aiobotocore/client.py", line 262, in _get_client_args
return args_creator.get_client_args(
File "/Users/wmoore/opt/anaconda3/envs/omeroweb2/lib/python3.9/site-packages/aiobotocore/args.py", line 70, in get_client_args
endpoint = endpoint_creator.create_endpoint(
File "/Users/wmoore/opt/anaconda3/envs/omeroweb2/lib/python3.9/site-packages/aiobotocore/endpoint.py", line 308, in create_endpoint
raise ValueError("Invalid endpoint: %s" % endpoint_url)
ValueError: Invalid endpoint: https://s3..amazonaws.com
Tried this on a couple of different conda environments. Even without napari, I get the same error with:

```python
from ome_zarr import io
zl = io.ZarrLocation("s3://test-bucket-xbahruvc/test.ome.zarr/A/2/0")
```
Googling the error I get this, which says your AWS profile is likely not configured correctly. From your error it looks like the path is passed off to `s3fs` and `botocore`, which is correct, but I'm guessing that even when accessing data on a public bucket those packages likely need `awscli` configured correctly. You can install awscli and run `aws configure`, making sure to pass a valid region (example list here). But you may also need a valid AWS Access Key and AWS Secret Access Key, which would require creating an AWS account, generating an IAM user, and creating an access key for the user. I'd first try just installing `awscli`, running `aws configure`, leaving the first two prompts as the default `None`, and setting a valid region (e.g. `us-east-1`). If that doesn't solve the problem then I can generate a temporary AWS user for you, but we'd need to connect via private chat to hand them off, as AWS scrapes GitHub looking for leaked credentials.
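For reference, `aws configure` just writes plain config files; setting only a region (and skipping the key prompts) results in a minimal `~/.aws/config` like this (the region value is just an example):

```
[default]
region = us-east-1
```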
I did create a solution that doesn't require adding a dependency, and I've successfully tested it on the example data above. You can find it at this pull request: https://github.com/ome/ome-zarr-py/pull/322.
@will-moore, another option for testing is to create an fsspec config file:
```
cat ~/.config/fsspec/conf.json
{
  "s3": {
    "anon": true
  }
}
```
@joshmoore Thanks, but having created that `conf.json` I'm still seeing the same behaviour :(
```
$ cat ~/.config/fsspec/conf.json
{
  "s3": {
    "anon": true
  }
}
```
Sorry, which of the behaviors?
I get the same error when trying to open with napari as I do with this:

```
>>> from ome_zarr import io
>>> zr = io.ZarrLocation("s3://test-bucket-xbahruvc/test.ome.zarr/A/2/0")
Error while loading JSON
```
(The traceback is identical to the one above, again ending in `ValueError: Invalid endpoint: https://s3..amazonaws.com`.)
Hmmm.... ok. I don't see the problem with or without the config. Either it's due to my platform or some combination of libraries:
Hmm - I seem to generally have slightly older packages than you (and a load of omero-web stuff).
Tried with a fresh conda env...
```
conda create -n omezarr python=3.9
conda activate omezarr
pip install ome-zarr
python
Python 3.9.18 (main, Sep 11 2023, 08:38:23)
[Clang 14.0.6 ] :: Anaconda, Inc. on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> from ome_zarr import io
>>> zr = io.ZarrLocation("s3://test-bucket-xbahruvc/test.ome.zarr/A/2/0")
```
This might be relevant: the test file is in a bucket in AWS region `us-west-2`.

@will-moore `ValueError: Invalid endpoint: https://s3..amazonaws.com` does show that somehow you're not incorporating the AWS region correctly, as a valid endpoint would be something like `https://s3.us-west-2.amazonaws.com`, where the region is incorporated.
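A sketch of why the error message looks the way it does: regional S3 endpoints interpolate the region into the hostname, so an empty/missing region leaves two adjacent dots. The helper below is illustrative of the pattern, not botocore's actual code:

```python
def s3_endpoint(region: str) -> str:
    # pattern of a regional S3 endpoint hostname; an empty region
    # leaves "s3.." in the host, matching the error above
    return f"https://s3.{region}.amazonaws.com"

print(s3_endpoint("us-west-2"))  # https://s3.us-west-2.amazonaws.com
print(s3_endpoint(""))           # https://s3..amazonaws.com
```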
I would like to get the pull request incorporated and keep it from going stale. How do we do that? I would also like to take that test file down.
Thanks for that info... I tried updating my fsspec config like this, and it worked!
```
$ cat ~/.config/fsspec/conf.json
{
  "s3": {
    "anon": true,
    "endpoint_url": "https://s3.us-west-2.amazonaws.com"
  }
}
```
With that, and your PR #322, the `Could not find first well` exception was fixed and I was able to view the single-well Plate:

```shell
napari --plugin napari-ome-zarr "s3://test-bucket-xbahruvc/test.ome.zarr"
```
When trying to read an HCS file on AWS S3 (e.g. `s3://bucket/file.ome.zarr`) the reader returns `Exception: Could not find first well`. Reading a specific well from the HCS file works fine (e.g. `s3://bucket/file.ome.zarr/A/2/0`).

The issue seems to be here, where `self.zarr.create(self.well_paths[0])` resolves to `os.getcwd() + "/" + self.well_paths[0]` instead of `"s3://bucket/file.ome.zarr/" + self.well_paths[0]`.
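A minimal sketch of the string-based join this diagnosis implies, along the lines of option 3 above (the function name is hypothetical, not the actual ome-zarr-py API):

```python
def join_subpath(base: str, part: str) -> str:
    # plain string handling keeps the scheme and base intact for both
    # local paths and cloud URIs, unlike urljoin or pathlib.Path
    return base.rstrip("/") + "/" + part

print(join_subpath("s3://bucket/file.ome.zarr", "A/2/0"))
# -> s3://bucket/file.ome.zarr/A/2/0
print(join_subpath("/local/dir/file.ome.zarr/", ".zattrs"))
# -> /local/dir/file.ome.zarr/.zattrs
```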
Steps to reproduce

Create a new directory and save the following to `main.tf`:

(code)

Update the two variables, then run in the terminal. Later, to destroy the infrastructure, run:

(code)

Update `path` to reflect the name of the bucket created, then execute the following code to generate the file:

(code)
Error

Full error (flattened; the traceback is truncated at the end):

```shell
---------------------------------------------------------------------------
Exception                                 Traceback (most recent call last)
File [napari/components/viewer_model.py:1092], in ViewerModel.open(self, path, stack, plugin, layer_type, **kwargs) 1089 _path = [_path] if not isinstance(_path, list) else _path 1090 if plugin: 1091 added.extend( -> 1092 self._add_layers_with_plugins( 1093 _path, 1094 kwargs=kwargs, 1095 plugin=plugin, 1096 layer_type=layer_type, 1097 stack=_stack, 1098 ) 1099 ) 1100 # no plugin choice was made 1101 else: 1102 layers = self._open_or_raise_error( 1103 _path, kwargs, layer_type, _stack 1104 )
File [napari/components/viewer_model.py:1292], in ViewerModel._add_layers_with_plugins(self, paths, stack, kwargs, plugin, layer_type) 1290 else: 1291 assert len(paths) == 1 -> 1292 layer_data, hookimpl = read_data_with_plugins( 1293 paths, plugin=plugin, stack=stack 1294 ) 1296 # glean layer names from filename. These will be used as *fallback* 1297 # names, if the plugin does not return a name kwarg in their meta dict. 1298 filenames = []
File [napari/plugins/io.py:77], in read_data_with_plugins(paths, plugin, stack) 74 assert len(paths) == 1 75 hookimpl: Optional[HookImplementation] ---> 77 res = _npe2.read(paths, plugin, stack=stack) 78 if res is not None: 79 _ld, hookimpl = res
File [napari/plugins/_npe2.py:63], in read(paths, plugin, stack) 61 npe1_path = paths[0] 62 try: ---> 63 layer_data, reader = io_utils.read_get_reader( 64 npe1_path, plugin_name=plugin 65 ) 66 except ValueError as e: 67 # plugin wasn't passed and no reader was found 68 if 'No readers returned data' not in str(e):
File [npe2/io_utils.py:66], in read_get_reader(path, plugin_name, stack) 62 if stack is None: 63 # "npe1" old path 64 # Napari 0.4.15 and older, hopefully we can drop this and make stack mandatory 65 new_path, new_stack = v1_to_v2(path) ---> 66 return _read( 67 new_path, plugin_name=plugin_name, return_reader=True, stack=new_stack 68 ) 69 else: 70 assert isinstance(path, list)
File [npe2/io_utils.py:165], in _read(paths, stack, plugin_name, return_reader, _pm) 160 read_func = rdr.exec( 161 kwargs={"path": paths, "stack": stack, "_registry": _pm.commands} 162 ) 163 if read_func is not None: 164 # if the reader function raises an exception here, we don't try to catch it --> 165 if layer_data := read_func(paths, stack=stack): 166 return (layer_data, rdr) if return_reader else layer_data 168 if plugin_name:
File [npe2/manifest/contributions/_readers.py:60], in ReaderContribution.exec.
```

Versions tested

ome-zarr = 0.8.2 and 0.8.3.dev0
napari-ome-zarr = 0.5.2
python = 3.10.12
FYI, I first mentioned this error here.