When files are served over the network, the server must encode certain characters using percent-encoding (RFC 3986, section 2.2).
When Zarr opens a dataset from a URL, the keys are taken from the percent-encoded file names without being decoded.
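For illustration, the encoding in question can be reproduced with Python's standard `urllib.parse`. This is only a minimal sketch of percent-encoding itself, not Zarr or server code; the key `a+b` is the one used in the reproduction steps.

```python
from urllib.parse import quote, unquote

key = "a+b"

# Percent-encode every reserved character (safe="" leaves none unescaped).
encoded = quote(key, safe="")

# Decoding recovers the original key, which is what the group should report.
decoded = unquote(encoded)

print(encoded)  # a%2Bb
print(decoded)  # a+b
```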
Steps to reproduce
Create a dataset whose key contains any of the RFC 3986 reserved characters: : / ? # [ ] @ ! $ & ' ( ) * + , ; =
Here, the array key contains +.
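import numpy as np
import zarr
# Open (or create) a group backed by a local directory.
g = zarr.open_group("dataset.zarr")
# The array name contains "+", one of the reserved characters.
g.create_dataset(name="a+b", data=np.eye(3))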
Serve the dataset with a local server. Go into the directory where you saved the data and run:
python -m http.server
In a web browser you can confirm that the link URLs in the directory listing are correctly percent-encoded, while the file names displayed in the listing are decoded.
Try reading the dataset from a URL:
>>> g = zarr.open("http://0.0.0.0:8000/dataset.zarr/")
>>> list(g.keys())
['a%2Bb']
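The expected result is `['a+b']`. As a rough sketch of the missing step, names obtained from an HTTP listing could be decoded before being used as keys. The helper `decode_keys` below is hypothetical and is not part of Zarr or fsspec; it only shows the decoding that appears to be skipped.

```python
from urllib.parse import unquote


def decode_keys(listed_names):
    """Hypothetical helper: percent-decode names taken from an HTTP listing."""
    # Strip the trailing slash that directory entries carry, then decode.
    return [unquote(name.rstrip("/")) for name in listed_names]


print(decode_keys(["a%2Bb/"]))  # ['a+b']
```

`unquote` is used rather than `unquote_plus` because the latter would also turn a literal `+` into a space, which would corrupt a key such as `a+b`.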
Zarr version
v2.17.1
Numcodecs version
v0.12.1
Python Version
3.10
Operating System
Linux
Installation
Using pip into a conda environment
Additional output
No response