Closed raphaeldussin closed 4 years ago
Using fsspec directly does work, although the incantation s a little strange:
temp_zip:
description: zarr zipstore
driver: zarr
args:
urlpath: 'zip://'
storage_options:
fo: 'temp_zip.zip'
consolidated: False
Just like with fsspec.open
, fsspec.get_mapper
ought to support compound URLs, in this case "zip::temp_zip.zip" (and no extra storage options), but this is not yet done.
Hi @martindurant and thanks for the help. I'm trying this out rn and I can make it work for consolidated ZipStore but not un-consolidated. Any idea about what the problem might be:
temp_zip_martin:
description: zarr zipstore
driver: zarr
args:
urlpath: 'zip://'
storage_options:
fo: 'temp_zip.zip'
consolidated: False
metadata:
origin_url: ''
the error is:
KeyError: "There is no item named '.zattrs/.zarray' in the archive"
my hunch is that it is blind to the difference between directories and files and expect to find a .zarray under .zattrs which is just a file.
what do you think?
The exact same thing did work for me, unconsolidated. Are you on windows by any chance?
my config is:
OSX Catalina anaconda python 3.6.7 intake v 0.5.5 fsspec v 0.6.2
Would you mind trying with fsspec master (or indeed the chained_mapper branch)?
I've tried both and can't get it to work either. weird...
can you post your test?
!cat catalog.yml
sources:
temp_zip:
description: zarr zipstore
driver: zarr
args:
urlpath: 'zip://'
storage_options:
fo: temp_zip.zip
consolidated: False
metadata:
origin_url: ''
ds = xr.Dataset()
ds['temp'] = xr.DataArray(data=np.random.rand(180, 360), dims=('lat', 'lon'))
ds['lon'] = xr.DataArray(data=np.arange(360), dims=('lon'))
ds['lat'] = xr.DataArray(data=np.arange(-90,90), dims=('lat'))
store = zarr.ZipStore('temp_zip.zip', mode='w')
ds.to_zarr(store, consolidated=False, mode='w')
store.close(). # seems to require kernel restart here, write is incomplete
cat = intake.open_catalog('catalog.yml')
cat.temp_zip.to_dask()
great! I'm gonna call docker to the rescue to make sure it's not due to something in my config. will get back to you soon
Perhaps if you show me how you are calling docker, I can follow it. Obviously, it would be good to test this kind of thing here, but fsspec should be released first. Are you aware of https://github.com/intake/intake-xarray/pull/66 ? That might make things more consistent.
@martindurant I'm going to create and point you to a github repo with my dockerfiles once I'm done with afternoon meetings. So far fsspec master works with the storage_option but your new branch (https://github.com/intake/filesystem_spec/pull/282) does not work with the new syntax.
fsspec master works with the storage_option
This is progress!
@martindurant docker tests here: https://github.com/raphaeldussin/docker_tests_zipstore.git
The chained version doesn't seem to install the right version
git+https://github.com/martindurant/filesystem_spec.git#chained_mapper
"@"
?In my case, to test, I did extra steps to be certain
git clone https://github.com/martindurant/filesystem_spec
cd filesystem_spec
git checkout chained_mapper
pip3 install -e
cd ..
(and it worked)
@ and # keys are too close for my own good... this works for me as well now.
the change to fsspec is enough for my application. This PR has no reason to be anymore. many thanks @martindurant
intake-xarray is presently not able to work with zarr.ZipStore. The issue seems to be coming from the fsspec mapping. From what I understood of fsspec mappings, I am not sure if it is able to handle local zip files or if it's desirable. The proposed fix adds an option to bypass the mapper and use the local path to the file.
problem and solution can be tested with: