Working with index files and xr.open_mfdataset in a location without write permission

ecmwf / cfgrib

A Python interface to map GRIB files to the NetCDF Common Data Model following the CF Convention using ecCodes

Apache License 2.0


Working with index files and xr.open_mfdataset in a location without write permission #275

Open matzech opened 2 years ago

matzech commented 2 years ago

I work with multiple GRIB files in a location without write permission. In locations with write permission, I usually open them with open_mfdataset from xarray, like this:

ds = xr.open_mfdataset("201506/*.grib", engine="cfgrib")

I would like to keep using index files, so I tried the approach proposed in #126, but the short_hash turns out to be the same for all of the files when used like this:

ds = xr.open_mfdataset("201506/*.grib", engine="cfgrib", backend_kwargs={"indexpath": "mygrib.{short_hash}.idx"})

Is there another solution?

alexamici commented 2 years ago

@matzech you may pass {"indexpath": ""} to disable writing the index, or you may try something along the lines of (untested):

ds = xr.open_mfdataset("*.grib", engine="cfgrib", backend_kwargs={"indexpath": 'writable_folder{path}.{short_hash}.idx'})
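
For illustration, here is a minimal sketch of how such a template expands, assuming the placeholders are filled in with Python's str.format, as the {path} and {short_hash} keys suggest. The file paths, the hash value, the /tmp/idx folder, and the helper expand_indexpath are all hypothetical; this is not cfgrib's own code.

```python
def expand_indexpath(template: str, path: str, short_hash: str) -> str:
    """Fill an indexpath template using str.format (assumed behaviour)."""
    return template.format(path=path, short_hash=short_hash)


# Hypothetical absolute paths of two GRIB files in a read-only folder.
files = ["/data/201506/a.grib", "/data/201506/b.grib"]

# Hypothetical hash value; the thread reports it is identical across files.
fake_hash = "abcde"

# Template without {path}: every file maps to the same index file name.
only_hash = [
    expand_indexpath("mygrib.{short_hash}.idx", f, fake_hash) for f in files
]

# Template with {path}: each file gets its own index file name.
with_path = [
    expand_indexpath("/tmp/idx{path}.{short_hash}.idx", f, fake_hash)
    for f in files
]

print(only_hash)
print(with_path)
```

Note that with an absolute {path}, the expanded name reproduces the original directory tree under the writable folder. Whether those subdirectories get created automatically is not confirmed in this thread, so they may need to exist beforehand.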

onnyyonn commented 2 years ago

Is it possible to modify the {path} so that it contains only the basename? Also, what keys are available other than {path} and {short_hash}?
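
One possible workaround for the basename question, sketched under assumptions: instead of relying on the template, open each file separately with an explicit index path built from its basename, then combine the results. The helper names index_path_for and open_with_basename_indexes and the index_dir argument are hypothetical, and combining with xr.combine_by_coords is an assumption that may not suit every set of files.

```python
import glob
import os


def index_path_for(grib_path: str, index_dir: str) -> str:
    """Build an index file path in index_dir from the GRIB file's basename."""
    base = os.path.basename(grib_path)
    return os.path.join(index_dir, base + ".idx")


def open_with_basename_indexes(pattern: str, index_dir: str):
    """Open every file matching pattern with its own index file, then combine.

    xarray is imported lazily so that the path helper above can be used
    without it. This function is an untested sketch.
    """
    import xarray as xr

    os.makedirs(index_dir, exist_ok=True)
    datasets = [
        xr.open_dataset(
            path,
            engine="cfgrib",
            backend_kwargs={"indexpath": index_path_for(path, index_dir)},
        )
        for path in sorted(glob.glob(pattern))
    ]
    return xr.combine_by_coords(datasets)
```

Files with the same basename in different folders would share an index file under this scheme, so it only fits cases where basenames are unique.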

FRidh commented 1 year ago

As suggested in pydata/xarray#6512, I think we should either write the index to a temporary file/folder by default or disable it by default. We cannot rely on the folder containing the file being writable, nor on the index file being accurate. The index file is purely an optimization. In my opinion, robustness is more important than performance.
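
Until the default changes, a user-side fallback along these lines can be sketched: keep the default index behaviour when the data folder is writable, and otherwise disable the index with {"indexpath": ""} as mentioned earlier in this thread. The helper backend_kwargs_for is hypothetical, and os.access is only a best-effort check (it reports every folder as writable when running as root, for example).

```python
import os


def backend_kwargs_for(data_dir: str) -> dict:
    """Return backend_kwargs for cfgrib depending on folder writability.

    Writable folder: empty dict, so the default index handling applies.
    Otherwise: disable index writing with an empty indexpath.
    """
    if os.path.isdir(data_dir) and os.access(data_dir, os.W_OK):
        return {}
    return {"indexpath": ""}


# Usage sketch (not executed here):
# ds = xr.open_mfdataset(
#     "201506/*.grib",
#     engine="cfgrib",
#     backend_kwargs=backend_kwargs_for("201506"),
# )
```

This trades index-based speed for robustness only where needed, which matches the priority argued for above.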