Closed aaronspring closed 4 years ago
I really don't know the mechanism of netcdf/dods (or opendap). Does xarray or its backend treat such a URL specially? Accessing your URL doesn't actually download any data; it just shows an HTML page about the data, so I guess something else is happening. Caching in Intake, whether the "old" version you are trying to use here or the new version in the fsspec layer, needs the actual URL of the data, so that you can download it and point to the local copy either by path or by file object.
ds = xr.open_dataset('http://iridl.ldeo.columbia.edu/SOURCES/.Models/.SubX/.CESM/.46LCESM1/.hindcast/.ua/dods')
ds.dims
Frozen(SortedKeysDict({'S': 887, 'M': 10, 'X': 360, 'L': 45, 'Y': 181, 'P': 2}))
This works, but I don't understand why, because I also only get this HTML page in the browser.
OK, so without a URL pointing to a file (ending in .nc or similar), caching wouldn't work.
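For context on why the same URL returns HTML in a browser but data in xarray: as far as I understand the OPeNDAP (DAP2) convention, a client does not fetch the base URL itself. It appends suffixes such as `.dds` (structure), `.das` (attributes) and `.dods` (binary data, optionally with a constraint expression) and requests those instead. The sketch below only builds those URLs. The helper names `dap2_urls` and `fetch_dds` are made up for illustration, and that this particular server answers at exactly these suffixes is an assumption.

```python
from urllib.request import urlopen

# Dataset URL quoted in this thread.
BASE = "http://iridl.ldeo.columbia.edu/SOURCES/.Models/.SubX/.CESM/.46LCESM1/.hindcast/.ua/dods"


def dap2_urls(base, constraint=""):
    """Return the DAP2 request URLs a client derives from one base URL.

    `constraint` is an optional DAP2 constraint expression (hypothetical
    example: "ua[0:1]"); it is appended as the query string.
    """
    query = f"?{constraint}" if constraint else ""
    return {
        "dds": f"{base}.dds{query}",    # structure: variables, dimensions, types
        "das": f"{base}.das",           # attributes: units, long names, ...
        "dods": f"{base}.dods{query}",  # binary data for the requested subset
    }


def fetch_dds(base):
    """Download the plain-text structure description (needs network access)."""
    with urlopen(dap2_urls(base)["dds"]) as response:
        return response.read().decode("utf-8", errors="replace")


# Usage (not run here because it contacts the server):
# print(fetch_dds(BASE))
```

Because every read is a separate request for a slice, there is no single file URL that a download-based cache could copy.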
You could maybe use .persist() or .export() as convenient ways to transform the data to zarr and save it locally. Not what you were after...
I was naively hoping for intake to do that, but I can easily build my own caching system here.
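A minimal sketch of what such a hand-rolled cache could look like, assuming the subset is small enough to store as a local netCDF file. The names `cache_path`, `open_cached` and `CACHE_DIR` are hypothetical, and the selection is given as integer index lists per dimension so that it can be hashed into a file name.

```python
import hashlib
import json
from pathlib import Path

# Hypothetical cache location; change as needed.
CACHE_DIR = Path.home() / ".cache" / "subx"


def cache_path(url, selection, cache_dir=CACHE_DIR):
    """Map a URL plus an index selection to a stable local file name."""
    # sort_keys makes the key independent of the dict's insertion order.
    key = json.dumps({"url": url, "sel": selection}, sort_keys=True)
    digest = hashlib.sha256(key.encode("utf-8")).hexdigest()[:16]
    return Path(cache_dir) / f"{digest}.nc"


def open_cached(url, selection, cache_dir=CACHE_DIR):
    """Open the local copy if present; otherwise subset remotely and save it."""
    import xarray as xr  # imported here so cache_path works without xarray

    path = cache_path(url, selection, cache_dir)
    if not path.exists():
        path.parent.mkdir(parents=True, exist_ok=True)
        with xr.open_dataset(url) as remote:  # lazy: no array data is read yet
            subset = remote.isel(selection)   # still lazy, e.g. {"S": [0, 1]}
            subset.to_netcdf(path)            # transfers and writes only the subset
    return xr.open_dataset(path)


# Usage (not run here because it contacts the server):
# url = "http://iridl.ldeo.columbia.edu/SOURCES/.Models/.SubX/.CESM/.46LCESM1/.hindcast/.ua/dods"
# ds = open_cached(url, {"S": [0, 1], "L": [0]})
```

Subsetting before writing keeps the transfer small, which matches the goal of selecting some `S` or `L` first and caching afterwards. An interrupted write would leave a partial file behind, so a more careful version might write to a temporary name and rename it on success.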
Probably someone at xarray can think about how to do this automatically.
I want to process forecasts from iridl.ldeo.columbia.edu/. Many models, many variables, forecast or hindcast, all follow the same URL pattern. Data is stored on DODS, like http://iridl.ldeo.columbia.edu/SOURCES/.Models/.SubX/.CESM/.46LCESM1/.hindcast/.ua/dods
I built a catalog, and it works. For repeated use, I want to use intake caching, but this fails.
SubX.yml:
The goal would be to first subset some S or L and then cache.
When I uncomment the cache lines, it does not work. I browsed the docs for intake and intake-xarray, but except for the examples I didn't find much information about how to use caching.
Question: Is the combination of DODS netcdf and caching even theoretically possible? If so, any suggestions for how to configure cache in the catalog?
I checked the code base of intake-xarray, and it seems like all the caching is inherited from intake. However, I hoped to find an answer to this question here, because I think it's more connected to netcdf and DODS.