Closed rsignell closed 8 months ago
Of course, there are a number of PRs in flight to get the dask open time much closer to the non-dask one, so part of the answer is "wait".
However, .read() I think gives you the regular un-dask xarray object, with the usual lazy access on the variables.
I tried .read() and let it run for about 1 minute before killing it. Seemed like it was loading the data!
Mm, OK. Then you can do instead:
coawst.chunks = None
coawst.discover()
ds = coawst._ds
Tried it. Also takes 20s:
intake_catalog_url = 's3://usgs-coawst/useast_archive/coawst_useast.yml'
cat = intake.open_catalog(intake_catalog_url)
coawst = cat['COAWST_USEAST_Archive']
coawst.chunks = None
coawst.discover()
ds = coawst._ds
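For reuse, the workaround above can be wrapped in a small helper. This is only a sketch: `open_without_dask` is a hypothetical name, and it relies on the private `_ds` attribute exactly as suggested in this thread, which may not be present or stable across intake-xarray versions.

```python
def open_without_dask(source):
    """Return the dataset behind an intake source, following the thread's workaround.

    Assumption: `source` behaves like the intake-xarray source used above,
    i.e. it has a `chunks` attribute, a `discover()` method, and stores the
    opened dataset on the private attribute `_ds`.
    """
    source.chunks = None   # per the thread, intended to skip dask chunking
    source.discover()      # opens the dataset and populates `_ds`
    return source._ds      # private attribute, not a public API


# Usage with the catalog from this thread (not executed here):
# cat = intake.open_catalog(intake_catalog_url)
# ds = open_without_dask(cat['COAWST_USEAST_Archive'])
```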
This now takes about 1 s, so closing!
I have a kerchunked dataset that loads in about 20s if I use Dask, and about 1s if I don't:
When I want to use Intake to open into Xarray, I have always used to_dask() (Method 1):
I tried .to_chunked() and it took the same amount of time as .to_dask().
How can I specify Method 2 using Intake (and get the datasets opening in a few seconds instead of 15-30!)?
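To compare the open times of the different paths on equal footing, a small timing helper could be used. This is a generic sketch, not part of Intake; `time_call` is a hypothetical name, and the usage lines assume the `coawst` source object from this thread.

```python
import time


def time_call(fn, *args, **kwargs):
    """Call fn(*args, **kwargs) and return (result, elapsed_seconds)."""
    start = time.perf_counter()
    result = fn(*args, **kwargs)
    elapsed = time.perf_counter() - start
    return result, elapsed


# Usage (not executed here; `coawst` is the intake source from this thread):
# ds_dask, t_dask = time_call(coawst.to_dask)
# print(f"to_dask(): {t_dask:.1f} s")
```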