bopen / c3s-eqc-toolbox-template

CADS Toolbox template application

Problem using download_and_transform with split_all = True #30

Closed: johtoblan closed this issue 1 year ago

johtoblan commented 1 year ago

We have been using split_all = False until now, to bulk download and avoid the error mentioned earlier in issue #14.

from c3s_eqc_automatic_quality_control import download
collection_idh = "seasonal-monthly-single-levels"
request = {
    "format":            "grib",
    "originating_centre":"cmcc",
    "system":            "35",
    "variable":          "2m_temperature",
    "product_type":      "monthly_mean",
    "year":              [str(year) for year in range(1993, 2017)],
    "leadtime_month":    "2",
    "month":             "01",
}
dsh = download.download_and_transform(
    collection_idh,
    request,
    split_all=True,
)

This works for the 0th request (Welcome, sent request, queued, and finally downloaded):

2023-03-22 20:14:34,755 INFO Downloading https://download-0005-clone.copernicus-climate.eu/cache-compute-0005/cache/data1/adaptor.mars.external-1679516026.1629727-14434-5-c335d4b5-f533-4048-8564-ef236e78220a.grib to adaptor.mars.external-1679516026.1629727-14434-5-c335d4b5-f533-4048-8564-ef236e78220a.grib (7.4M)

But after the 1st request is queued, we get an error with this exception report:

Unexpected exception formatting exception. Falling back to standard exception
Traceback (most recent call last):
  File "/data/common/mambaforge/envs/wp3/lib/python3.10/site-packages/IPython/core/interactiveshell.py", line 3460, in run_code
  File "/tmp/ipykernel_13227/1025255193.py", line 1, in <module>
    dsh = download.download_and_transform(
  File "/data/common/mambaforge/envs/wp3/lib/python3.10/site-packages/c3s_eqc_automatic_quality_control/download.py", line 443, in download_and_transform
  File "/data/common/mambaforge/envs/wp3/lib/python3.10/site-packages/c3s_eqc_automatic_quality_control/download.py", line 354, in _download_and_transform_requests
  File "/data/common/mambaforge/envs/wp3/lib/python3.10/site-packages/c3s_eqc_automatic_quality_control/download.py", line 293, in get_sources
  File "/data/common/mambaforge/envs/wp3/lib/python3.10/site-packages/cads_toolbox/catalogue.py", line 79, in data
  File "/data/common/mambaforge/envs/wp3/lib/python3.10/site-packages/cads_toolbox/catalogue.py", line 68, in download
  File "/data/common/mambaforge/envs/wp3/lib/python3.10/site-packages/cacholote/cache.py", line 104, in wrapper
  File "/data/common/mambaforge/envs/wp3/lib/python3.10/site-packages/cads_toolbox/catalogue.py", line 36, in _download
  File "/data/common/mambaforge/envs/wp3/lib/python3.10/site-packages/cdsapi/api.py", line 364, in retrieve
  File "/data/common/mambaforge/envs/wp3/lib/python3.10/site-packages/cdsapi/api.py", line 500, in _api
  File "/data/common/mambaforge/envs/wp3/lib/python3.10/site-packages/cdsapi/api.py", line 617, in wrapped
  File "/data/common/mambaforge/envs/wp3/lib/python3.10/site-packages/requests/sessions.py", line 600, in get
  File "/data/common/mambaforge/envs/wp3/lib/python3.10/site-packages/requests/sessions.py", line 587, in request
  File "/data/common/mambaforge/envs/wp3/lib/python3.10/site-packages/requests/sessions.py", line 701, in send
  File "/data/common/mambaforge/envs/wp3/lib/python3.10/site-packages/requests/adapters.py", line 460, in send
  File "/data/common/mambaforge/envs/wp3/lib/python3.10/site-packages/requests/adapters.py", line 263, in cert_verify
OSError: Could not find a suitable TLS CA certificate bundle, invalid path: /data/common/mambaforge/envs/wp3/lib/python3.10/site-packages/certifi/cacert.pem

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/data/common/mambaforge/envs/wp3/lib/python3.10/site-packages/pygments/styles/__init__.py", line 82, in get_style_by_name
ModuleNotFoundError: No module named 'pygments.styles.default'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/data/common/mambaforge/envs/wp3/lib/python3.10/site-packages/IPython/core/interactiveshell.py", line 2057, in showtraceback
  File "/data/common/mambaforge/envs/wp3/lib/python3.10/site-packages/IPython/core/ultratb.py", line 1288, in structured_traceback
  File "/data/common/mambaforge/envs/wp3/lib/python3.10/site-packages/IPython/core/ultratb.py", line 1177, in structured_traceback
  File "/data/common/mambaforge/envs/wp3/lib/python3.10/site-packages/IPython/core/ultratb.py", line 1030, in structured_traceback
  File "/data/common/mambaforge/envs/wp3/lib/python3.10/site-packages/IPython/core/ultratb.py", line 935, in format_exception_as_a_whole
  File "/data/common/mambaforge/envs/wp3/lib/python3.10/site-packages/IPython/core/ultratb.py", line 986, in get_records
  File "/data/common/mambaforge/envs/wp3/lib/python3.10/site-packages/pygments/styles/__init__.py", line 84, in get_style_by_name
pygments.util.ClassNotFound: Could not find style module 'default', though it should be builtin.
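
The root OSError says requests can no longer find certifi's CA bundle at that path, and the pygments failure points the same way: files have gone missing from the environment's site-packages (which can happen if the environment is updated while a kernel is still running). A minimal sketch to check the bundle (the alternative bundle path below is just an example; it varies by system):

import os

import certifi

# Path where requests expects the CA bundle, and whether it actually exists
bundle = certifi.where()
print(bundle, os.path.exists(bundle))

# If it is missing, pointing requests at a system bundle is a possible
# workaround until the environment is rebuilt (path is an example):
# os.environ["REQUESTS_CA_BUNDLE"] = "/etc/ssl/certs/ca-certificates.crt"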

We also have problems specifying area with split_all = True, but we do not have an error message for that right now, due to the queue.

johtoblan commented 1 year ago

This is for the wp3 environment on the VM.

johtoblan commented 1 year ago

Here is the traceback for the same request with split_all = False:

---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
Cell In[1], line 15
      3 collection_id = "seasonal-monthly-single-levels"
      4 request = {
      5     "format":            "grib",
      6     "originating_centre":"cmcc",
   (...)
     12     "month":             "01",
     13 }
---> 15 dsh = download.download_and_transform(
     16     collection_id,
     17     request,
     18     split_all=False,
     19 )

File /data/common/mambaforge/envs/wp3/lib/python3.10/site-packages/c3s_eqc_automatic_quality_control/download.py:450, in download_and_transform(collection_id, requests, chunks, split_all, transform_func, transform_func_kwargs, transform_chunks, logger, **open_mfdataset_kwargs)
    447     request_list.extend(split_request(request, chunks, split_all))
    449 if not transform_chunks or transform_func is None:
--> 450     ds = download_and_transform_requests(
    451         collection_id,
    452         tqdm.tqdm(request_list),
    453         transform_func,
    454         transform_func_kwargs,
    455         **open_mfdataset_kwargs,
    456     )
    457 else:
    458     # Cache each chunk separately
    459     sources = []

File /data/common/mambaforge/envs/wp3/lib/python3.10/site-packages/c3s_eqc_automatic_quality_control/download.py:379, in _download_and_transform_requests(collection_id, request_list, transform_func, transform_func_kwargs, **open_mfdataset_kwargs)
    377         raise TypeError(f"`emohawk` returned {type(ds)} instead of a xr.Dataset")
    378 else:
--> 379     ds = xr.open_mfdataset(sources, **open_mfdataset_kwargs)
    381 if transform_func is not None:
    382     ds = transform_func(ds, **transform_func_kwargs)

File /data/common/mambaforge/envs/wp3/lib/python3.10/site-packages/xarray/backends/api.py:986, in open_mfdataset(paths, chunks, concat_dim, compat, preprocess, engine, data_vars, coords, combine, parallel, join, attrs_file, combine_attrs, **kwargs)
    984 closers = [getattr_(ds, "_close") for ds in datasets]
    985 if preprocess is not None:
--> 986     datasets = [preprocess(ds) for ds in datasets]
    988 if parallel:
    989     # calling compute here will return the datasets/file_objs lists,
    990     # the underlying datasets will still be stored as dask arrays
    991     datasets, closers = dask.compute(datasets, closers)

File /data/common/mambaforge/envs/wp3/lib/python3.10/site-packages/xarray/backends/api.py:986, in <listcomp>(.0)
    984 closers = [getattr_(ds, "_close") for ds in datasets]
    985 if preprocess is not None:
--> 986     datasets = [preprocess(ds) for ds in datasets]
    988 if parallel:
    989     # calling compute here will return the datasets/file_objs lists,
    990     # the underlying datasets will still be stored as dask arrays
    991     datasets, closers = dask.compute(datasets, closers)

File /data/common/mambaforge/envs/wp3/lib/python3.10/site-packages/c3s_eqc_automatic_quality_control/download.py:317, in _preprocess(ds, collection_id, preprocess)
    314 ds = cgul.harmonise(ds)
    316 # TODO: workaround: sometimes single timestamps are squeezed
--> 317 if "time" not in ds.cf.dims:
    318     if "forecast_reference_time" in ds.cf:
    319         ds = ds.cf.expand_dims("forecast_reference_time")

File /data/common/mambaforge/envs/wp3/lib/python3.10/site-packages/cf_xarray/accessor.py:1394, in CFAccessor.__getattr__(self, attr)
   1393 def __getattr__(self, attr):
-> 1394     return _getattr(
   1395         obj=self._obj,
   1396         attr=attr,
   1397         accessor=self,
   1398         key_mappers=_DEFAULT_KEY_MAPPERS,
   1399         wrap_classes=True,
   1400     )

File /data/common/mambaforge/envs/wp3/lib/python3.10/site-packages/cf_xarray/accessor.py:656, in _getattr(obj, attr, accessor, key_mappers, wrap_classes, extra_decorator)
    654     for name in inverted[key]:
    655         if name in newmap:
--> 656             raise AttributeError(
    657                 f"cf_xarray can't wrap attribute {attr!r} because there are multiple values for {name!r}. "
    658                 f"There is no unique mapping from {name!r} to a value in {attr!r}."
    659             )
    660     newmap.update(dict.fromkeys(inverted[key], value))
    661 newmap.update({key: attribute[key] for key in unused_keys})

AttributeError: cf_xarray can't wrap attribute 'dims' because there are multiple values for 'vertical'. There is no unique mapping from 'vertical' to a value in 'dims'.
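
For reference, a quick way to see which names cf_xarray is mapping to 'vertical' (a sketch; ds here is one of the datasets opened from the downloaded files):

import cf_xarray  # noqa: F401 - registers the .cf accessor

# The repr lists axes and coordinates; duplicate entries under "vertical"
# are what make ds.cf.dims ambiguous
print(ds.cf)

# Or list the vertical candidates directly
print(ds.cf.coordinates.get("vertical"))
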
malmans2 commented 1 year ago

Hi @johtoblan,

I'm on it, looks like a bug indeed but also an easy fix. I'm currently also working on another couple of improvements for the cache, so be aware that later today I might have to cleanup the cache of the VM. But I will re-populate it with the request you sent in the first message.

I'll shoot you a message here when I'll start cleaning up the cache and when I'm done.

malmans2 commented 1 year ago

Hi @johtoblan,

I tried a couple of years and couldn't reproduce the issue. Now I'm trying all the years you had in your request, but it looks like the problem originates from one or more years that are not consistent with each other.

I was hoping to find the data already cached on WP3, but that's not the case. It's taking a long time to download. Did you run that exact request? If not, could you please send me the request you've been running?

(Still, I think it's a bug the way our code is failing.)

johtoblan commented 1 year ago

Hi @malmans2, I see now that the original request was with leadtime = "2", but all the other parameters described above were the same. Our experience has also been that the error is a bit hard to reproduce.

malmans2 commented 1 year ago

Hi, I'm a little confused. What is the request that returned this error?

> Here is the traceback for the same request with split_all = False
>
> [traceback identical to the one quoted in full above, ending in:]
>
> AttributeError: cf_xarray can't wrap attribute 'dims' because there are multiple values for 'vertical'. There is no unique mapping from 'vertical' to a value in 'dims'.

If I run the code you have in the first comment, it looks like the download has not been cached, and I can't reproduce the error right away.

annemo1976 commented 1 year ago

Hi

I hope it is OK that I answer this, Johannes? Johannes and I looked at this together earlier today. Did you use all 24 years from 1993 to 2016 with split_all=True? Usually I get the error from the first comment after a while: the first 5-6 downloads in the request work fine, and then at e.g. 5/24 I get the error above. When this error occurs, I have been stuck in the queue for a while. When I use split_all=False I get the AttributeError, except when I am using leadtime_month="1"; leadtime_month="1" seems to work fine.

The request that gave the error in the first comment was:

import warnings
from c3s_eqc_automatic_quality_control import diagnostics, download, plot
warnings.filterwarnings("ignore")
import cartopy.crs as ccrs
import matplotlib.pyplot as plt
import numpy as np
from matplotlib import cm

centre           = "cmcc" # Data centre
system           = "35"   # Model version
variable_long    = "2m_temperature" # Variable name in download request
variable_short   = "t2m" # Variable name after download
reanalysis_month = "02" # Must correspond with forecast month + leadtime
reanalysis_year  = [str(year) for year in range(1993, 2017)] # Reanalysis years
hindcast_year    = [str(year) for year in range(1993, 2017)] # Hindcast years
hindcast_month   = "02" # Model start month
leadtime         = "1" # Leadtime month
area_name        = "global" # Name of area
area_coordinates = [89.5,-179.5,-89.5,179.5] # Area [maxlat,minlon,minlat,maxlon]

collection_idh = "seasonal-monthly-single-levels"
requesth = {
    "format":            "grib",
    "originating_centre": centre,
    "system":             system,
    "variable":           variable_long,
    "product_type":      "monthly_mean",
    "year":               hindcast_year,
    "leadtime_month":     leadtime,
    "month":              hindcast_month,
}

dsh = download.download_and_transform(
    collection_idh,
    requesth,
    split_all=True,
)
malmans2 commented 1 year ago

Hi there,

The CDS queue is very long right now. I've been able to download a dozen of the datasets using the code in the first comment; so far, so good. I somehow need to reproduce the error to debug it, but I'm on it!

I've added a few improvements to the cache (things like the order or types in the request dict don't matter anymore) and a couple of more informative errors, so at least we'll have a better clue of what's going on. I'm going to wipe the cache and update the environments on the VM tonight; hopefully things will be a bit smoother from now on.

(BTW, if you are all using the same credentials, I suggest that everyone start using their personal cdsapirc. Maybe it will help with the queueing time. See the top cell here: https://github.com/bopen/c3s-eqc-toolbox-template/blob/main/notebooks/01-Application_Template_Overview.ipynb)
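
For reference, a personal ~/.cdsapirc is just two lines; the UID and API key are shown on your CDS profile page (the values below are placeholders):

url: https://cds.climate.copernicus.eu/api/v2
key: <UID>:<API-KEY>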

malmans2 commented 1 year ago

~@annemo1976 sorry I didn't realise it right away, but in the request you shared you can not use split_all.~

~split_all splits all parameters that are iterables (but not strings) into single requests. area_coordinates must stay a list, though; I'm actually surprised the CDS allowed the request. So in your case you need to pass chunks explicitly, for example chunks={"reanalysis_year": 1, "hindcast_year": 1}~

~Anyway, I'm clearing the cache in a bit and I will run the request overnight. Hopefully we'll wake up with those requests cached.~

Oops, never mind: area_coordinates was not actually used in the request.

annemo1976 commented 1 year ago

Thank you for looking into it. After 6 hours I am at 11/24 and it is still running; I have never got this far before :-) Is it possible to download the hindcast data in only one request, as I did last week? Last week I could use split_all=False without getting the AttributeError, and I got all the hindcast data in one request; it took only 15-30 min.
Now I also understand why area only works when I use split_all=False :-)

malmans2 commented 1 year ago

Hi @johtoblan and @annemo1976,

Good and baddish news.

Let's start with the good one: I ran a script overnight to populate the cache using concurrent calls, and you now have some data available. Here is how you can open the dataset:

from c3s_eqc_automatic_quality_control import download

year_start = 1993
year_stop = 2016

collection_id = "seasonal-monthly-single-levels"
request = {
    "year": [str(year) for year in range(year_start, year_stop + 1)],
    "originating_centre": "cmcc",
    "system": "35",
    "variable": "2m_temperature",
    "product_type": "monthly_mean",
    "month": [f"{month:02d}" for month in range(1, 12 + 1)],
    "leadtime_month": ["1"],
    "format": "grib",
}

xr_open_mfdataset_kwargs = {
    "concat_dim": "forecast_reference_time",
    "combine": "nested",
    "parallel": True,
}
ds = download.download_and_transform(
    collection_id,
    request,
    chunks={"year": 1, "leadtime_month": 1},
    **xr_open_mfdataset_kwargs,
)

You'll notice that in your case I had to pass a few arguments to xarray. Aside from parallel, which just makes parsing the metadata a little faster, concat_dim and combine are needed because the downloaded datasets are a little weird (but I don't know a lot about this dataset, maybe it's all good). What's happening is that there's a leadtime dimension. Each timestamp has only one leadtime, but because it's not always the same one, the dimension has size 4. Things get even more complicated because each year has 3 unique leadtimes, so the actual dimension in the raw data is 3.

Let's see if this explains better what I mean:

ds.sizes
Frozen({'realization': 40, 'leadtime': 4, 'latitude': 180, 'longitude': 360, 'forecast_reference_time': 288})
# Compute where values are all nans
da = ds["t2m"].isnull().all(set(ds.dims) - {"forecast_reference_time", "leadtime"}).compute()
da.attrs["long_name"] = "0: valid values; 1: all NaNs"
# 3 leadtime out of 4 are always nan
set(da.sum("leadtime").values)
{3}
da.plot(row="leadtime", marker=".", ls="none")

[figure output_4_1: da plotted per leadtime row; 3 of the 4 leadtime rows are all NaNs for every forecast_reference_time]

This is very inefficient, as 75% of the dataset is NaNs. To me, leadtime should just be a 1D coordinate along the forecast_reference_time dimension.
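
For instance, since exactly one leadtime slice per timestamp holds valid values, something along these lines should collapse the dimension (a sketch reusing ds and the mask da from above; not tested against this dataset):

# Keep the single valid slice per forecast_reference_time and turn
# leadtime into a 1D coordinate along forecast_reference_time
leadtime_1d = ds["leadtime"].where(~da).max("leadtime")
collapsed = ds.max("leadtime", keep_attrs=True)  # skipna picks the valid slice
collapsed = collapsed.assign_coords(leadtime=leadtime_1d)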

malmans2 commented 1 year ago

Do you know if the issue only shows up with this chunking (yearly data with just one leadtime)? In your case we should optimise the chunking as much as possible, so everyone uses the same cached data.

Was this request just a test? What kind of data do you need for your use case (more years? more leadtimes? more variables?...)

Let me know!

annemo1976 commented 1 year ago

Hi Mattia

Unfortunately, the different number of days in leadtime makes it more difficult. When downloading directly from the API, it is possible to use the following command when opening the dataset in xarray:

ds = xr.open_dataset(filename, engine='cfgrib', backend_kwargs=dict(time_dims=('forecastMonth', 'time')))

This converts the leadtime from days to forecastMonth (more info here: https://ecmwf-projects.github.io/copernicus-training-c3s/sf-anomalies.html). Could this be a possibility for the toolbox also? If not, I understand that I need to split the download into multiple requests. Splitting it up is only a problem because it takes a very long time in the queue, and we will bring this up as an issue at the meeting in Rome. The code you provided above works fine, so thank you very much :-) I will also start to use chunks.

The request we sent was part of a notebook to compare the uncertainty of a forecast against climatology, where the data centre, model version, forecast year, month, leadtime, variable, and area are given at the top of the notebook and can easily be changed. Here is the input to the notebook:

centre           = "cmcc" # Data centre
system           = "35"   # Model version
variable_long    = "2m_temperature" # Variable name in download request
forecast_year    = "2023" # Forecast year
hindcast_year    = [str(year) for year in range(1993, 2017)] # Hindcast years
month            = "01" # Model start month
leadtime         = "1" # Leadtime month
area_name        = "global" # Name of area
area_coordinates = [89.5,-179.5,-89.5,179.5] # Area [maxlat,minlon,minlat,maxlon]

The notebook only uses data from one month and one leadtime, but needs 24 years of data to calculate the mean uncertainty from the hindcast years 1993-2016.

Best regards, Anne-Mette

malmans2 commented 1 year ago

Good news! I think the kwargs you provided work as expected. So here is the final code to deal with this dataset:

from c3s_eqc_automatic_quality_control import download

year_start = 1993
year_stop = 2016

collection_id = "seasonal-monthly-single-levels"
request = {
    "year": [str(year) for year in range(year_start, year_stop + 1)],
    "originating_centre": "cmcc",
    "system": "35",
    "variable": "2m_temperature",
    "product_type": "monthly_mean",
    "month": [f"{month:02d}" for month in range(1, 12 + 1)],
    "leadtime_month": ["1"],
    "format": "grib",
}

ds = download.download_and_transform(
    collection_id,
    request,
    chunks={"year": 1, "leadtime_month": 1},
    backend_kwargs={"time_dims": ('forecastMonth', 'time')},
)
print(ds.dims)
Frozen({'realization': 40, 'forecast_reference_time': 288, 'latitude': 180, 'longitude': 360})
malmans2 commented 1 year ago

> Could this be a possibility for the toolbox also? If not, I understand that I need to split the download into multiple requests. Splitting it up is only a problem because it takes a very long time in the queue, and we will bring this up as an issue at the meeting in Rome.

Yes, let's talk about it next week. But here is the general idea: if we coordinate well, the downside of long downloading times should be negligible once things are more stable. If everyone uses the same chunking in WP3, they will find most of the data they need already cached, and it's just a matter of downloading new data as it becomes available. This is why I'd suggest only using the chunking in the code above.

If you need more/different data, just send me the request and I will run the scripts. As I briefly mentioned, I now have scripts to quickly populate the cache (also useful in case we need to clear it). In the future, ECMWF might also assign us a priority user.
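
To make the "same chunking" idea concrete: chunks expands one big request into one small request per combination, and each small request becomes its own cache entry, so everyone asking for overlapping data hits the same entries. A hypothetical sketch (not the toolbox's actual split_request, and assuming chunk size 1):

import itertools

def split(request, chunks):
    # Split every chunked key that holds a list into single-value requests
    keys = [k for k in chunks if isinstance(request.get(k), list)]
    for combo in itertools.product(*(request[k] for k in keys)):
        yield {**request, **dict(zip(keys, combo))}

request = {"year": ["1993", "1994"], "leadtime_month": ["1"], "month": "01"}
for r in split(request, {"year": 1, "leadtime_month": 1}):
    print(r)  # one cache entry per (year, leadtime_month) pair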

annemo1976 commented 1 year ago

Hi Mattia

Very nice that we can use backend_kwargs in the toolbox as well :-) By using the code above I can also use area, which interpolates the data to the correct grid. Chunking and caching of data seems like a good option; we can talk more about this next week in Rome. Today, downloading data from the CDS is much faster: what took forever yesterday takes only a couple of minutes today. Thank you very much for looking into this!! See you next week in Rome :-)

Anne-Mette

malmans2 commented 1 year ago

Hi WP3ers, I'm closing this as it looks like it's now fixed, and that makes it easier for us to track progress. Feel free to open new issues though!