xcube-dev / xcube-cds

An xcube plugin to generate data cubes from the Climate Data Store (CDS) API
MIT License

Update xcube-cds to work with new CDS backend API #84

Closed pont-us closed 4 days ago

pont-us commented 1 month ago

The current CDS API servers are scheduled to be shut down on 2024-09-03. As of 2024-07-25, ECMWF has not yet deployed any production-grade replacement, but there is a beta version of the planned successor at https://cds-beta.climate.copernicus.eu/. xcube-cds will need to support this new version before the current version is shut down.

pont-us commented 3 weeks ago

The main difficulty is a change in how ERA5 requests are served. Previously, any ERA5 request to the cdsapi backend would produce a single NetCDF file. Now, some requests produce a Zip archive containing multiple NetCDF files. ECMWF tech support clarified this behaviour as follows:

…whenever a GRIB to NetCDF incompatibility is detected, the output NetCDF file will be split and delivered as a compressed zipped file (with the .zip extension). An example of incompatibility is requesting atmospheric and oceanic variables at the same time.

Here's a fairly minimal example demonstrating this behaviour with the new API:

import cdsapi

dataset = "reanalysis-era5-single-levels-monthly-means"
request = {
    'product_type': ['monthly_averaged_reanalysis'],
    'variable': ['2m_temperature', 'mean_wave_direction'],
    'year': ['2015'],
    'month': ['10'],
    'time': ['00:00'],
    'data_format': 'netcdf',
    'area': [1, -1, -1, 1]
}

client = cdsapi.Client()
client.retrieve(dataset, request).download()

This produces a Zip file containing two NetCDFs, each containing one of the requested variables. The NetCDFs have different resolutions, but this isn't always the case: sometimes a Zip is produced containing multiple NetCDFs with identical dimensions. The equivalent code for the old API is:

import cdsapi

dataset = "reanalysis-era5-single-levels-monthly-means"
request = {
    'product_type': ['monthly_averaged_reanalysis'],
    'variable': ['2m_temperature', 'mean_wave_direction'],
    'year': ['2015'],
    'month': ['10'],
    'time': ['00:00'],
    'format': 'netcdf',  # the old API used 'format'; the new API uses 'data_format'
    'area': [1, -1, -1, 1]
}

client = cdsapi.Client()
client.retrieve(dataset, request).download()

This produces a single NetCDF on a common grid.
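
Since the new backend may deliver either a single NetCDF or a Zip of several NetCDFs, client code probably has to check what it actually received before opening it. The following is only a minimal sketch of such a check, not what xcube-cds does: the helper name `netcdf_paths_from_download` is hypothetical, and it assumes that the NetCDF members inside the Zip carry a `.nc` suffix.

```python
import zipfile
from pathlib import Path


def netcdf_paths_from_download(download_path, extract_dir):
    """Return the NetCDF file(s) contained in a CDS download.

    Hypothetical helper for illustration; not part of cdsapi or xcube-cds.
    """
    download_path = Path(download_path)
    extract_dir = Path(extract_dir)

    if not zipfile.is_zipfile(download_path):
        # Not a Zip archive: assume the download is itself a single NetCDF.
        return [download_path]

    with zipfile.ZipFile(download_path) as archive:
        # Assumption: NetCDF members inside the archive end in ".nc".
        members = sorted(
            name for name in archive.namelist()
            if name.lower().endswith(".nc")
        )
        extract_dir.mkdir(parents=True, exist_ok=True)
        # ZipFile.extract returns the path of the extracted file.
        return [
            Path(archive.extract(name, path=extract_dir))
            for name in members
        ]
```

Each returned path could then be opened separately, for example with `xarray.open_dataset`. Whether the resulting datasets can be merged into one cube depends on whether their grids match, which, as noted above, is not always the case.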

pont-us commented 3 weeks ago

Sketch of a solution for ERA5: