PyPSA / atlite

Atlite: A Lightweight Python Package for Calculating Renewable Power Potentials and Time Series
https://atlite.readthedocs.io

Combine ERA5 and ERA5T data #261

Closed zoltanmaric closed 1 year ago

zoltanmaric commented 1 year ago

Closes #190.

Change proposed in this Pull Request

Description

Requesting cutout data that spans both recent data (ERA5T) and data older than ~3 months (ERA5) results in an additional dimension in cutout.data, called expver, which atlite currently cannot handle gracefully.

This change merges the ERA5 and ERA5T data along expver, so the extra dimension disappears and cutout.data keeps its usual dimensions.
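For illustration, here is a minimal sketch of how such a merge can be done with xarray. It is not necessarily the code in this pull request. The dataset is synthetic, `combine_expver` is a hypothetical helper name, and the sketch assumes that expver=1 labels ERA5 data, expver=5 labels ERA5T data, and each time step is filled for only one of the two.

```python
import numpy as np
import xarray as xr

# Synthetic stand-in for a CDS response spanning the ERA5/ERA5T boundary.
# Assumption: expver=1 is final ERA5, expver=5 is ERA5T, and every time
# step holds data for exactly one of them (NaN for the other).
time = np.array(
    ["2022-09-29", "2022-09-30", "2022-10-01", "2022-10-02"],
    dtype="datetime64[ns]",
)
values = np.array(
    [
        [280.0, 281.0, np.nan, np.nan],  # expver=1 (ERA5)
        [np.nan, np.nan, 282.0, 283.0],  # expver=5 (ERA5T)
    ]
)
ds = xr.Dataset(
    {"temperature": (("expver", "time"), values)},
    coords={"expver": [1, 5], "time": time},
)


def combine_expver(ds: xr.Dataset) -> xr.Dataset:
    """Merge the ERA5 and ERA5T slices and remove the expver dimension."""
    if "expver" not in ds.dims:
        return ds
    # Prefer ERA5 values and fill the remaining gaps with ERA5T values.
    combined = ds.sel(expver=1).combine_first(ds.sel(expver=5))
    # Selecting a single label can leave expver behind as a scalar coordinate.
    return combined.drop_vars("expver", errors="ignore")


combined = combine_expver(ds)
```

After the merge, `combined["temperature"]` depends on `time` only, with the September values taken from the ERA5 slice and the October values from the ERA5T slice.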

Motivation and Context

See details in #190.

How Has This Been Tested?

The first commit of this pull request introduces a failing test that demonstrates the bug. The second commit fixes the bug and makes the failing test pass.

~IMPORTANT: the test introduced in the first commit should not be merged into master. The test data is time-dependent. Data which is ERA5T today will become ERA5 in the future. I only added the test for illustration. See the discussion in this pull request for alternative test proposals.~ (This has been resolved)

zoltanmaric commented 1 year ago

IMPORTANT: the test introduced in the first commit should not be merged into master. The test data is time-dependent. Data which is ERA5T today will become ERA5 in the future. I only added the test for illustration. See the discussion in this pull request for alternative test proposals.

@FabianHofmann This fix addresses a "moving target" bug. The test case in the first commit spans September and October 2022: the CDSAPI currently returns ERA5 data for September and ERA5T data for October, but this will shift in December (i.e. the ERA5T data from October will become ERA5 data).

To make this a stable test, we could keep the downloaded response data from the CDSAPI in a file and have the retrieve_data function load the response from that file if it is available. This would require another code change, though. I made a quick-and-dirty version of this solution in https://github.com/PyPSA/atlite/compare/master...zoltanmaric:atlite:reuse-downloaded-era5-files

I think loading from file would be a useful feature anyway, since it avoids re-downloading large requests from the CDSAPI when debugging the building of cutouts.
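The load-from-file idea described above can be sketched roughly as follows. This is only an illustration of the pattern, not the code in the linked branch and not atlite's retrieve_data. The names `retrieve_with_cache` and `fake_download` and the file name are hypothetical, and the downloader is a stub that writes placeholder bytes instead of calling the CDSAPI.

```python
import tempfile
from pathlib import Path
from typing import Callable, List


def retrieve_with_cache(target: Path, download: Callable[[Path], None]) -> Path:
    """Return `target`, calling `download(target)` only if the file is missing."""
    if not target.exists():
        target.parent.mkdir(parents=True, exist_ok=True)
        download(target)
    return target


# Stub downloader that records each call, standing in for a real CDSAPI request.
calls: List[Path] = []


def fake_download(path: Path) -> None:
    calls.append(path)
    path.write_bytes(b"placeholder for a netCDF response")


cache_dir = Path(tempfile.mkdtemp())
target = cache_dir / "era5_2022-09_2022-10.nc"

first = retrieve_with_cache(target, fake_download)   # triggers the download
second = retrieve_with_cache(target, fake_download)  # reuses the cached file
```

With this shape, a test could ship a stored response file and never hit the network, and a debugging session would download each request at most once.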

FabianHofmann commented 1 year ago

Thanks @zoltanmaric, always a pleasure to see contributions from you (sorry I am not so responsive at the atlite repo lately). This looks very sensible.

FabianHofmann commented 1 year ago

@zoltanmaric sorry again that I am so slow here. Looks awesome!