google-research / weatherbench2

A benchmark for the next generation of data-driven global weather models.
https://weatherbench2.readthedocs.io
Apache License 2.0

ERA5 sea surface temperatures all NaN #161

Open markmbaum opened 4 months ago

markmbaum commented 4 months ago

In several of the ERA5 datasets, the sea surface temperature field is entirely NaN. You can verify by running:

import numpy as np
import gcsfs
import xarray as xr

# Anonymous, read-only access to the bucket.
fs = gcsfs.GCSFileSystem(token="anon")
for path in fs.ls("weatherbench2/datasets/era5"):
    if ".zarr" in path:
        try:
            mapper = fs.get_mapper(path)
            ds = xr.open_zarr(mapper)
        except FileNotFoundError:
            print(f"issue opening {path}")
        else:
            # Load only the final time step and check whether every value is NaN.
            sst_slice = ds["sea_surface_temperature"].isel(time=-1).values
            if np.all(np.isnan(sst_slice)):
                print(f"all NaN SST: {path}")

which prints the following datasets:

all NaN SST: weatherbench2/datasets/era5/1959-2022-1h-240x121_equiangular_with_poles_conservative.zarr
all NaN SST: weatherbench2/datasets/era5/1959-2022-1h-360x181_equiangular_with_poles_conservative.zarr
all NaN SST: weatherbench2/datasets/era5/1959-2022-6h-128x64_equiangular_conservative.zarr
all NaN SST: weatherbench2/datasets/era5/1959-2022-6h-64x32_equiangular_conservative.zarr
all NaN SST: weatherbench2/datasets/era5/1959-2022-6h-64x33.zarr

It doesn't seem to matter which time slice is checked.
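
For anyone who wants to repeat the check over several time slices, here is a rough sketch. It reports the fraction of NaN grid points per sampled step rather than a yes/no answer, since ERA5 SST is normally NaN over land, so a partial NaN fraction is expected and only a fraction of 1.0 indicates the problem described here. The helper name `nan_fraction_by_time` and the synthetic array are made up for illustration. On a real dataset you would pass `ds["sea_surface_temperature"]` instead.

```python
import numpy as np
import xarray as xr


def nan_fraction_by_time(da, time_indices):
    """Return {time index: fraction of NaN grid points} for the sampled steps.

    Hypothetical helper, not part of weatherbench2.
    """
    fractions = {}
    for i in time_indices:
        # Load one time step at a time so only a single 2-D slice is in memory.
        values = da.isel(time=i).values
        fractions[i] = float(np.isnan(values).mean())
    return fractions


# Synthetic stand-in for an SST variable: 4 time steps on a 3x5 grid.
data = np.full((4, 3, 5), 280.0)
data[:, 0, :] = np.nan  # one row is NaN at every step, mimicking a land mask
data[2] = np.nan        # step 2 is entirely NaN
sst = xr.DataArray(data, dims=("time", "latitude", "longitude"))

fractions = nan_fraction_by_time(sst, [0, 2, -1])
all_nan_steps = [i for i, f in fractions.items() if f == 1.0]
print(fractions)
print(all_nan_steps)
```

Sampling a handful of indices (first, last, a few in between) keeps the amount of data downloaded small on the hourly datasets.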

This might be related to #124.