leap-stc / eNATL_feedstock

Apache License 2.0

Issue with opening using h5netcdf #6

Open SammyAgrawal opened 3 months ago

SammyAgrawal commented 3 months ago

There seems to be an ingestion issue, not with pgf but with the raw file reading engine.

SammyAgrawal commented 3 months ago

Reproducible Code Snippet:

import xarray as xr
import fsspec

url = 'https://zenodo.org/records/10513552/files/eNATL60-BLBT02_y2009m07d01.1d_TSWm_60m.nc'
# Stream the remote file, save a local copy, then open it three different ways.
with fsspec.open(url, mode='rb').open() as file:
    with open("test_ds1.nc", 'wb') as f:
        f.write(file.read())
    # Local copy, h5netcdf engine
    ds_local_h5 = xr.open_dataset("test_ds1.nc", engine="h5netcdf", use_cftime=True, decode_cf=False, decode_times=False)
    # Local copy, netcdf4 engine
    ds_local_cdf4 = xr.open_dataset("test_ds1.nc", engine="netcdf4", use_cftime=True, decode_cf=False, decode_times=False)
    # Remote file object, h5netcdf engine
    ds_virtual = xr.open_dataset(file, engine='h5netcdf')
# Compare the shape of the time coordinate across the three datasets.
ds_virtual.time.shape, ds_local_h5.time.shape, ds_local_cdf4.time.shape

The time dimension has length 0, 0, and 1, respectively. Essentially, anything opened with h5netcdf fails to load the time dimension properly, whether it reads the remote file object or the local copy. That leaves copying to local and opening with netcdf4 as the only viable option. This must somehow be an issue with h5netcdf, not with fsspec or the raw bytes (since the downloaded local file works with netcdf4).

I'm not sure if h5netcdf's inability to read the file is a product of the file or the library, but it would seem the latter if other engines are able to succeed?
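To make the engine comparison above easier to repeat on other files, here is a minimal sketch of a helper that opens the same path with several engines and collects the shape of one dimension. The names `time_shapes_by_engine` and `engines_disagree` are hypothetical and not part of xarray or this feedstock; the `open_fn` parameter is only there so the helper can be exercised without xarray installed, and it defaults to `xr.open_dataset`.

```python
def time_shapes_by_engine(path, engines=("h5netcdf", "netcdf4"), open_fn=None, dim="time"):
    """Open `path` once per engine and return {engine: shape of `dim`}.

    Hypothetical helper for illustration; assumes each engine is installed
    when the default opener (xarray.open_dataset) is used.
    """
    if open_fn is None:
        # Imported lazily so the helper can also be driven by a stub opener.
        import xarray as xr
        open_fn = xr.open_dataset

    shapes = {}
    for engine in engines:
        # decode_cf=False mirrors the snippet above: read the raw variables only.
        ds = open_fn(path, engine=engine, decode_cf=False)
        try:
            shapes[engine] = tuple(ds[dim].shape)
        finally:
            ds.close()
    return shapes


def engines_disagree(shapes):
    """Return True if the engines reported different shapes for the dimension."""
    return len(set(shapes.values())) > 1
```

Calling `time_shapes_by_engine("test_ds1.nc")` on the local copy from the snippet above should show whether the two engines still disagree on the time dimension.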

SammyAgrawal commented 3 months ago

| OpenWithXarray(copy_to_local=True, xarray_open_kwargs = {'use_cftime':True, 'engine':"netcdf4"})

Confirmed that this fixes the time dimension issue.