Open WardBrian opened 3 years ago
@WardBrian It's not hdf5 but hdf4:
$ hdfls groundbased_lidar.aerosol_nasa.jpl002_glass.1.1_mauna.loa.hi_20040109t045500z_20040109t065531z_001.hdf:
groundbased_lidar.aerosol_nasa.jpl002_glass.1.1_mauna.loa.hi_20040109t045500z_20040109t065531z_001.hdf:
File library version: Major= 4, Minor=2, Release=3
String=HDF Version 4.2 Release 3, January 27, 2008
Unfortunately, I have no idea how to get this into xarray, but there is a good chance that someone else knows how.
hdf4
Yes, the only mention I can find of hdf4 for xarray relies on PyNIO, which has been discontinued. If netCDF4 can open the raw file, I'm not sure why xarray can't.
I've been able to confirm locally that the problem is caused by the call to filters here: https://github.com/pydata/xarray/blob/18ed29e4086145c29fde31c9d728a939536911c9/xarray/backends/netCDF4_.py#L395-L399
The line is even commented to say it is netcdf4 specific, but it is called unconditionally. I wrapped it in a try/except and then the file loaded, so I think this may just be an oversight in the backend
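For illustration, the try/except workaround described above could look roughly like the sketch below. This is not the actual xarray backend code. The two variable classes are hypothetical stand-ins (no real file or netCDF4 install is needed to run it), the helper names read_filters and build_encoding are made up for this example, and catching RuntimeError is an assumption about which exception the failing call raises.

```python
class Netcdf4LikeVariable:
    """Hypothetical stand-in for a variable read from a real netCDF4 file."""

    def filters(self):
        # Example filter settings; the values are illustrative only.
        return {"zlib": True, "shuffle": True, "complevel": 4, "fletcher32": False}


class Hdf4LikeVariable:
    """Hypothetical stand-in for a variable whose filters() call fails."""

    def filters(self):
        # Assumption: the failure surfaces as a RuntimeError.
        raise RuntimeError("filters are not available for this file format")


def read_filters(var):
    """Return var.filters(), or None if the format does not support the call."""
    try:
        return var.filters()
    except RuntimeError:
        return None


def build_encoding(var):
    """Collect filter settings into an encoding dict, skipping them on failure."""
    encoding = {}
    filters = read_filters(var)
    if filters is not None:
        encoding.update(filters)
    return encoding


if __name__ == "__main__":
    print(build_encoding(Netcdf4LikeVariable()))
    print(build_encoding(Hdf4LikeVariable()))
```

With a guard like this, a variable that cannot report filters simply ends up with no filter entries in its encoding instead of aborting the whole load.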
@WardBrian Coming back to this now, netCDF4-python can obviously read these kinds of HDF4 files. We might think about special-casing filters here so that it is not called unconditionally. Thoughts?
Seems good to me.
What happened:
I am reading files from https://www-air.larc.nasa.gov/pub/NDACC/PUBLIC/stations/mauna.loa.hi/hdf/lidar/.
When passed to xr.open_dataset, the following error occurs:
However,
does not produce any errors.
What you expected to happen:
I expect xarray to be able to load the file.
Minimal Complete Verifiable Example:
Anything else we need to know?:
Changing the engine to h5netcdf produces a different error, but still fails. Setting decode_cf=False has no effect.
Environment:
Output of xr.show_versions()
INSTALLED VERSIONS
------------------
commit: None
python: 3.9.1 | packaged by conda-forge | (default, Jan 26 2021, 01:34:10) [GCC 9.3.0]
python-bits: 64
OS: Linux
OS-release: 3.10.0-1160.11.1.el7.x86_64
machine: x86_64
processor: x86_64
byteorder: little
LC_ALL: None
LANG: None
LOCALE: None.None
libhdf5: 1.10.6
libnetcdf: 4.7.4
xarray: 0.17.0
pandas: 1.2.3
numpy: 1.20.1
scipy: 1.6.2
netCDF4: 1.5.6
pydap: None
h5netcdf: 0.10.0
h5py: 3.1.0
Nio: None
zarr: None
cftime: 1.4.1
nc_time_axis: None
PseudoNetCDF: None
rasterio: None
cfgrib: None
iris: None
bottleneck: None
dask: 2021.03.1
distributed: 2021.03.1
matplotlib: 3.3.4
cartopy: 0.18.0
seaborn: None
numbagg: None
pint: 0.17
setuptools: 49.6.0.post20210108
pip: 21.0.1
conda: None
pytest: None
IPython: 7.22.0
sphinx: 3.5.3