Open scottyhq opened 3 years ago
Upstream failure in test_http_read_netcdf_dask, in particular on this line: assert isinstance(ds._file_obj, xr.backends.h5netcdf_.H5NetCDFStore)
This is related to the backend refactoring in https://github.com/pydata/xarray/pull/4809
We're trying to test that if we specify engine='h5netcdf' in intake.open_netcdf(), that backend/engine was in fact used to load the data. @dcherian, @alexamici, or @shoyer, we would love your guidance on which attribute to use instead going forward.
2021-02-03T19:20:13.6750105Z =================================== FAILURES ===================================
2021-02-03T19:20:13.6750570Z __________________________ test_http_read_netcdf_dask __________________________
2021-02-03T19:20:13.6750878Z
2021-02-03T19:20:13.6751908Z data_server = 'http://localhost:8000'
2021-02-03T19:20:13.6752254Z
2021-02-03T19:20:13.6752664Z def test_http_read_netcdf_dask(data_server):
2021-02-03T19:20:13.6753748Z url = f'{data_server}/next_example_1.nc'
2021-02-03T19:20:13.6754292Z source = intake.open_netcdf(url, chunks={},
2021-02-03T19:20:13.6755529Z xarray_kwargs=dict(engine='h5netcdf'))
2021-02-03T19:20:13.6756225Z ds = source.to_dask()
2021-02-03T19:20:13.6757009Z > assert isinstance(ds._file_obj, xr.backends.h5netcdf_.H5NetCDFStore)
2021-02-03T19:20:13.6757631Z
2021-02-03T19:20:13.6759042Z /home/runner/work/intake-xarray/intake-xarray/intake_xarray/tests/test_remote.py:128:
2021-02-03T19:20:13.6759844Z _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
2021-02-03T19:20:13.6760197Z
2021-02-03T19:20:13.6760924Z self = <xarray.Dataset>
2021-02-03T19:20:13.6761507Z Dimensions: (lat: 5, level: 4, lon: 10, time: 1)
2021-02-03T19:20:13.6762068Z Coordinates:
2021-02-03T19:20:13.6762765Z * lat (lat) int32 20 30 40 50 6...e, lat, lon) float32 dask.array<chunksize=(1, 5, 10), meta=np.ndarray>
2021-02-03T19:20:13.6763432Z Attributes:
2021-02-03T19:20:13.6763967Z source: Fictional Model Output
2021-02-03T19:20:13.6764716Z name = '_file_obj'
2021-02-03T19:20:13.6765102Z
2021-02-03T19:20:13.6765806Z def __getattr__(self, name: str) -> Any:
2021-02-03T19:20:13.6766425Z if name not in {"__dict__", "__setstate__"}:
2021-02-03T19:20:13.6767083Z # this avoids an infinite loop when pickle looks for the
2021-02-03T19:20:13.6767836Z # __setstate__ attribute before the xarray object is initialized
2021-02-03T19:20:13.6768552Z for source in self._attr_sources:
2021-02-03T19:20:13.6769121Z with suppress(KeyError):
2021-02-03T19:20:13.6769666Z return source[name]
2021-02-03T19:20:13.6770211Z > raise AttributeError(
2021-02-03T19:20:13.6770874Z "{!r} object has no attribute {!r}".format(type(self).__name__, name)
2021-02-03T19:20:13.6771421Z )
2021-02-03T19:20:13.6772262Z E AttributeError: 'Dataset' object has no attribute '_file_obj'
I don't think we have a public API for this, unfortunately. This object is now stored at Dataset._close, which you might be able to reverse engineer to find the H5NetCDFStore class (but it isn't a supported solution).
One option might be to mock h5netcdf.File, which would let you verify that it is called.
@scottyhq the next release of xarray will honour the engine argument it is passed in all cases (previous versions failed to do so only in unusual cases). If you want to test that xarray_kwargs actually reach xarray, the simplest way is to mock xarray.open_dataset itself and check that it was called with engine == 'h5netcdf'.
Would that be enough for your use case?
This removes the older CI config and adds a scheduled nightly test run, which hopefully will catch issues with new versions of dependency libraries (https://github.com/intake/intake-xarray/issues/98)