intake / intake-xarray

Intake plugin for xarray
https://intake-xarray.readthedocs.io/
BSD 2-Clause "Simplified" License

add nightly test workflow, rm appveyor and travis config #99

Open scottyhq opened 3 years ago

scottyhq commented 3 years ago

this removes older CI config and adds a scheduled nightly test run, which hopefully will catch issues with new versions of dependency libraries (https://github.com/intake/intake-xarray/issues/98)

scottyhq commented 3 years ago

Upstream failure in test_http_read_netcdf_dask, this line in particular: assert isinstance(ds._file_obj, xr.backends.h5netcdf_.H5NetCDFStore). It is related to the backend refactoring in https://github.com/pydata/xarray/pull/4809

We're trying to test that if we specify engine='h5netcdf' in intake.open_netcdf(), that backend/engine is in fact used to load the data. @dcherian, @alexamici, or @shoyer, would love your guidance on which attribute to use instead going forward.

2021-02-03T19:20:13.6750105Z =================================== FAILURES ===================================
2021-02-03T19:20:13.6750570Z __________________________ test_http_read_netcdf_dask __________________________
2021-02-03T19:20:13.6750878Z 
2021-02-03T19:20:13.6751908Z data_server = 'http://localhost:8000'
2021-02-03T19:20:13.6752254Z 
2021-02-03T19:20:13.6752664Z     def test_http_read_netcdf_dask(data_server):
2021-02-03T19:20:13.6753748Z         url = f'{data_server}/next_example_1.nc'
2021-02-03T19:20:13.6754292Z         source = intake.open_netcdf(url, chunks={},
2021-02-03T19:20:13.6755529Z                                     xarray_kwargs=dict(engine='h5netcdf'))
2021-02-03T19:20:13.6756225Z         ds = source.to_dask()
2021-02-03T19:20:13.6757009Z >       assert isinstance(ds._file_obj, xr.backends.h5netcdf_.H5NetCDFStore)
2021-02-03T19:20:13.6757631Z 
2021-02-03T19:20:13.6759042Z /home/runner/work/intake-xarray/intake-xarray/intake_xarray/tests/test_remote.py:128: 
2021-02-03T19:20:13.6759844Z _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
2021-02-03T19:20:13.6760197Z 
2021-02-03T19:20:13.6760924Z self = <xarray.Dataset>
2021-02-03T19:20:13.6761507Z Dimensions:  (lat: 5, level: 4, lon: 10, time: 1)
2021-02-03T19:20:13.6762068Z Coordinates:
2021-02-03T19:20:13.6762765Z   * lat      (lat) int32 20 30 40 50 6...e, lat, lon) float32 dask.array<chunksize=(1, 5, 10), meta=np.ndarray>
2021-02-03T19:20:13.6763432Z Attributes:
2021-02-03T19:20:13.6763967Z     source:   Fictional Model Output
2021-02-03T19:20:13.6764716Z name = '_file_obj'
2021-02-03T19:20:13.6765102Z 
2021-02-03T19:20:13.6765806Z     def __getattr__(self, name: str) -> Any:
2021-02-03T19:20:13.6766425Z         if name not in {"__dict__", "__setstate__"}:
2021-02-03T19:20:13.6767083Z             # this avoids an infinite loop when pickle looks for the
2021-02-03T19:20:13.6767836Z             # __setstate__ attribute before the xarray object is initialized
2021-02-03T19:20:13.6768552Z             for source in self._attr_sources:
2021-02-03T19:20:13.6769121Z                 with suppress(KeyError):
2021-02-03T19:20:13.6769666Z                     return source[name]
2021-02-03T19:20:13.6770211Z >       raise AttributeError(
2021-02-03T19:20:13.6770874Z             "{!r} object has no attribute {!r}".format(type(self).__name__, name)
2021-02-03T19:20:13.6771421Z         )
2021-02-03T19:20:13.6772262Z E       AttributeError: 'Dataset' object has no attribute '_file_obj'

shoyer commented 3 years ago

I don't think we have a public API for this, unfortunately. This object is now stored at Dataset._close, which you might be able to reverse engineer to find the H5NetCDFStore class (but it isn't a supported solution).

One option might be to mock h5netcdf.File, which would let you verify that it is called.
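A minimal sketch of the mocking idea, using only the standard library so it runs without h5netcdf installed. `fake_h5netcdf`, `FakeFile`, and `open_store` are hypothetical stand-ins, not xarray or h5netcdf code; a real test would presumably patch `h5netcdf.File` instead and call `source.to_dask()`.

```python
from types import SimpleNamespace
from unittest import mock


class FakeFile:
    """Hypothetical stand-in for h5netcdf.File (assumption, not the real class)."""

    def __init__(self, path, mode="r"):
        self.path = path
        self.mode = mode


# Hypothetical stand-in for the h5netcdf module.
fake_h5netcdf = SimpleNamespace(File=FakeFile)


def open_store(path, engine):
    """Hypothetical loader: only the 'h5netcdf' engine touches fake_h5netcdf.File."""
    if engine == "h5netcdf":
        return fake_h5netcdf.File(path, "r")
    return None


# Replace File with a mock and record whether each engine choice calls it.
with mock.patch.object(fake_h5netcdf, "File") as mocked_file:
    open_store("next_example_1.nc", engine="h5netcdf")
    called_for_h5netcdf = mocked_file.called

    mocked_file.reset_mock()  # clear the call record before the second check
    open_store("next_example_1.nc", engine="netcdf4")
    called_for_other = mocked_file.called
```

The check relies only on whether the constructor was called, so it does not depend on private attributes such as `_file_obj` or `_close`.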

alexamici commented 3 years ago

@scottyhq the next release of xarray will honour the engine argument it is passed in all cases (previous versions did not, though only in unusual cases). If you want to test that xarray_kwargs actually reach xarray, the simplest way is to mock xarray.open_dataset itself and check that it was called with engine='h5netcdf'.

Would that be enough for your use case?
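
A rough sketch of that check, written against stand-ins so it runs without xarray or intake installed. `fake_xr` and `to_dask` are hypothetical placeholders for the xarray module and the intake-xarray source. In the real test suite one would presumably use `mock.patch("xarray.open_dataset")` around `source.to_dask()`, assuming the source calls `open_dataset` rather than another opener.

```python
from types import SimpleNamespace
from unittest import mock

# Hypothetical stand-in for the xarray module (assumption, not the real API surface).
fake_xr = SimpleNamespace(open_dataset=lambda url, **kwargs: None)


def to_dask(url, chunks, xarray_kwargs):
    """Hypothetical stand-in for the source: forwards xarray_kwargs to open_dataset."""
    return fake_xr.open_dataset(url, chunks=chunks, **xarray_kwargs)


def check_engine_is_forwarded():
    """Patch open_dataset and return the engine keyword it was called with."""
    with mock.patch.object(fake_xr, "open_dataset") as mocked_open:
        to_dask(
            "http://localhost:8000/next_example_1.nc",
            chunks={},
            xarray_kwargs=dict(engine="h5netcdf"),
        )
    mocked_open.assert_called_once()
    _, kwargs = mocked_open.call_args  # (positional args, keyword args)
    return kwargs["engine"]


engine_seen = check_engine_is_forwarded()
```

This verifies only that the keyword reaches the opener; whether xarray then honours it is left to xarray's own tests.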