Closed by bolliger32 3 years ago
This is awesome! Thanks @bolliger32. I dug around a bit, and it seems da.coords._data is a pointer to the original array, and that was getting shipped back to the notebook. Gathering dict(da.coords) instead solves the problem. Thanks for the tip!
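To make the size difference concrete, here is a minimal sketch of the dict(da.coords) idea. It assumes a small in-memory array rather than a dask future, and the array shape and variable names are made up for illustration. It only uses public xarray calls and does not inspect the private _data attribute.

```python
import pickle

import numpy as np
import xarray as xr

# Illustrative stand-in for a remote result: a 500 x 500 float64 array
# (about 2 MB of data) with two small index coordinates.
da = xr.DataArray(
    np.zeros((500, 500)),
    dims=["y", "x"],
    coords={"y": np.arange(500), "x": np.arange(500)},
)

# dict() copies the coordinate mapping into a plain dict whose values are
# 1-D coordinate DataArrays, so the da.coords object itself is not what
# gets serialized.
coords = dict(da.coords)

# Serialize only the coordinates and compare with the size of the array data.
payload = pickle.dumps(coords)
print(sorted(coords), len(payload), da.nbytes)
```

The serialized coordinate dict should come out far smaller than the array data, which is the point of gathering dict(da.coords) rather than the coords object.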
It would be cool to combine this with the new xarray combine functions, so you could combine based on coords or auto-combine. Or we could just drop the singular dataarray and dataset from-delayed functions, provide only the plural dataarrays and datasets functions, and point users to xarray's combine functions.
Workflow would just be:
import functools
import xarray as xr
import rhg_compute_tools.xarray as rhgx  # assumed import path for the rhgx alias
futures = [ ... ]  # flat list of dataarray futures with arbitrary non-overlapping coordinate relationships
da = xr.combine_by_coords(rhgx.dataarrays_from_delayed(futures))
futures = [[...], [...], ...]  # nested list of dataarray futures with hierarchical structure
# combine_nested also needs concat_dim: one dimension name per level of nesting
da = xr.combine_nested(rhgx.dataarrays_from_delayed(futures), concat_dim=[...])
# or even, if you want terrible performance and just don't care...
futures = [ ... ]  # ordered flat list of dataarray futures with overlapping coordinate relationships
da = functools.reduce(lambda x, y: x.combine_first(y), rhgx.dataarrays_from_delayed(futures))
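The three combine patterns above can be tried locally without a cluster. The following is a rough sketch, assuming small in-memory Datasets in place of the gathered futures; the tile helper, the variable name "v", and the grid values are all hypothetical. Datasets are used for the first two cases because DataArray support in the combine functions has varied across xarray versions.

```python
import functools

import numpy as np
import xarray as xr


def tile(x, y=0):
    """Hypothetical helper: a one-variable Dataset on a (y, x) grid.

    Each value encodes its position as y * 100 + x.
    """
    x = list(x)
    data = np.array([[y * 100.0 + xi for xi in x]])
    return xr.Dataset({"v": (("y", "x"), data)}, coords={"y": [y], "x": x})


# 1. Flat list, non-overlapping coordinates, arbitrary order:
#    combine_by_coords orders the pieces using their coordinate values.
flat = [tile([2, 3]), tile([0, 1])]
by_coords = xr.combine_by_coords(flat)

# 2. Nested list: the outer level is concatenated along y, the inner along x.
nested = [
    [tile([0, 1], y=0), tile([2, 3], y=0)],
    [tile([0, 1], y=1), tile([2, 3], y=1)],
]
by_nesting = xr.combine_nested(nested, concat_dim=["y", "x"])

# 3. Ordered flat list with overlapping coordinates: earlier entries win
#    wherever they have non-null values.
overlapping = [tile([0, 1])["v"], tile([1, 2])["v"] + 1000.0]
merged = functools.reduce(lambda a, b: a.combine_first(b), overlapping)
```

In the third case the two pieces overlap at x=1, and the value from the first piece is kept there, which is why the order of the list matters.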
@delgadom nice find! That was blowing my mind.
Agreed. Maybe I'll create an issue just to keep this on the back burner. If people start using these functions more frequently, we can expose them in a more user-friendly way.
flake8 rhg_compute_tools tests docs (no new errors/warnings added)
Gather dict(ds.coords) instead of ds.coords (see #83)