RhodiumGroup / rhg_compute_tools

Tools for using compute.rhg.com and compute.impactlab.org
MIT License

add alternative combining methods to rct.xarray.*_from_delayed #85

Open bolliger32 opened 4 years ago

bolliger32 commented 4 years ago

Paraphrased from @delgadom in #84:

It would be cool to combine our rhg_compute_tools.xarray *_from_delayed functions with the new xarray combine functions, so you could combine based on coords or auto-combine. Or just drop the dataarray and dataset *_from_delayed functions, keep only the dataarrays and datasets ones (which return lists), and point users to xarray's combine functions.

Workflow would just be:

futures = [ ... ]  # flat list of dataarray futures with arbitrary non-overlapping coordinate relationships
da = xr.combine_by_coords(rhgx.dataarrays_from_delayed(futures))

futures = [[...], [...], ...]  # nested list of dataarray futures with hierarchical structure
da = xr.combine_nested(rhgx.dataarrays_from_delayed(futures), concat_dim=[...])  # combine_nested needs one concat_dim per nesting level

or even, if you want terrible performance and just don't care...

futures = [ ... ]  # ordered flat list of dataarray futures with overlapping coordinate relationships
da = functools.reduce(lambda x, y: x.combine_first(y), rhgx.dataarrays_from_delayed(futures))
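
For reference, here is a minimal sketch of the three combine strategies above using plain in-memory xarray objects in place of futures. The `make_tile` helper and the tile layout are hypothetical, and nothing here calls rhg_compute_tools. The tiles are converted to Datasets because older xarray releases only accepted Datasets in `combine_by_coords`.

```python
import functools

import numpy as np
import xarray as xr


def make_tile(x0, y0):
    """Hypothetical stand-in for one future's result: a 2x2 tile as a Dataset."""
    data = np.full((2, 2), float(x0 + 10 * y0))
    coords = {"y": [y0, y0 + 1], "x": [x0, x0 + 1]}
    return xr.DataArray(data, dims=("y", "x"), coords=coords).to_dataset(name="tile")


# Flat list in arbitrary order: placement is inferred from the coordinates.
flat = [make_tile(2, 2), make_tile(0, 0), make_tile(2, 0), make_tile(0, 2)]
by_coords = xr.combine_by_coords(flat)["tile"]

# Nested list: the outer level runs along y, the inner level along x.
nested = [
    [make_tile(0, 0), make_tile(2, 0)],
    [make_tile(0, 2), make_tile(2, 2)],
]
by_nesting = xr.combine_nested(nested, concat_dim=["y", "x"])["tile"]

# Overlapping pieces: earlier items win wherever coordinates collide.
a = xr.DataArray([1.0, 2.0], dims="x", coords={"x": [0, 1]})
b = xr.DataArray([20.0, 30.0], dims="x", coords={"x": [1, 2]})
c = xr.DataArray([300.0, 400.0], dims="x", coords={"x": [2, 3]})
merged = functools.reduce(lambda left, right: left.combine_first(right), [a, b, c])
```

The `combine_first` reduction builds a new, realigned array at every step, which is why it scales poorly with the number of pieces.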

delgadom commented 4 years ago

Yeah, or just get people to use native xarray and drop the concat functions. Turning futures into lists of dask arrays is already a huge help, and getting people to combine these with xarray's concat tools is probably the best approach?
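
A rough sketch of what "native xarray" would look like once you have a list of arrays. The `pieces` list and the `run` dimension name are made up for illustration, and the arrays are in-memory here; `xr.concat` takes dask-backed DataArrays the same way.

```python
import numpy as np
import xarray as xr

# Hypothetical list of same-shaped results, one per job.
pieces = [
    xr.DataArray(np.arange(3) + 10 * i, dims="x", coords={"x": [0, 1, 2]})
    for i in range(3)
]

# Stack along a new dimension, then label it.
stacked = xr.concat(pieces, dim="run").assign_coords(run=[0, 1, 2])
```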

bolliger32 commented 4 years ago

Yeah, good point. So maybe all that's needed is to remove the function that uses concat and add some hints/examples to the docstrings of the functions that return a list of dask objects.