ajdawson / eofs

EOF analysis in Python
http://ajdawson.github.io/eofs/
GNU General Public License v3.0

Work with dask arrays #107

Closed rabernat closed 5 years ago

rabernat commented 5 years ago

Thanks for providing this amazing package. It is absolutely one of the best and most useful Python packages I know of!

Currently eofs only works with numpy arrays. However, its core computational algorithm, SVD, is also implemented for dask arrays. http://dask.pydata.org/en/latest/array-api.html

This means that it would theoretically be possible for eofs to leverage dask to do out-of-core EOFs with minimal refactoring.
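To make the idea concrete, here is a minimal sketch of an EOF decomposition built directly on `dask.array.linalg.svd`. It is not how eofs is implemented. It assumes the data are arranged as a 2-D (time, space) array chunked along time only, and the synthetic data and variable names are illustrative.

```python
import numpy as np
import dask
import dask.array as da

# Synthetic (time, space) data; a real field would be loaded lazily from disk.
rng = np.random.default_rng(0)
ntime, nspace = 200, 30
data = rng.standard_normal((ntime, nspace))

# Chunk along time only, so every block is a "tall and skinny" slab.
x = da.from_array(data, chunks=(50, nspace))

# Remove the time mean at each grid point to obtain anomalies.
anom = x - x.mean(axis=0)

# Lazy SVD: this only builds a task graph, nothing is computed yet.
u, s, vt = da.linalg.svd(anom)

eofs = vt                                      # rows are spatial patterns (EOFs)
pcs = u * s                                    # columns are PC time series
variance_fraction = (s ** 2) / (s ** 2).sum()  # variance explained per mode

# Evaluate all three results in a single pass over the data.
eofs_np, pcs_np, varfrac_np = dask.compute(eofs, pcs, variance_fraction)
```

Because the graph is evaluated block by block, the same pattern should in principle apply to arrays too large to hold in memory, provided the spatial dimension fits in a single chunk.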

Is this on your roadmap? Would be keen to help if you’re interested.

rabernat commented 5 years ago

Some inspiration for why this could be cool:

https://www.youtube.com/watch?v=R5CiXti_MWo

rabernat commented 5 years ago

Hi @ajdawson - any thoughts on this query?

ajdawson commented 5 years ago

This is a good suggestion @rabernat. I have no roadmap for this software, and little time to work on it. That said, I'm happy to support an effort on this front.

I'm imagining the eofs.standard interface would be extended to support dask as well as numpy in a transparent way, so that the metadata interfaces built on top can make use of either. Is that what you had in mind in terms of basic design? I haven't really done my homework on this though!
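One way such transparent support could look is a single entry point that picks the SVD backend from the type of the input array. This is only a sketch of the design idea, not the eofs.standard code and not necessarily what was later merged. The helper names `svd_any` and `leading_eof` are hypothetical, and dask is treated as an optional dependency.

```python
import numpy as np

try:
    import dask.array as da
except ImportError:  # dask is optional in this sketch
    da = None


def svd_any(anomalies):
    """Hypothetical helper: choose the SVD backend from the array type."""
    if da is not None and isinstance(anomalies, da.Array):
        # Lazy, blockwise SVD; expects chunking along the first axis only.
        return da.linalg.svd(anomalies)
    # In-memory reduced SVD for plain numpy input.
    return np.linalg.svd(anomalies, full_matrices=False)


def leading_eof(data):
    """Return the first EOF of a 2-D (time, space) numpy or dask array."""
    anomalies = data - data.mean(axis=0)
    _, _, vt = svd_any(anomalies)
    # np.asarray triggers the computation when vt is a dask array.
    return np.asarray(vt[0])
```

Callers pass either array type to `leading_eof` and get a numpy result back, which is the property the metadata interfaces built on top would rely on. Note that the sign of an EOF is arbitrary, so the two backends may return patterns that differ by a factor of -1.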

navidcy commented 5 years ago

@rabernat I salute this suggestion. I read this in the documentation:

The xarray interface works with DataArray objects, which are the data containers used by the xarray package.

and I wrongly assumed that this also implied support for dask. Then I spent an hour trying to make things work with xarray+dask before I noticed this issue :)

ajdawson commented 5 years ago

Done in #109