Open Hoeze opened 4 years ago
Hi @Hoeze, thanks for the question!
There are some things we have in the works for arrays, but we don't have a timeline on it yet. For labeled arrays, we could potentially implement an xarray interface on the existing implementation. It is something interesting to consider.
In my experience xarray is very good and scales quite well, is there a reason you would like to see it in Modin?
I experienced some issues with Dask parallelization as my queries were running effectively single-threaded independent from the scheduler. I wanted to try Ray because of the shared-memory object support. My theory is that Dask has to serialize/deserialize every object just too often in my case.
In the end, I'm searching for a python-based alternative to Spark that supports all kinds of Arrow types and scales well.
@RehanSD
Hi @RichardScottOZ! Is this something you'd be interested in collaborating on?
It would be cool to have Modin for multi-dimensional arrays just like xarray. Would it be possible to implement this feature or is this just too far from being doable?