Note that locality-aware scheduling cannot currently be tested reliably, for two reasons:
1) We cannot guarantee that Dask partitions end up living on a specific node (to set an initial state for redistribution; see the sketch after this list).
2) We cannot guarantee that tasks that determine a Dask partition's node are co-scheduled with the respective partition.
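For 1), one possible workaround is to pin the test data to a chosen node explicitly. The sketch below is hypothetical and not part of this PR: it assumes Ray's automatically created `node:<ip>` custom resources are available, and `make_partition` / `target_ip` are illustrative names only.

```python
# Hypothetical sketch: place a partition-sized object on a specific node by
# forcing the producing task onto that node via its "node:<ip>" resource.
import pandas as pd
import ray

ray.init(address="auto")

# Pick some alive node's IP from the cluster state as the pinning target.
target_ip = [n["NodeManagerAddress"] for n in ray.nodes() if n["Alive"]][0]


@ray.remote
def make_partition(num_rows: int) -> pd.DataFrame:
    # The (sufficiently large) return value is stored in the object store of
    # the node that ran this task, so forcing the task onto `target_ip`
    # effectively pins the partition there.
    return pd.DataFrame({"a": range(num_rows)})


pinned_ref = make_partition.options(
    resources={f"node:{target_ip}": 0.01}
).remote(100_000)
```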
Once the `ray.state.objects()` API (or something similar) comes back, the second point should be easy to address, and the first will be easy to confirm.
Currently the problem seems to lie with 1): `ray memory` shows that every partition lives on the head node after calling `persist()`. I'll try to find a solution for that.
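As a possible programmatic alternative to scanning `ray memory` output, the sketch below checks object locations directly. It assumes a Ray version that provides `ray.experimental.get_object_locations` (an assumption, not confirmed for the version targeted here), and it fakes the persisted partitions with `ray.put`; in the real test these would come from the persisted Dask collection.

```python
# Hypothetical sketch: report which node(s) hold each partition ObjectRef.
import pandas as pd
import ray
from ray.experimental import get_object_locations

ray.init(address="auto")

# Stand-in for the persisted Dask partitions (real ones would be ObjectRefs
# produced by persisting the collection on the Ray scheduler).
partition_refs = [
    ray.put(pd.DataFrame({"a": range(100_000)})) for _ in range(4)
]

# Map node ids to IPs so the output is readable.
node_ip_by_id = {n["NodeID"]: n["NodeManagerAddress"] for n in ray.nodes()}

for ref, loc in get_object_locations(partition_refs).items():
    ips = [node_ip_by_id.get(nid, nid) for nid in loc.get("node_ids", [])]
    print(ref, "->", ips)
```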
Closes #92