mars-project / mars

Mars is a tensor-based unified framework for large-scale data computation which scales numpy, pandas, scikit-learn and Python functions.
https://mars-project.readthedocs.io
Apache License 2.0

Refactor of storage service #2352

Open hekaisheng opened 3 years ago

hekaisheng commented 3 years ago

This proposal aims to improve the stability of the storage service and to clarify the responsibilities of each actor it creates.

Actors on main pool

Actors on subpools

Other changes

qinxuye commented 3 years ago

I think the key part is dividing the entire module into submodules such as spill and transfer.

wjsi commented 3 years ago

We also need APIs to fetch data into a priority list of storage levels to avoid redundant RPC calls.
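
To make the idea concrete, here is a minimal sketch of what such a call could look like. It is a toy in-process model, not Mars code: `Level`, `ToyStorage`, and `fetch` are hypothetical names invented for illustration, and the capacity check stands in for whatever admission logic the real service would use. The point is only that the caller passes the whole priority list once, so the fallback from one level to the next happens on the service side instead of through repeated round trips.

```python
import enum
from typing import Dict, List, Optional


class Level(enum.Enum):
    # Hypothetical storage levels, for illustration only.
    MEMORY = "memory"
    DISK = "disk"


class ToyStorage:
    """In-process stand-in for a storage service (not the Mars API)."""

    def __init__(self, capacities: Dict[Level, int]):
        self._free = dict(capacities)       # remaining bytes per level
        self._placed: Dict[str, Level] = {}  # key -> level that holds it
        self.calls = 0                       # simulated RPC round trips

    def fetch(self, key: str, size: int, levels: List[Level]) -> Optional[Level]:
        """Place `key` into the first level in `levels` that has room.

        Returns the chosen level, or None if no level can hold the data.
        """
        self.calls += 1  # one round trip, however many levels are tried
        for level in levels:
            if self._free.get(level, 0) >= size:
                self._free[level] -= size
                self._placed[key] = level
                return level
        return None


storage = ToyStorage({Level.MEMORY: 100, Level.DISK: 1000})
priority = [Level.MEMORY, Level.DISK]

first = storage.fetch("chunk-a", 80, priority)     # fits in memory
second = storage.fetch("chunk-b", 80, priority)    # memory is full, falls back to disk
third = storage.fetch("chunk-c", 5000, priority)   # too large for every level
```

With a single-level API, the second request would need one call that fails on memory and another that retries on disk; with the priority list it stays at one call per chunk.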