rapidsai / cuml

cuML - RAPIDS Machine Learning Library
https://docs.rapids.ai/api/cuml/stable/
Apache License 2.0

[FEA] Convert Dask into an optional dependency #5934

Open beckernick opened 4 months ago

beckernick commented 4 months ago

Dask is the primary runtime through which cuML provides multi-GPU functionality in Python. As a result, it's historically been a required cuML dependency. This is a non-issue for users who want the multi-GPU capabilities in cuML.

However, for users who only want to use cuML for single-GPU use cases, the required Dask dependency can cause unnecessary dependency alignment challenges in their environments, because it's common practice to tightly pin Dask versions to avoid breakages in stable releases and platforms. We don't expect this pinning practice to change.

We should explore converting Dask into an optional dependency only necessary if someone attempts to use the Dask multi-GPU capabilities in cuML.
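As a rough illustration of what "optional" could mean in practice, here is a minimal sketch of a deferred import guard. It is not taken from the cuML codebase: `require_optional` and `fit_multi_gpu` are hypothetical names, and the install hint is a placeholder. The idea is only that Dask gets imported when a multi-GPU code path is actually called, so single-GPU users never need it installed.

```python
import importlib


def require_optional(module_name, feature, install_hint):
    """Import an optional module on demand (hypothetical helper).

    Returns the imported module, or raises an ImportError that names
    the feature that needed it and how to install it.
    """
    try:
        return importlib.import_module(module_name)
    except ImportError as exc:
        raise ImportError(
            f"{feature} requires the optional package '{module_name}'. "
            f"Install it with: {install_hint}"
        ) from exc


def fit_multi_gpu(data):
    # Hypothetical multi-GPU entry point: Dask is imported here, at call
    # time, rather than at the top of the module.
    dask = require_optional(
        "dask", "Multi-GPU training", "pip install dask"  # placeholder hint
    )
    return f"would distribute {len(data)} rows using Dask {dask.__version__}"
```

With this shape, importing the package itself never touches Dask; only a call to the multi-GPU function does, and a missing install produces an actionable message instead of a bare `ModuleNotFoundError`.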

dantegd commented 4 months ago

@beckernick the code in cuML already has this capability (it enables cuml-cpu), all the way to the import structure. The main challenge is making it super easy for users to install the correct versions of dask, dask-cuda, and raft-dask.
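To make the version-alignment problem concrete, here is a small sketch that reports which of the three packages named above are installed and at what version, using only the standard library. The helper names are hypothetical and this is not cuML's own mechanism. It also assumes the distributions are published under exactly these names; some builds may use suffixed names (for example, per-CUDA-version wheels), which this check would not detect.

```python
from importlib import metadata

# Package names as given in the comment above; actual distribution names
# may differ per build (assumption: unsuffixed names).
MULTI_GPU_PACKAGES = ("dask", "dask-cuda", "raft-dask")


def installed_versions(packages=MULTI_GPU_PACKAGES):
    """Map each distribution name to its installed version, or None."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


def missing_packages(packages=MULTI_GPU_PACKAGES):
    """Return the names of the distributions that are not installed."""
    return [n for n, v in installed_versions(packages).items() if v is None]


if __name__ == "__main__":
    for name, version in installed_versions().items():
        print(f"{name}: {version or 'not installed'}")
```

A check like this could feed the error message shown to a user who calls a multi-GPU API without the extra packages, listing exactly what is missing.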

There are a few ways we can tackle this problem that I can think of off the top of my head:

What are your thoughts on that, @beckernick?