rapidsai / deployment

RAPIDS Deployment Documentation
https://docs.rapids.ai/deployment/stable/

RAPIDS 24.06 Databricks Deployment Docs Update #373

Closed jarmak-nv closed 4 months ago

jarmak-nv commented 4 months ago

cuML now uses scikit-learn 1.5 following the merge of https://github.com/rapidsai/cuml/pull/5851, which causes Databricks to fail because the newest scikit-learn version in their containers is 1.3.

We will need to update the docs to add

pip install scikit-learn --upgrade

to init.sh

Otherwise users will see an error similar to below:

    from cuml.common import logger as cuml_logger
  File "/databricks/python/lib/python3.9/site-packages/cuml/__init__.py", line 42, in <module>
    from cuml.explainer.kernel_shap import KernelExplainer
  File "/databricks/python/lib/python3.9/site-packages/cuml/explainer/__init__.py", line 17, in <module>
    from cuml.explainer.kernel_shap import KernelExplainer
  File "kernel_shap.pyx", line 28, in init cuml.explainer.kernel_shap
  File "/databricks/python/lib/python3.9/site-packages/cuml/linear_model/__init__.py", line 18, in <module>
    from cuml.linear_model.elastic_net import ElasticNet
  File "elastic_net.pyx", line 21, in init cuml.linear_model.elastic_net
  File "/databricks/python/lib/python3.9/site-packages/cuml/solvers/__init__.py", line 19, in <module>
    from cuml.solvers.qn import QN
  File "qn.pyx", line 39, in init cuml.solvers.qn
  File "/databricks/python/lib/python3.9/site-packages/cuml/metrics/__init__.py", line 45, in <module>
    from cuml.metrics.hinge_loss import hinge_loss
  File "hinge_loss.pyx", line 20, in init cuml.metrics.hinge_loss
  File "/databricks/python/lib/python3.9/site-packages/cuml/preprocessing/__init__.py", line 23, in <module>
    from cuml._thirdparty.sklearn.preprocessing import (
  File "/databricks/python/lib/python3.9/site-packages/cuml/_thirdparty/sklearn/preprocessing/__init__.py", line 6, in <module>
    from ._data import Binarizer
  File "/databricks/python/lib/python3.9/site-packages/cuml/_thirdparty/sklearn/preprocessing/_data.py", line 48, in <module>
    from sklearn.utils._indexing import resample
ModuleNotFoundError: No module named 'sklearn.utils._indexing'

jacobtomlinson commented 4 months ago

Does cuml set 1.5 as a minimum version?

In init.sh in our docs we have:

pip install --extra-index-url=https://pypi.nvidia.com \
    "cudf-cu11" \
    "cuml-cu11" \
    "dask-cudf-cu11" \
    "dask-cuda=={{rapids_version}}"

I would assume installing cuml would bump scikit-learn. Is that not the case?

jarmak-nv commented 4 months ago

Oh interesting - you're right!

scikit-learn isn't a hard dependency of cuML, but cuML now breaks on import when an older scikit-learn is installed. Looks like this is actually a cuML issue.
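For anyone wanting to confirm whether their environment is affected, here is a minimal diagnostic sketch. It assumes the failure is exactly the missing `sklearn.utils._indexing` module shown in the traceback above; `sklearn_status` is a hypothetical helper name, not part of cuML or scikit-learn.

```python
import importlib.util
from importlib import metadata


def sklearn_status():
    """Return (installed_version_or_None, has_indexing_module).

    The second value says whether the private module that the traceback
    complains about, sklearn.utils._indexing, can be located.
    """
    try:
        version = metadata.version("scikit-learn")
    except metadata.PackageNotFoundError:
        return None, False
    try:
        # find_spec imports the parent packages, so this can raise if
        # scikit-learn itself fails to import.
        spec = importlib.util.find_spec("sklearn.utils._indexing")
        has_indexing = spec is not None
    except ImportError:
        has_indexing = False
    return version, has_indexing


if __name__ == "__main__":
    version, has_indexing = sklearn_status()
    print(f"scikit-learn version: {version}")
    print(f"sklearn.utils._indexing available: {has_indexing}")
```

Running this in a notebook cell before importing cuML shows which scikit-learn version the cluster provides and whether the module is present.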

jarmak-nv commented 4 months ago

cuML now has a PR to remove the hard dependency for 24.06.

Databricks has scikit-learn 1.0.2 installed on the live container and 1.3 on the beta container. Installing cuML won't trigger an update on its own, so to ensure Databricks users get a good experience I think we should do an upgrade as part of init.sh.

That being said, maybe my initial plan of using --upgrade is worse than pinning to the same version cuML uses, i.e. pip install scikit-learn==1.5
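To make the version gap concrete, here is a rough sketch of a check that flags the versions mentioned above as older than 1.5. The helper names are hypothetical, the 1.5 threshold is simply the version quoted in this thread, and the parser only looks at numeric release segments (it ignores pre-release suffixes), so treat it as an illustration and not a full version parser.

```python
import re


def release_tuple(version, width=3):
    """Turn '1.0.2' into (1, 0, 2) and '1.3' into (1, 3, 0).

    Simplified parsing: stops at the first segment without leading digits
    and drops any suffix such as 'rc1'.
    """
    parts = []
    for piece in version.split("."):
        match = re.match(r"\d+", piece)
        if match is None:
            break
        parts.append(int(match.group()))
    parts = parts[:width]
    return tuple(parts + [0] * (width - len(parts)))


def needs_upgrade(installed, minimum="1.5"):
    # True when the installed release is older than the assumed minimum.
    return release_tuple(installed) < release_tuple(minimum)


if __name__ == "__main__":
    for installed in ("1.0.2", "1.3", "1.5.0"):
        print(installed, needs_upgrade(installed))
```

With these inputs, both versions reported for the Databricks containers are flagged, while 1.5.0 is not.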

jacobtomlinson commented 4 months ago

OK, thanks for confirming. So just to check, you are proposing we add something like the following to our docs:

pip install --extra-index-url=https://pypi.nvidia.com \
    "cudf-cu11" \
    "cuml-cu11" \
    "dask-cudf-cu11" \
    "dask-cuda=={{rapids_version}}" \
    "scikit-learn==1.5"

jarmak-nv commented 4 months ago

Yup! I figured this is the best place to do it since we already provide the init.sh. While users might technically have no problems on Databricks with the old version of scikit-learn, it's safest to upgrade it to prevent potential issues with cuML.

taureandyernv commented 4 months ago

@jarmak-nv @jacobtomlinson @aravenel this issue also affects Colab. Thanks for sharing, Ben!

jacobtomlinson commented 4 months ago

The fix in cuml means this change should no longer be needed.