[BUG] Received unsupported input type: <class 'dask_cudf.core.DataFrame'> when doing cuml_model.predict(X_test_dask) on Google Colab #3050

Open BioGeek opened 3 years ago

BioGeek commented 3 years ago

Describe the bug

I used the rapidsai-csp-utils to install RAPIDS and related libraries on Google Colab. I also installed dask-cuda via conda. Then I copied, pasted, and executed the code from the Random Forests Multi-node, Multi-GPU demo notebook.

In the last cell, I get the error:

TypeError                                 Traceback (most recent call last)
<ipython-input-16-79cdc37bea3f> in <module>()
      1 skl_y_pred = skl_model.predict(X_test)
----> 2 cuml_y_pred = cuml_model.predict(X_test_dask).compute().to_array()
      4 # Due to randomness in the algorithm, you may see slight variation in accuracies
      5 print("SKLearn accuracy:  ", accuracy_score(y_test, skl_y_pred))

12 frames
cuml/ensemble/randomforestclassifier.pyx in cuml.ensemble.randomforestclassifier.RandomForestClassifier._predict_get_all()

/usr/local/lib/python3.6/site-packages/cuml/common/input_utils.py in convert_dtype()
    414     else:
--> 415         raise TypeError("Received unsupported input type: %s" % type(X))
    417     return X

TypeError: Received unsupported input type: <class 'dask_cudf.core.DataFrame'>

Steps/Code to reproduce bug

Execute the cells in this Google Colab notebook. The error happens in cell 16.

Expected behavior

cuml_model predicts the labels for X_test_dask.

Environment details (please complete the following information):

viclafargue commented 3 years ago

Thank you for opening the issue. It looks like you are running an up-to-date (0.17) notebook against version 0.14 of the RAPIDS libraries. Unfortunately, this can fail because the API changes between versions.
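One way to confirm a mismatch like this is to print the installed library versions before running the notebook. The snippet below is only a minimal sketch: the helper names (`installed_version`, `major_minor`, `matches_notebook`) are made up for illustration, the 0.17 target is taken from the version mentioned above, and it assumes each package exposes a `__version__` attribute (a missing attribute is treated as "unknown").

```python
import importlib
import re

# Release the demo notebook targets (0.17, per the comment above).
NOTEBOOK_VERSION = "0.17"


def installed_version(package):
    """Return package.__version__, or None if it is missing or not importable."""
    try:
        module = importlib.import_module(package)
    except ImportError:
        return None
    return getattr(module, "__version__", None)


def major_minor(version):
    """Extract (major, minor) from a version string such as '0.14.0'."""
    match = re.match(r"(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError("Unrecognized version string: %r" % version)
    return int(match.group(1)), int(match.group(2))


def matches_notebook(installed, expected=NOTEBOOK_VERSION):
    """True only if the installed major.minor equals the notebook's."""
    if installed is None:
        return False
    return major_minor(installed) == major_minor(expected)


if __name__ == "__main__":
    # Hypothetical check for the libraries the notebook relies on.
    for package in ("cudf", "dask_cudf", "cuml"):
        found = installed_version(package)
        status = "ok" if matches_notebook(found) else "MISMATCH"
        print("%-10s installed=%s expected=%s.x -> %s"
              % (package, found, NOTEBOOK_VERSION, status))
```

If any line reports a mismatch, either install the RAPIDS release that matches the notebook or use the notebook from the branch that matches the installed release.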

viclafargue commented 3 years ago

Maybe @Salonijain27 has more information on this?

