rapidsai / cuml

cuML - RAPIDS Machine Learning Library
https://docs.rapids.ai/api/cuml/stable/
Apache License 2.0

[DOC] cuml integration with sklearn-evaluation for visualization #5123

Open idomic opened 1 year ago

idomic commented 1 year ago

Report incorrect documentation

Allowing users to get out-of-the-box visualizations

Location of incorrect documentation: NA

Describe the problems or issues found in the documentation: Any plans on integrating with sklearn-evaluation? It has more capabilities (bug fixes + new features) and is well-maintained. Happy to assist!

Suggested fix for documentation: I suggest creating a getting-started guide or tutorial so users have a go-to reference when they need plotting.

beckernick commented 1 year ago

It looks like many things generally work currently:

from sklearn.datasets import make_classification
from sklearn_evaluation import plot
import cuml

# Synthetic classification data, generated on the host as NumPy arrays
X, y = make_classification(n_samples=1000)

# Fit a cuML random forest and predict on the training data
clf = cuml.ensemble.RandomForestClassifier()
clf.fit(X, y)
preds = clf.predict(X)

plot.confusion_matrix(y, preds)
# plot appears

# Elbow curve with a cuML KMeans model, reusing X from above
model = cuml.cluster.KMeans(random_state=1)
_ = plot.elbow_curve(X, model, n_clusters=(2, 3, 4, 5, 6, 7, 8))
# plot appears
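
One caveat worth sketching: the snippets above pass NumPy arrays, so the predictions come back as NumPy arrays as well. With GPU inputs (for example CuPy arrays or cuDF objects), cuML returns GPU outputs by default, while matplotlib-based plotting code generally expects host data. Below is a minimal conversion sketch. `to_host` is a hypothetical helper name, not part of cuML or sklearn-evaluation, and it is duck-typed so it runs without CuPy or cuDF installed.

```python
def to_host(array_like):
    """Return a host (CPU) copy of a GPU array, or the input unchanged.

    Hypothetical helper for illustration; not a cuML or sklearn-evaluation API.
    """
    # cuDF Series/DataFrame (and pandas objects) expose to_numpy()
    if hasattr(array_like, "to_numpy"):
        return array_like.to_numpy()
    # CuPy arrays advertise the CUDA array interface and expose get()
    if hasattr(array_like, "__cuda_array_interface__") and hasattr(array_like, "get"):
        return array_like.get()
    # Already host data (NumPy array, list, ...): pass through untouched
    return array_like


# Possible usage with the example above (assumes preds may live on the GPU):
# plot.confusion_matrix(to_host(y), to_host(preds))
```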

I suspect the feature importances table for tree models won't work until #3361 is implemented.
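
A quick way to check this before plotting, as a sketch only: it assumes the gap is that the fitted estimator does not expose a scikit-learn-style `feature_importances_` attribute, which is my reading of the comment above rather than something confirmed in this thread. `has_feature_importances` is a hypothetical helper name.

```python
def has_feature_importances(model):
    """Return True if a fitted model exposes scikit-learn-style feature_importances_.

    Hypothetical helper for illustration; it only inspects the attribute.
    """
    return getattr(model, "feature_importances_", None) is not None


# Possible usage with the random forest from the example above:
# if has_feature_importances(clf):
#     plot.feature_importances(clf)
# else:
#     print("feature_importances_ is not available on this estimator")
```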

Perhaps highlighting this integration would make sense for a blog post. Would you be interested in exploring what currently works?

idomic commented 1 year ago

> I suspect the feature importances table for tree models won't work until https://github.com/rapidsai/cuml/issues/3361 is implemented.

I can give it a try, just to be sure.

> Perhaps highlighting this integration would make sense for a blog post. Would you be interested in exploring what currently works?

Probably a blog post. We could also put together a tutorial or quick-start examples; I'm open to either. Do you have anything you'd like to prioritize?

viclafargue commented 1 year ago

cc @exactlyallan

idomic commented 1 year ago

@beckernick I've started working on setting up cuml locally. Since I have an M2 Mac, I assume it won't work? I did find this Colab notebook, rapids-conda-colab-template.ipynb, but it also seems to crash. What's the easiest way to set up my environment to test this PR?

beckernick commented 1 year ago

Yes, that's correct. The easiest way to get up and running if you don't have an NVIDIA GPU locally is to spin up a notebook on Colab, select the GPU runtime, and then pip install cuML.
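
After selecting the GPU runtime and installing, a short sanity check can confirm the environment before trying the plots. This is a minimal sketch using only the Python standard library. `check_rapids_environment` is a hypothetical helper name, and the exact pip command for cuML depends on the CUDA version, so it is left to the RAPIDS install guide rather than guessed here.

```python
import importlib.util
import shutil


def check_rapids_environment():
    """Report whether a GPU driver tool and the needed packages are visible.

    Hypothetical helper for illustration; it imports nothing heavy and
    only looks things up, so it is safe to run on a machine without a GPU.
    """
    return {
        # nvidia-smi ships with the NVIDIA driver; absent on CPU-only runtimes
        "nvidia_smi_on_path": shutil.which("nvidia-smi") is not None,
        # find_spec returns None when a top-level package is not installed
        "cuml_installed": importlib.util.find_spec("cuml") is not None,
        "sklearn_evaluation_installed": importlib.util.find_spec("sklearn_evaluation") is not None,
    }


print(check_rapids_environment())
```

If any value is False, the examples earlier in this thread will not run in that session.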