rapidsai / cuml

cuML - RAPIDS Machine Learning Library
https://docs.rapids.ai/api/cuml/stable/
Apache License 2.0
4.21k stars 530 forks

[FEA] Simplify process to train cuml KMeans on GPU and save the model and later load on a CPU machine for inference #3626

Open user06039 opened 3 years ago

user06039 commented 3 years ago

I am using cuML's KMeans to fit my data on a GPU, but for inference/prediction I want to run on a CPU. Is that possible somehow? I really need a way to predict on CPU. Please help.

EDIT:

I feel like this is a useful feature for the community: training and tuning are the more resource-intensive steps, so using a GPU there makes sense, but for inference a CPU machine should do a decent job in production.

viclafargue commented 3 years ago

This code seems to be working:

import cuml
from sklearn.cluster import KMeans as skKMeans
from cuml.cluster import KMeans as cuKMeans

from sklearn.datasets import make_blobs
from numpy.testing import assert_equal

X, _ = make_blobs(n_samples=1000, n_features=10, centers=8)

# Train on the GPU with cuML.
cuModel = cuKMeans()
cuModel.fit(X)

# Copy the fitted attributes into an unfitted scikit-learn estimator,
# converting the GPU arrays to NumPy on the way out.
skModel = skKMeans()
with cuml.using_output_type("numpy"):
    skModel.labels_ = cuModel.labels_
    skModel.cluster_centers_ = cuModel.cluster_centers_
skModel._n_threads = 1  # scikit-learn's predict needs this attribute set

# Both models now produce identical cluster assignments.
assert_equal(cuModel.predict(X), skModel.predict(X))

Also see sklearn's Model persistence page
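Once the attributes have been copied over, the converted scikit-learn model can be persisted and shipped to a CPU-only machine like any other sklearn estimator. A minimal sketch using the standard-library pickle (joblib, which sklearn's persistence page also recommends, works the same way); the dataset and parameters here are made up for the example:

```python
import pickle

import numpy as np
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs

X, _ = make_blobs(n_samples=200, n_features=5, centers=3, random_state=0)

# Stand-in for the converted model: any fitted sklearn KMeans.
model = KMeans(n_clusters=3, n_init=10, random_state=0).fit(X)

# Serialize on the training machine ...
blob = pickle.dumps(model)

# ... and restore on the CPU inference machine.
restored = pickle.loads(blob)
assert np.array_equal(model.predict(X), restored.predict(X))
```

The restored estimator runs entirely on CPU; no cuML import is needed on the inference side.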

user06039 commented 3 years ago

@viclafargue Thank you, this seems to be a really interesting trick. Is there any disadvantage to doing this?

Also, why do we need to set skModel._n_threads = 1 ?

viclafargue commented 3 years ago

I don't see any disadvantage apart from the fact that this method may not work with every estimator. Note that if you are only interested in storing your trained cuML estimator, you can persist it with pickling. It will then be redeployed to the GPU when loaded, allowing faster predictions/transformations.

Also, why do we need to set skModel._n_threads = 1 ?

This is specific to scikit-learn's KMeans code. It needs to be set to avoid a crash during prediction. As I understand it, it controls the number of OpenMP threads used.
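A minimal scikit-learn-only illustration of that point: an estimator assembled by hand (never fitted) still predicts once `_n_threads` is set, because `predict` passes that attribute down to the threaded labeling kernel. The two cluster centers here are made up for the example:

```python
import numpy as np
from sklearn.cluster import KMeans

# Build a KMeans "shell" by assigning fitted attributes directly,
# mirroring the GPU-to-CPU transfer in the snippet above.
model = KMeans(n_clusters=2)
model.cluster_centers_ = np.array([[0.0, 0.0], [10.0, 10.0]])
model.labels_ = np.array([0, 1])

# Private but required: predict() reads this to size its OpenMP thread pool;
# without it, prediction raises AttributeError.
model._n_threads = 1

print(model.predict(np.array([[0.5, 0.5], [9.0, 9.5]])))  # -> [0 1]
```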

user06039 commented 3 years ago

@viclafargue Thanks for clarifying. This saved hours of re-training with scikit-learn's KMeans implementation. I think there should be a way to do this directly in cuML, since not everyone uses GPUs in their production environment for inference.

Is there a way I could turn this post into a feature request?

dantegd commented 3 years ago

@John-8704 turning it into a feature request would be very welcome

user06039 commented 3 years ago

@dantegd I have edited the post; I hope that suffices. I guess someone should change the labels attached to this post.

github-actions[bot] commented 3 years ago

This issue has been labeled inactive-30d due to no recent activity in the past 30 days. Please close this issue if no further response or action is needed. Otherwise, please respond with a comment indicating any updates or changes to the original issue and/or confirm this issue still needs to be addressed. This issue will be labeled inactive-90d if there is no activity in the next 60 days.

JohnZed commented 3 years ago

We will consider this a feature request for simplification of this process in a future release (and documenting better). Thank you for filing!

github-actions[bot] commented 2 years ago

This issue has been labeled inactive-90d due to no recent activity in the past 90 days. Please close this issue if no further response or action is needed. Otherwise, please respond with a comment indicating any updates or changes to the original issue and/or confirm this issue still needs to be addressed.