Closed dantegd closed 18 hours ago
I ran the following small snippet to see things in action, but I'm now puzzled about whether cuml was actually used. Is there an easy way to tell (assume I'm a simple-minded user who isn't going to dig into the cuml codebase)?
import cuml.experimental.accel
cuml.experimental.accel.install()
from sklearn.datasets import make_blobs
from sklearn.cluster import KMeans
X, y = make_blobs()
km = KMeans()
km.fit(X, y)
print(f"{km.cluster_centers_=}")
print(km.score(X, y))
This outputs the following:
Installing cuML Accelerator...
[I] [08:33:03.570100] Non Estimator Function Dispatching disabled...
[I] [08:33:03.605120] Non Estimator Function Dispatching disabled...
[I] [08:33:03.607562] Non Estimator Function Dispatching disabled...
km.cluster_centers_=array([[ 8.51813728, 0.89449653],
[ 5.36304509, -9.09408513],
[-1.06137904, 6.52824416],
[ 7.0920223 , -1.11348216],
[ 6.98095313, -8.23207799],
[ 6.79229768, -9.76694763],
[ 0.20774067, 7.58842924],
[ 6.71965882, 1.64106257]])
-85.08620849985817
I was expecting to see either a log message saying "This was run on the GPU!" (or something similarly positive and simple) or, as an alternative, something like what I proposed in scikit-image, where we issue a DispatchNotification (via the warning system) that lets people know the code was run differently from how it would have been without dispatching enabled.
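To make the warning-based idea concrete, here is a minimal sketch of what such a notification could look like. Everything in it is an assumption for illustration: the `DispatchNotification` class is defined locally (it is not imported from scikit-image or cuml), and `notify_dispatch` and `fit` are hypothetical helpers, not part of any library.

```python
import warnings


class DispatchNotification(RuntimeWarning):
    """Hypothetical warning category: a call was routed to another backend."""


def notify_dispatch(func_name, backend):
    # stacklevel=3 attributes the warning to the user's call site, assuming
    # this helper is called directly from the dispatched function.
    warnings.warn(
        f"{func_name} was dispatched to the {backend} backend.",
        DispatchNotification,
        stacklevel=3,
    )


def fit(data):
    # Stand-in for an accelerated estimator method (not cuml code).
    notify_dispatch("fit", "gpu")
    return sum(data) / len(data)


# Users could silence the notification, or turn it into an error in tests:
# warnings.simplefilter("ignore", DispatchNotification)
# warnings.simplefilter("error", DispatchNotification)
```

Using a dedicated warning category means users can filter it with the standard `warnings` machinery instead of needing a separate logging switch.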
The second thing I thought might tell me if it was dispatched was inspecting a fitted attribute, though I guess cuml array works hard to make that hard :-/
In general I think we can fix/change most things here after people start trying it.
These are things I'd fix before:
KMeans(8)
so that we don't skip parameters by accident. Also, why does it show up as args?
For me it's fine to merge. We can always keep working on things.
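On the KMeans(8) point above, one possible way to avoid dropping positional parameters is to bind them to their names before forwarding them. This is only a sketch under assumptions: `bind_init_params` and `ToyKMeans` are hypothetical names made up for this example and do not reflect how the PR handles arguments.

```python
import inspect


def bind_init_params(cls, *args, **kwargs):
    """Map positional and keyword arguments to the constructor's parameter names."""
    # inspect.signature on a class reports its constructor signature without `self`.
    sig = inspect.signature(cls)
    # bind() raises TypeError if the arguments do not match the signature.
    bound = sig.bind(*args, **kwargs)
    return dict(bound.arguments)


class ToyKMeans:
    # Hypothetical stand-in for an estimator; not the scikit-learn or cuml class.
    def __init__(self, n_clusters=8, max_iter=300):
        self.n_clusters = n_clusters
        self.max_iter = max_iter


# A positional 8 is recorded under its parameter name rather than as bare args.
print(bind_init_params(ToyKMeans, 8))
print(bind_init_params(ToyKMeans, 5, max_iter=10))
```

With the parameters keyed by name, a proxy could forward them as keyword arguments, so nothing passed positionally is lost.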
/merge
PR adds a first version of a command line user experience that covers the following estimators: