src-d / kmcuda

Large scale K-means and K-nn implementation on NVIDIA GPU / CUDA

ValueError: Invalid arguments were passed to kmeans_cuda for metric="cos" #88

Open mahfuzmohammad opened 4 years ago

mahfuzmohammad commented 4 years ago

Hi, I am trying to run the following example:

    import numpy
    from matplotlib import pyplot
    from libKMCUDA import kmeans_cuda

    numpy.random.seed(0)
    arr = numpy.empty((10000, 2), dtype=numpy.float32)
    angs = numpy.random.rand(10000) * 2 * numpy.pi
    for i in range(10000):
        arr[i] = numpy.sin(angs[i]), numpy.cos(angs[i])
    centroids, assignments, avg_distance = kmeans_cuda(
        arr, 4, metric="cos", verbosity=1, seed=3, average_distance=True)
    print("Average distance between centroids and members:", avg_distance)
    print(centroids)
    pyplot.scatter(arr[:, 0], arr[:, 1], c=assignments)
    pyplot.scatter(centroids[:, 0], centroids[:, 1], c="white", s=150)

But I am getting the error: ValueError: Invalid arguments were passed to kmeans_cuda

randomrain101 commented 4 years ago

Apparently the cos metric only works for the np.float16 dtype, not np.float32? (Note: when using np.float16, the number of features has to be even, as discussed in #86.)
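For anyone who wants to try the float16 suggestion above, here is a minimal sketch of preparing the input. It assumes the two constraints from this comment (float16 dtype, even number of features) and has not been checked against kmcuda itself. The helper name `to_float16_even_features` is hypothetical. Padding with a zero column is used because it leaves dot products and L2 norms, and therefore cosine distances, unchanged.

```python
import numpy as np


def to_float16_even_features(x):
    """Cast a 2-D sample matrix to float16 and pad to an even feature count.

    Hypothetical helper: the float16 / even-width requirements are taken
    from this thread, not verified against the library.
    """
    x = np.asarray(x)
    if x.ndim != 2:
        raise ValueError("expected a 2-D array of shape (samples, features)")
    if x.shape[1] % 2 == 1:
        # A zero column does not change dot products or L2 norms.
        x = np.hstack([x, np.zeros((x.shape[0], 1), dtype=x.dtype)])
    # C-contiguous float16 copy, ready to hand to a native extension.
    return np.ascontiguousarray(x, dtype=np.float16)


samples = np.ones((5, 3), dtype=np.float32)
half = to_float16_even_features(samples)
print(half.dtype, half.shape)  # float16 (5, 4)
```

If the padded array is used for clustering, the returned centroids would carry the extra column as well; it can be dropped afterwards.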

davidlainesv commented 1 year ago

In my case, I was able to use metric="cos" after doing

    import numpy as np
    from sklearn.preprocessing import normalize

    X = normalize(data.values, norm='l2')

I suggest running the script in a terminal, because Jupyter hides part of the error message.
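The L2 normalization step mentioned above can also be done with NumPy alone, which makes it easy to confirm that every row has unit norm before calling kmeans_cuda. This is only a sketch: the helper name `l2_normalize_rows` and the random test data are made up for illustration, and whether normalization alone resolves the ValueError on a given setup is an assumption based on this comment.

```python
import numpy as np


def l2_normalize_rows(x, eps=1e-12):
    """Scale each row to unit L2 norm (hypothetical helper).

    Rows with zero norm are left as zeros instead of producing NaNs.
    """
    x = np.asarray(x, dtype=np.float32)
    norms = np.linalg.norm(x, axis=1, keepdims=True)
    return x / np.maximum(norms, eps)


# Made-up data standing in for the real samples.
rng = np.random.default_rng(0)
data = rng.normal(size=(1000, 4)).astype(np.float32)

unit = l2_normalize_rows(data)
row_norms = np.linalg.norm(unit, axis=1)
print("min/max row norm:", row_norms.min(), row_norms.max())

# The normalized array would then be passed on, for example:
# kmeans_cuda(unit, 4, metric="cos", verbosity=1, seed=3)
# (requires libKMCUDA and a CUDA-capable GPU, so it is left commented out)
```

Checking `row_norms` first is a cheap way to rule out unnormalized input before looking at other causes such as dtype.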