src-d / kmcuda

Large scale K-means and K-nn implementation on NVIDIA GPU / CUDA

internal bug inside kmeans_init_centroids: dist_sum is NaN #36

Open zhoujz10 opened 6 years ago

zhoujz10 commented 6 years ago

Hello! Thanks for your code, which is really helpful.

I have run into a problem: execution fails at line 296 of kmcuda.cc, 'assert(dist_sum == dist_sum);', with the message 'dist_sum is NaN'. I don't know why.
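
For context, a check of the form 'dist_sum == dist_sum' is a common NaN test: under IEEE 754, NaN is the only floating-point value that does not compare equal to itself. The sketch below illustrates the idiom in Python. It is not kmcuda code, and the helper name is_nan_by_self_compare is made up for the illustration. It also shows that a single NaN term is enough to turn a whole sum into NaN.

```python
import math


def is_nan_by_self_compare(x: float) -> bool:
    """Hypothetical helper: NaN is the only float for which x != x holds."""
    return x != x


# One NaN term poisons the whole accumulated sum.
distances = [0.5, float("nan"), 2.0]
dist_sum = sum(distances)

print(is_nan_by_self_compare(dist_sum))  # True
print(math.isnan(dist_sum))              # True, the explicit equivalent
print(is_nan_by_self_compare(2.5))       # False
```

So the assertion only reports that some value feeding dist_sum was already NaN; it does not say where that NaN came from.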

My sample_size is 1.5M and my feature_size is 8. When I set cluster_size to 1000, it works fine, but when I set it to 1500 or 2000, it fails.

Could this be the cause? "Lloyd is tolerant to samples with NaN features while Yinyang is not. It may happen that some of the resulting clusters contain zero elements. In such cases, their features are set to NaN."
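
One way to rule out NaN features in the input is to scan the sample matrix before clustering. The following is a minimal sketch using NumPy only. It assumes the samples are held in a 2-D float32 array of shape (sample_size, feature_size); the function name count_nonfinite_rows and the synthetic data are made up for the illustration and are not part of kmcuda.

```python
import numpy as np


def count_nonfinite_rows(samples: np.ndarray) -> int:
    """Return the number of rows that contain at least one NaN or Inf."""
    bad_rows = (~np.isfinite(samples)).any(axis=1)
    return int(bad_rows.sum())


# Synthetic stand-in for the real data: 1000 samples with 8 features each.
rng = np.random.default_rng(0)
samples = rng.random((1000, 8), dtype=np.float32)

# Deliberately corrupt two rows to show what the check reports.
samples[3, 2] = np.nan
samples[7, 5] = np.inf

print(count_nonfinite_rows(samples))  # 2
```

If the count is zero on the real data, the input itself is clean and any NaN must arise later, during initialization or iteration.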

My GPU is a Tesla P100, and it has enough memory.

Thanks for your attention!