Hello! Thanks for your code, which is really helpful.
I have run into a problem: execution fails at line 296 of kmcuda.cc, 'assert(dist_sum == dist_sum);', because dist_sum is NaN, and I don't know why.
My sample_size is 1.5M and my feature_size is 8. When I set the cluster_size to 1000, it works fine, but when I set it to 1500 or 2000, it fails.
Is it because of this? Lloyd is tolerant to samples with NaN features while Yinyang is not. It may happen that some of the resulting clusters contain zero elements. In such cases, their features are set to NaN.
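To make the suspicion concrete, here is a minimal toy sketch in plain Python. It is not kmcuda's actual code: the helpers `centroid` and `sq_dist` and the tiny data set are hypothetical, and I am only assuming that an empty cluster ends up with NaN features as described above. It shows how a single NaN centroid would be enough to turn the distance sum into NaN and make a self-comparison check fail.

```python
import math

def centroid(points, dim):
    """Mean of a list of points (hypothetical helper, not kmcuda code)."""
    if not points:
        # Assumption: an empty cluster gets NaN in every feature.
        return [math.nan] * dim
    n = len(points)
    return [sum(p[i] for p in points) / n for i in range(dim)]

def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

samples = [[0.0, 0.0], [1.0, 1.0]]
# The second cluster has no members, so its centroid is all NaN.
centroids = [centroid(samples, 2), centroid([], 2)]

# Any arithmetic that touches the NaN centroid yields NaN,
# and NaN then propagates through the whole sum.
dist_sum = sum(sq_dist(s, c) for s in samples for c in centroids)

# NaN is the only float value that is not equal to itself, so this is
# the same kind of test as assert(dist_sum == dist_sum).
print(dist_sum == dist_sum)  # prints False
```

If this is what happens, it would also fit what I see: with more clusters requested, it seems more likely that at least one of them ends up empty.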
My GPU is a Tesla P100, and it has enough memory.
Thanks for your attention!