src-d / kmcuda

Large scale K-means and K-nn implementation on NVIDIA GPU / CUDA

OOM on the module #21

Closed lanking520 closed 7 years ago

lanking520 commented 7 years ago

Hi! I tried to use this on my model and it ran out of GPU memory. The input is a 2M * 6000 matrix:

reassignments threshold: 14050
/tmp/kmcuda/src/kmcuda.cc:152 -> out of memory
failed to allocate 8430000000 bytes for *dest

Traceback (most recent call last):
  File "KMCUDA_MSDSWHReduce.py", line 66, in <module>
    cen, assign = kmtrain(X, num_clusters)
  File "KMCUDA_MSDSWHReduce.py", line 52, in kmtrain
    centroids, assignments = kmeans_cuda(X, num_clusters, verbosity=1, yinyang_t=0, seed=3)
MemoryError: Failed to allocate memory on GPU

I set yinyang_t = 0.

The GPU I use is a Tesla P100.

vmarkovtsev commented 7 years ago

The P100 has 16 GB. The input and the output (*dest) together take at least 8430000000 bytes * 2 = 16.86 GB, which is more than 16 GB. Sorry :( Consider subsampling the dataset.
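A rough sketch of the arithmetic and of the subsampling suggestion, in plain NumPy. The 8430000000 figure comes from the log above. Treating "16 gigs" as decimal gigabytes is an assumption, and `subsample_rows` is a hypothetical helper, not part of kmcuda.

```python
import numpy as np

DEST_BYTES = 8_430_000_000  # size of the failed *dest allocation, from the log
GPU_BYTES = 16 * 10**9      # "16 gigs", assumed here to mean decimal gigabytes

# Input plus output buffer only; every other allocation is ignored.
needed = 2 * DEST_BYTES
print(needed / 1e9, "GB needed vs", GPU_BYTES / 1e9, "GB on the card")


def subsample_rows(X, max_rows, seed=3):
    """Return a uniform random subset of at most max_rows rows of X."""
    if X.shape[0] <= max_rows:
        return X
    rng = np.random.default_rng(seed)
    idx = rng.choice(X.shape[0], size=max_rows, replace=False)
    idx.sort()  # keep the original row order
    return X[idx]


# Hypothetical usage, reusing the call from the traceback:
# X_small = subsample_rows(X, 440_000)
# centroids, assignments = kmeans_cuda(X_small, num_clusters, verbosity=1, yinyang_t=0, seed=3)
```

The returned assignments then cover only the sampled rows, so the remaining rows still need to be assigned to the fitted centroids separately.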

lanking520 commented 7 years ago

I see. Would two GPUs help with this algorithm?

I also tried a smaller dataset, 440k * 6000, and it ran 14 times faster than my scikit-learn version. Awesome performance!

pandasMX commented 6 years ago

Does it support batch fitting if the input is very large? @vmarkovtsev

vmarkovtsev commented 6 years ago

It doesn't. The intended use case is having a big ratio of classes to samples.
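Since batch fitting is not available, one possible workaround is to fit on a subsample and then assign the full dataset to the resulting centroids in chunks. The sketch below does the assignment step on the CPU with NumPy. It is not part of kmcuda, `assign_in_chunks` is a hypothetical helper, and it assumes plain Euclidean distance.

```python
import numpy as np


def assign_in_chunks(X, centroids, chunk_rows=10_000):
    """Index of the nearest centroid (Euclidean) for every row of X."""
    c_sq = (centroids ** 2).sum(axis=1)  # ||c||^2 for each centroid
    out = np.empty(X.shape[0], dtype=np.int64)
    for start in range(0, X.shape[0], chunk_rows):
        chunk = X[start:start + chunk_rows]
        # ||x - c||^2 = ||x||^2 - 2 x.c + ||c||^2; the ||x||^2 term is the
        # same for every centroid, so it can be dropped before the argmin.
        d = c_sq[None, :] - 2.0 * chunk @ centroids.T
        out[start:start + chunk_rows] = d.argmin(axis=1)
    return out
```

Only one chunk-by-clusters distance matrix is held in memory at a time, so `chunk_rows` can be lowered when the number of clusters is large.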