src-d / kmcuda

Large scale K-means and K-nn implementation on NVIDIA GPU / CUDA

Building #117

Closed alvarozamora closed 3 years ago

alvarozamora commented 3 years ago

Hi,

The build seems to succeed via cmake -DCMAKE_BUILD_TYPE=Release -DCUDA_ARCH=70 -DDISABLE_R=y . && make but I've run into two issues that suggest otherwise. Firstly, when I run test.py, it seems to get stuck after what I think is the first test:

>>> python test.py

arguments: 1 0x7ffdac27d940 0.010 0.10 1 1000 256 10 3 0 0 3 0x55d0af0577d0 0x55d0aed8f320 0x55d0aed91b30 (nil)
reassignments threshold: 10
yinyang groups: 1
[0] *dest: 0x7f55d5c00000 - 0x7f55d5cfa000 (1024000)
[0] device_centroids: 0x7f55d5cfa000 - 0x7f55d5cfc800 (10240)
[0] device_assignments: 0x7f55d5cfc800 - 0x7f55d5cfd7a0 (4000)
[0] device_assignments_prev: 0x7f55d5cfd800 - 0x7f55d5cfe7a0 (4000)
[0] device_ccounts: 0x7f55d5cfe800 - 0x7f55d5cfe828 (40)
[0] device_assignments_yy: 0x7f55d5cfea00 - 0x7f55d5cfea28 (40)
[0] device_bounds_yy: 0x7f55d5cfec00 - 0x7f55d5d00b40 (8000)
[0] device_drifts_yy: 0x7f55d5d00c00 - 0x7f55d5d03428 (10280)
[0] device_passed_yy: 0x7f55d5d03600 - 0x7f55d5d045a0 (4000)
reusing passed_yy for centroids_yy
GPU #0 memory: used 185466880 bytes (1.6%), free 11369250816 bytes, total 11554717696 bytes
GPU #0 has 49152 bytes of shared memory per block
transposing the samples...
transpose <<<(32, 8), (8, 32)>>> 1000, 256, xyswap
performing kmeans++...
kmeans++: dump 1000 256 0x55d0af994d70
kmeans++: dev #0: 0x7f55d5c00000 0x7f55d5cfa000 0x7f55d5cfc800
step 1[0] dev_dists: 0x7f55d5d04600 - 0x7f55d5d04640 (64)
step 2[0] dev_dists: 0x7f55d5d04600 - 0x7f55d5d04640 (64)
step 3[0] dev_dists: 0x7f55d5d04600 - 0x7f55d5d04640 (64)
step 4[0] dev_dists: 0x7f55d5d04600 - 0x7f55d5d04640 (64)
step 5[0] dev_dists: 0x7f55d5d04600 - 0x7f55d5d04640 (64)
step 6[0] dev_dists: 0x7f55d5d04600 - 0x7f55d5d04640 (64)
step 7[0] dev_dists: 0x7f55d5d04600 - 0x7f55d5d04640 (64)
step 8[0] dev_dists: 0x7f55d5d04600 - 0x7f55d5d04640 (64)
step 9[0] dev_dists: 0x7f55d5d04600 - 0x7f55d5d04640 (64)
done
running Lloyd until reassignments drop below 110
plans: [(0, 1000)]
planc: [(0, 10)]
iteration 1: 1000 reassignments
iteration 2: 142 reassignments
iteration 3: 40 reassignments
transposing the samples...
transpose <<<(8, 1), (32, 8)>>> 10, 256
performing kmeans++...
kmeans++: dump 10 256 0x55d0af8d43f0
kmeans++: dev #0: 0x7f55d5cfa000 0x7f55d5d03600 0x7f55d5d04574
done
plans: [(0, 10)]
planc: [(0, 1)]
iteration 1: 10 reassignments
iteration 2: 0 reassignments
transposing the samples...
transpose <<<(8, 1), (8, 32)>>> 256, 10, xyswap
plans: [(0, 1000)]
planc: [(0, 10)]
plang: [(0, 1)]
refreshing Yinyang bounds...
iteration 4: 27 reassignments
passed number: 1000
refreshing Yinyang bounds...
iteration 5: 11 reassignments
passed number: 1000
refreshing Yinyang bounds...
iteration 6: 11 reassignments
passed number: 1000
refreshing Yinyang bounds...
iteration 7: 7 reassignments
return kmcudaSuccess

.

Secondly, when I run the kNN Python test on the main GitHub page, I get all-zero results:

>>> python kNNtest.py

reassignments threshold: 100
transposing the samples...
performing kmeans++...
done
too few clusters for this yinyang_t => Lloyd
iteration 1: 10000 reassignments
iteration 2: 926 reassignments
iteration 3: 416 reassignments
iteration 4: 187 reassignments
iteration 5: 87 reassignments
transposing the samples...
initializing the inverse assignments...
calculating the cluster radiuses...
calculating the centroid distance matrix...
searching for the nearest neighbors...
calculated 0.000000 of all the distances
[0 0 0 0 0 0 0 0 0 0]
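
For comparison, a CPU brute-force check could show what a plausible neighbor list looks like for data of this kind. This is only a minimal sketch, assuming NumPy and the L2 metric; `knn_brute_force` is a hypothetical helper, not part of kmcuda, and the random samples stand in for the real data:

```python
import numpy as np

def knn_brute_force(samples, k):
    """Hypothetical helper: indices of the k nearest L2 neighbors of every
    sample, excluding the sample itself, sorted by increasing distance."""
    # Squared distances via ||a - b||^2 = ||a||^2 + ||b||^2 - 2 a.b
    sq = np.einsum("ij,ij->i", samples, samples)
    d2 = sq[:, None] + sq[None, :] - 2.0 * (samples @ samples.T)
    # A sample must not be reported as its own neighbor.
    np.fill_diagonal(d2, np.inf)
    # Pick the k smallest distances per row (unordered), then order them.
    idx = np.argpartition(d2, k, axis=1)[:, :k]
    order = np.argsort(np.take_along_axis(d2, idx, axis=1), axis=1)
    return np.take_along_axis(idx, order, axis=1).astype(np.uint32)

# Stand-in data; replace with the samples passed to the GPU run.
rng = np.random.default_rng(0)
samples = rng.random((1000, 2), dtype=np.float32)
neighbors = knn_brute_force(samples, 10)
print(neighbors[0])
```

A healthy result has ten distinct indices per row, none equal to the row's own index, so an all-zero row like the one above would not pass this check. The full distance matrix uses O(n^2) memory, so this only suits small inputs.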

Any suggestions?