src-d / kmcuda

Large scale K-means and K-nn implementation on NVIDIA GPU / CUDA

test.py fails #5

Closed sharthZ23 closed 7 years ago

sharthZ23 commented 7 years ago

Running python3 test.py produces this output and traceback:

E
E
Earguments: 1 0.010 0.10 1 1000 256 10 3 0 0 3 0x7f50873bd010 0x3305110 0x3307920
reassignments threshold: 10
yinyang groups: 1
[0] device_samples: 0x4214380000 - 0x421447a000 (1024000)
[1] device_samples: 0x4214480000 - 0x421457a000 (1024000)
[2] device_samples: 0x4214580000 - 0x421467a000 (1024000)
[3] device_samples: 0x4214680000 - 0x421477a000 (1024000)
[0] device_centroids: 0x4214780000 - 0x4214782800 (10240)
[1] device_centroids: 0x4214880000 - 0x4214882800 (10240)
[2] device_centroids: 0x4214980000 - 0x4214982800 (10240)
[3] device_centroids: 0x4214a80000 - 0x4214a82800 (10240)
[0] device_assignments: 0x4214b80000 - 0x4214b80fa0 (4000)
[1] device_assignments: 0x4214c80000 - 0x4214c80fa0 (4000)
[2] device_assignments: 0x4214d80000 - 0x4214d80fa0 (4000)
[3] device_assignments: 0x4214e80000 - 0x4214e80fa0 (4000)
[0] device_assignments_prev: 0x4214b81000 - 0x4214b81fa0 (4000)
[1] device_assignments_prev: 0x4214c81000 - 0x4214c81fa0 (4000)
[2] device_assignments_prev: 0x4214d81000 - 0x4214d81fa0 (4000)
[3] device_assignments_prev: 0x4214e81000 - 0x4214e81fa0 (4000)
[0] device_ccounts: 0x4214f80000 - 0x4214f80028 (40)
[1] device_ccounts: 0x4215080000 - 0x4215080028 (40)
[2] device_ccounts: 0x4215180000 - 0x4215180028 (40)
[3] device_ccounts: 0x4215280000 - 0x4215280028 (40)
[0] device_assignments_yy: 0x4214f80200 - 0x4214f80228 (40)
[1] device_assignments_yy: 0x4215080200 - 0x4215080228 (40)
[2] device_assignments_yy: 0x4215180200 - 0x4215180228 (40)
[3] device_assignments_yy: 0x4215280200 - 0x4215280228 (40)
[0] device_bounds_yy: 0x4214b82000 - 0x4214b827d0 (2000)
[1] device_bounds_yy: 0x4214c82000 - 0x4214c827d0 (2000)
[2] device_bounds_yy: 0x4214d82000 - 0x4214d827d0 (2000)
[3] device_bounds_yy: 0x4214e82000 - 0x4214e827d0 (2000)
[0] device_drifts_yy: 0x4214782800 - 0x4214785028 (10280)
[1] device_drifts_yy: 0x4214882800 - 0x4214885028 (10280)
[2] device_drifts_yy: 0x4214982800 - 0x4214985028 (10280)
[3] device_drifts_yy: 0x4214a82800 - 0x4214a85028 (10280)
[0] device_passed_yy: 0x4214f80400 - 0x4214f807e8 (1000)
[1] device_passed_yy: 0x4215080400 - 0x42150807e8 (1000)
[2] device_passed_yy: 0x4215180400 - 0x42151807e8 (1000)
[3] device_passed_yy: 0x4215280400 - 0x42152807e8 (1000)
[0] device_centroids_yy: 0x4214f80800 - 0x4214f80c00 (1024)
[1] device_centroids_yy: 0x4215080800 - 0x4215080c00 (1024)
[2] device_centroids_yy: 0x4215180800 - 0x4215180c00 (1024)
[3] device_centroids_yy: 0x4215280800 - 0x4215280c00 (1024)
GPU #0 memory: used 281280512 bytes (2.3%), free 11714297856 bytes, total 11995578368 bytes
GPU #1 memory: used 70647808 bytes (0.6%), free 11924930560 bytes, total 11995578368 bytes
GPU #2 memory: used 70647808 bytes (0.6%), free 11924930560 bytes, total 11995578368 bytes
GPU #3 memory: used 72744960 bytes (0.6%), free 11922833408 bytes, total 11995578368 bytes
cudaMemcpyToSymbol(d_samples_size, &h_samples_size, sizeof(h_samples_size))
/home/mgr/projects/kmcuda/kernel.cu:640 -> invalid device symbol
kmeans_cuda_setup failed: invalid device symbol
arguments: 1 0.010 0.10 1 10000 2 4 3 1 0 2 0x44ca270 0x44c84d0 0x44f1390
reassignments threshold: 100
yinyang groups: 0
GPU #0 memory: used 280231936 bytes (2.3%), free 11715346432 bytes, total 11995578368 bytes
cudaMemcpyToSymbol(d_samples_size, &h_samples_size, sizeof(h_samples_size))
/home/blabla/projects/kmcuda/kernel.cu:640 -> invalid device symbol
kmeans_cuda_setup failed: invalid device symbol
arguments: 1 0.010 0.10 1 10000 1 4 3 1 1 2 0x44ca010 0x2f5b7b0 0x44e74f0
reassignments threshold: 100
yinyang groups: 0
GPU #0 memory: used 279183360 bytes (2.3%), free 11716395008 bytes, total 11995578368 bytes
cudaMemcpyToSymbol(d_samples_size, &h_samples_size, sizeof(h_samples_size))
/home/blabla/projects/kmcuda/kernel.cu:640 -> invalid device symbol
kmeans_cuda_setup failed: invalid device symbol
arguments: 1 0.050 0.00 0 13000 1 50 3 1 1 2 0x44ca010 0x32e77c0 0x44d6b40
reassignments threshold: 650
yinyang groups: 0
GPU #0 memory: used 279183360 bytes (2.3%), free 11716395008 bytes, total 11995578368 bytes
cudaMemcpyToSymbol(d_samples_size, &h_samples_size, sizeof(h_
EE
E
E
E
E
E
E
Esamples_size))
/home/blabla/projects/kmcuda/kernel.cu:640 -> invalid device symbol
kmeans_cuda_setup failed: invalid device symbol
arguments: 1 1.000 0.00 0 13000 2 50 3 1 0 2 0x32e78b0 0x3305df0 0x44ca010
reassignments threshold: 13000
yinyang groups: 0
GPU #0 memory: used 280231936 bytes (2.3%), free 11715346432 bytes, total 11995578368 bytes
cudaMemcpyToSymbol(d_samples_size, &h_samples_size, sizeof(h_samples_size))
/home/blabla/projects/kmcuda/kernel.cu:640 -> invalid device symbol
kmeans_cuda_setup failed: invalid device symbol
arguments: 1 0.010 0.10 0 13000 1 50 3 1 1 2 0x44ca010 0x32e77c0 0x44d6b40
reassignments threshold: 130
yinyang groups: 5
reusing passed_yy for centroids_yy
GPU #0 memory: used 280231936 bytes (2.3%), free 11715346432 bytes, total 11995578368 bytes
cudaMemcpyToSymbol(d_samples_size, &h_samples_size, sizeof(h_samples_size))
/home/blabla/projects/kmcuda/kernel.cu:640 -> invalid device symbol
kmeans_cuda_setup failed: invalid device symbol
arguments: 0 0.050 0.00 0 13000 1 50 3 1 1 2 0x44ca010 0x32e77c0 0x44d6b40
reassignments threshold: 650
yinyang groups: 0
GPU #0 memory: used 279183360 bytes (2.3%), free 11716395008 bytes, total 11995578368 bytes
cudaMemcpyToSymbol(d_samples_size, &h_samples_size, sizeof(h_samples_size))
/home/blabla/projects/kmcuda/kernel.cu:640 -> invalid device symbol
kmeans_cuda_setup failed: invalid device symbol
arguments: 0 0.250 0.00 0 13000 2 50 3 1 0 2 0x32e78b0 0x33078d0 0x44ca010
reassignments threshold: 3250
yinyang groups: 0
GPU #0 memory: used 280231936 bytes (2.3%), free 11715346432 bytes, total 11995578368 bytes
cudaMemcpyToSymbol(d_samples_size, &h_samples_size, sizeof(h_samples_size))
/home/blabla/projects/kmcuda/kernel.cu:640 -> invalid device symbol
kmeans_cuda_setup failed: invalid device symbol
arguments: 1 0.050 0.00 0 13000 2 50 3 1 0 2 0x32e78b0 0x44c0c80 0x44ca010
reassignments threshold: 650
yinyang groups: 0
GPU #0 memory: used 280231936 bytes (2.3%), free 11715346432 bytes, total 11995578368 bytes
cudaMemcpyToSymbol(d_samples_size, &h_samples_size, sizeof(h_samples_size))
/home/blabla/projects/kmcuda/kernel.cu:640 -> invalid device symbol
kmeans_cuda_setup failed: invalid device symbol
arguments: 1 0.050 0.00 0 13000 2 50 3 3 0 2 0x32e78b0 0x44c0c80 0x44ca010
reassignments threshold: 650
yinyang groups: 0
p2p is already enabled on gpu #0
p2p is already enabled on gpu #1
GPU #0 memory: used 280231936 bytes (2.3%), free 11715346432 bytes, total 11995578368 bytes
GPU #1 memory: used 69599232 bytes (0.6%), free 11925979136 bytes, total 11995578368 bytes
cudaMemcpyToSymbol(d_samples_size, &h_samples_size, sizeof(h_samples_size))
/home/blabla/projects/kmcuda/kernel.cu:640 -> invalid device symbol
kmeans_cuda_setup failed: invalid device symbol
arguments: 1 0.010 0.10 0 13000 2 50 3 1 0 2 0x32e78b0 0x44c1c20 0x44ca010
reassignments threshold: 130
yinyang groups: 5
reusing passed_yy for centroids_yy
GPU #0 memory: used 281280512 bytes (2.3%), free 11714297856 bytes, total 11995578368 bytes
cudaMemcpyToSymbol(d_samples_size, &h_samples_size, sizeof(h_samples_size))
/home/blabla/projects/kmcuda/kernel.cu:640 -> invalid device symbol
kmeans_cuda_setup failed: invalid device symbol
arguments: 0 0.050 0.00 0 13000 2 50 3 1 0 2 0x32e78b0 0x44c1e00 0x44ca010
reassignments threshold: 650
yinyang groups: 0
GPU #0 memory: used 280231936 bytes (2.3%), free 11715346432 bytes, total 11995578368 bytes
cudaMemcpyToSymbol(d_samples_size, &h_samples_size, sizeof(h_samples_size))
/home/blabla/projects/kmcuda/kernel.cu:640 -> invalid device symbol
kmeans_cuda_setup failed: invalid device symbol
arguments: 0 0.050 0.00 0 13000 2 50 3 3 0 2 0x32e78b0 0x44c14b0 0x44ca010
reassignments threshold: 650
yinyang groups: 0
p2p is already enabled on gpu #0
p2p is already enabled on gpu #1
GPU #0 memory: used 280231936 bytes (2.3%), free 11715346432 bytes, total 11995578368 bytes
GPU #1 memory: used 69599232 bytes (0.6%), free 11925979136 bytes, total 11995578368 bytes
cudaMemcpyToSymbol(d_samples_size, &h_samples_size, sizeof(h_samples_size))
/home/blabla/projects/kmcuda/kernel.cu:640 -> invalid device symbol
km
E.
E
E
E
Eeans_cuda_setup failed: invalid device symbol
arguments: 0 0.050 0.00 0 13000 2 50 3 65535 0 2 0x32e78b0 0x3306680 0x44ca010
arguments: 0 0.050 0.00 0 13000 2 50 3 0 0 2 0x32e78b0 0x44c97e0 0x44ca010
reassignments threshold: 650
yinyang groups: 0
p2p is already enabled on gpu #0
p2p is already enabled on gpu #0
p2p is already enabled on gpu #0
p2p is already enabled on gpu #1
p2p is already enabled on gpu #1
p2p is already enabled on gpu #1
p2p is already enabled on gpu #2
p2p is already enabled on gpu #2
p2p is already enabled on gpu #2
p2p is already enabled on gpu #3
p2p is already enabled on gpu #3
p2p is already enabled on gpu #3
GPU #0 memory: used 280231936 bytes (2.3%), free 11715346432 bytes, total 11995578368 bytes
GPU #1 memory: used 69599232 bytes (0.6%), free 11925979136 bytes, total 11995578368 bytes
GPU #2 memory: used 69599232 bytes (0.6%), free 11925979136 bytes, total 11995578368 bytes
GPU #3 memory: used 71696384 bytes (0.6%), free 11923881984 bytes, total 11995578368 bytes
cudaMemcpyToSymbol(d_samples_size, &h_samples_size, sizeof(h_samples_size))
/home/blabla/projects/kmcuda/kernel.cu:640 -> invalid device symbol
kmeans_cuda_setup failed: invalid device symbol
arguments: 0 0.050 0.00 0 13000 2 50 3 2 0 2 0x4228700000 0x4228800000 0x4228900000
reassignments threshold: 650
yinyang groups: 0
p2p is already enabled on gpu #1
p2p is already enabled on gpu #0
GPU #1 memory: used 136052736 bytes (1.1%), free 11859525632 bytes, total 11995578368 bytes
cudaMemcpyToSymbol(d_samples_size, &h_samples_size, sizeof(h_samples_size))
/home/blabla/projects/kmcuda/kernel.cu:640 -> invalid device symbol
kmeans_cuda_setup failed: invalid device symbol
arguments: 0 0.050 0.00 0 13000 2 50 3 0 0 2 0x32e78b0 0x569ac60 0x45168d0
reassignments threshold: 650
yinyang groups: 0
p2p is already enabled on gpu #0
p2p is already enabled on gpu #0
p2p is already enabled on gpu #0
p2p is already enabled on gpu #1
p2p is already enabled on gpu #1
p2p is already enabled on gpu #1
p2p is already enabled on gpu #2
p2p is already enabled on gpu #2
p2p is already enabled on gpu #2
p2p is already enabled on gpu #3
p2p is already enabled on gpu #3
p2p is already enabled on gpu #3
GPU #0 memory: used 280231936 bytes (2.3%), free 11715346432 bytes, total 11995578368 bytes
GPU #1 memory: used 71696384 bytes (0.6%), free 11923881984 bytes, total 11995578368 bytes
GPU #2 memory: used 71696384 bytes (0.6%), free 11923881984 bytes, total 11995578368 bytes
GPU #3 memory: used 71696384 bytes (0.6%), free 11923881984 bytes, total 11995578368 bytes
cudaMemcpyToSymbol(d_samples_size, &h_samples_size, sizeof(h_samples_size))
/home/blabla/projects/kmcuda/kernel.cu:640 -> invalid device symbol
kmeans_cuda_setup failed: invalid device symbol
arguments: 0 0.050 0.00 0 13000 2 50 3 1 0 2 0x4228700000 0x4228800200 0x422890cc00
reassignments threshold: 650
yinyang groups: 0
GPU #0 memory: used 344588288 bytes (2.9%), free 11650990080 bytes, total 11995578368 bytes
cudaMemcpyToSymbol(d_samples_size, &h_samples_size, sizeof(h_samples_size))
/home/blabla/projects/kmcuda/kernel.cu:640 -> invalid device symbol
kmeans_cuda_setup failed: invalid device symbol
arguments: 0 0.050 0.00 0 13000 2 50 3 0 0 2 0x4228700000 0x4228800400 0x4228919800
reassignments threshold: 650
yinyang groups: 0
p2p is already enabled on gpu #0
p2p is already enabled on gpu #0
p2p is already enabled on gpu #0
p2p is already enabled on gpu #1
p2p is already enabled on gpu #1
p2p is already enabled on gpu #1
p2p is already enabled on gpu #1
p2p is already enabled on gpu #2
p2p is already enabled on gpu #2
p2p is already enabled on gpu #2
p2p is already enabled on gpu #2
p2p is already enabled on gpu #3
p2p is already enabled on gpu #3
p2p is already enabled on gpu #3
p2p is already enabled on gpu #3
p2p is already enabled on gpu #0
p2p is already enabled on gpu #0
p2p is already enabled on gpu #0
GPU #0 memory: used 344588288 bytes (2.9%), free 11650990080 bytes, total 11995578368 bytes
GPU #1 memory: used 136052736 bytes (1.1%), free 11859525632 bytes, total 11995578368 bytes
GPU #2 memory: used 13605273
E
======================================================================
ERROR: test_256_features (__main__.KMCUDATests)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "test.py", line 369, in test_256_features
    yinyang_t=0.1, seed=3)
RuntimeError: cudaMemcpy failed

======================================================================
ERROR: test_cosine_metric (__main__.KMCUDATests)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "test.py", line 347, in test_cosine_metric
    seed=3)
RuntimeError: cudaMemcpy failed

======================================================================
ERROR: test_fp16_cosine_metric (__main__.KMCUDATests)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "test.py", line 438, in test_fp16_cosine_metric
    seed=3)
RuntimeError: cudaMemcpy failed

======================================================================
ERROR: test_fp16_kmeanspp_lloyd (__main__.KMCUDATests)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "test.py", line 397, in test_fp16_kmeanspp_lloyd
    verbosity=2, seed=3, tolerance=0.05, yinyang_t=0)
RuntimeError: cudaMemcpy failed

======================================================================
ERROR: test_fp16_kmeanspp_validate (__main__.KMCUDATests)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "test.py", line 407, in test_fp16_kmeanspp_validate
    verbosity=2, seed=3, tolerance=1.0, yinyang_t=0)
RuntimeError: cudaMemcpy failed

======================================================================
ERROR: test_fp16_kmeanspp_yinyang (__main__.KMCUDATests)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "test.py", line 422, in test_fp16_kmeanspp_yinyang
    verbosity=2, seed=3, tolerance=0.01, yinyang_t=0.1)
RuntimeError: cudaMemcpy failed

======================================================================
ERROR: test_fp16_random_lloyd (__main__.KMCUDATests)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "test.py", line 379, in test_fp16_random_lloyd
    verbosity=2, seed=3, tolerance=0.05, yinyang_t=0)
RuntimeError: cudaMemcpy failed

======================================================================
ERROR: test_import_lloyd (__main__.KMCUDATests)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "test.py", line 211, in test_import_lloyd
    verbosity=2, seed=3, tolerance=0.25, yinyang_t=0)
RuntimeError: cudaMemcpy failed

======================================================================
ERROR: test_kmeanspp_lloyd (__main__.KMCUDATests)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "test.py", line 195, in test_kmeanspp_lloyd
    verbosity=2, seed=3, tolerance=0.05, yinyang_t=0)
RuntimeError: cudaMemcpy failed

======================================================================
ERROR: test_kmeanspp_lloyd_2gpus (__main__.KMCUDATests)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "test.py", line 233, in test_kmeanspp_lloyd_2gpus
    verbosity=2, seed=3, tolerance=0.05, yinyang_t=0)
RuntimeError: cudaMemcpy failed

======================================================================
ERROR: test_kmeanspp_yinyang (__main__.KMCUDATests)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "test.py", line 203, in test_kmeanspp_yinyang
    verbosity=2, seed=3, tolerance=0.01, yinyang_t=0.1)
RuntimeError: cudaMemcpy failed

======================================================================
ERROR: test_random_lloyd (__main__.KMCUDATests)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "test.py", line 182, in test_random_lloyd
    verbosity=2, seed=3, tolerance=0.05, yinyang_t=0)
RuntimeError: cudaMemcpy failed

======================================================================
ERROR: test_random_lloyd_2gpus (__main__.KMCUDATests)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "test.py", line 223, in test_random_lloyd_2gpus
    verbosity=2, seed=3, tolerance=0.05, yinyang_t=0)
RuntimeError: cudaMemcpy failed

======================================================================
ERROR: test_random_lloyd_all_gpus (__main__.KMCUDATests)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "test.py", line 249, in test_random_lloyd_all_gpus
    verbosity=2, seed=3, tolerance=0.05, yinyang_t=0)
RuntimeError: cudaMemcpy failed

======================================================================
ERROR: test_random_lloyd_different_device_ptr (__main__.KMCUDATests)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "test.py", line 322, in test_random_lloyd_different_device_ptr
    verbosity=2, seed=3, tolerance=0.05, yinyang_t=0)
RuntimeError: cudaMemcpy failed

======================================================================
ERROR: test_random_lloyd_host_ptr (__main__.KMCUDATests)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "test.py", line 261, in test_random_lloyd_host_ptr
    verbosity=2, seed=3, tolerance=0.05, yinyang_t=0)
RuntimeError: cudaMemcpy failed

======================================================================
ERROR: test_random_lloyd_same_device_ptr (__main__.KMCUDATests)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "test.py", line 274, in test_random_lloyd_same_device_ptr
    verbosity=2, seed=3, tolerance=0.05, yinyang_t=0)
RuntimeError: cudaMemcpy failed

======================================================================
ERROR: test_random_lloyd_same_device_ptr_all_devs (__main__.KMCUDATests)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "test.py", line 298, in test_random_lloyd_same_device_ptr_all_devs
    verbosity=2, seed=3, tolerance=0.05, yinyang_t=0)
RuntimeError: cudaMemcpy failed

----------------------------------------------------------------------
Ran 19 tests in 8.968s

FAILED (errors=18)
6 bytes (1.1%), free 11859525632 bytes, total 11995578368 bytes
GPU #3 memory: used 136052736 bytes (1.1%), free 11859525632 bytes, total 11995578368 bytes
cudaMemcpyToSymbol(d_samples_size, &h_samples_size, sizeof(h_samples_size))
/home/blabla/projects/kmcuda/kernel.cu:640 -> invalid device symbol
kmeans_cuda_setup failed: invalid device symbol

Environment: Tesla K80, CUDA 8.0, Python 3, Ubuntu 14.04

vmarkovtsev commented 7 years ago

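Thanks for reporting! The problem is that kmcuda is compiled for arch 60 by default, while the Tesla K80 is arch 37, so the device cannot load the compiled code and every call fails with "invalid device symbol". I should add a check that the compiled arch matches the device.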

To fix your problem, recompile kmcuda for arch 37 (see the README).
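
For illustration, the arch-match check mentioned above could look roughly like the sketch below. It is not kmcuda's actual code: the function names are hypothetical, and it applies only a simplified rule, namely that a device can never run code built for an arch newer than its own compute capability. How the device's compute capability is obtained (for example from the CUDA runtime) is left out and assumed to be known.

```python
def split_arch(arch):
    """Split a build arch number such as 37 or 60 into (major, minor)."""
    return divmod(int(arch), 10)


def arch_is_supported(compiled_arch, device_cc):
    """Return False when the build targets an arch newer than the device.

    Simplified, hypothetical rule: passing this check is necessary but
    not sufficient for the binary to load on the device.
    """
    return tuple(device_cc) >= split_arch(compiled_arch)


def describe_mismatch(compiled_arch, device_cc):
    """Build a readable message for the failing case (hypothetical helper)."""
    major, minor = device_cc
    return ("library was built for arch %d but the device is arch %d%d; "
            "rebuild for the device's arch" % (compiled_arch, major, minor))


# The case from this issue: default build (arch 60) on a Tesla K80 (arch 37).
if not arch_is_supported(60, (3, 7)):
    print(describe_mismatch(60, (3, 7)))
```

Failing early with a message like this would replace the repeated "invalid device symbol" errors with a single hint pointing at the build configuration.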