compute capability mismatch for device 0: wanted 6.1, have 6.0

dfeddema commented 5 years ago

I am getting this error on RHEL 7.5 with Python 3.6.5, CUDA 8.0 (V8.0.61), and gcc 4.9.2. It looks as though I may need CUDA V8.0.60 instead of V8.0.61?

[root@e35559eae255 kmcuda]# python
reassignments threshold: 100
compute capability mismatch for device 0: wanted 6.1, have 6.0
you may want to build kmcuda with -DCUDA_ARCH=60 (refer to "Building" in
compute capability mismatch for device 1: wanted 6.1, have 6.0
you may want to build kmcuda with -DCUDA_ARCH=60 (refer to "Building" in
compute capability mismatch for device 2: wanted 6.1, have 6.0
you may want to build kmcuda with -DCUDA_ARCH=60 (refer to "Building" in
Traceback (most recent call last):
  File "", line 11, in
    centroids, assignments = kmeans_cuda(arr, 4, verbosity=1, seed=3)
ValueError: No such CUDA device exists

[root@e35559eae255 kmcuda]# cat
import numpy
from matplotlib import pyplot
from libKMCUDA import kmeans_cuda

numpy.random.seed(0)
arr = numpy.empty((10000, 2), dtype=numpy.float32)
arr[:2500] = numpy.random.rand(2500, 2) + [0, 2]
arr[2500:5000] = numpy.random.rand(2500, 2) - [0, 2]
arr[5000:7500] = numpy.random.rand(2500, 2) + [2, 0]
arr[7500:] = numpy.random.rand(2500, 2) - [2, 0]
centroids, assignments = kmeans_cuda(arr, 4, verbosity=1, seed=3)
print(centroids)
pyplot.scatter(arr[:, 0], arr[:, 1], c=assignments)
pyplot.scatter(centroids[:, 0], centroids[:, 1], c="white", s=150)

I have the following packages installed:

I tried the following cmake command to build, but it did not resolve the problem. Suggestions?

cmake -DCMAKE_BUILD_TYPE=Release -DCUDA_ARCH=60 . && make

and installed with

pip install git+

vmarkovtsev commented 5 years ago

Hi @dfeddema, the CUDA version does not matter in this case. pip install builds the package on its own and has nothing to do with the previous cmake command. There are two ways to build: with cmake and as a Python package. I see that you need Python, so

CUDA_ARCH=60 pip install git+

vmarkovtsev commented 5 years ago

CUDA_ARCH=60 pip install libKMCUDA

dfeddema commented 5 years ago

Hi @vmarkovtsev. Your suggestions worked! Thank you for making my afternoon!

I see in CMakeLists.txt that CUDA_ARCH specifies the architecture (which GPU card) that nvcc will generate code for.

This link helped explain the Nvidia SM flags:

I'm running on a Tesla P100, so I need -arch sm_60, not sm_61 as I had originally because that is the default.
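The value to pass as CUDA_ARCH can be read off the error message itself: the "have" part of the line gives the device's compute capability, and dropping the dot yields the arch number. Below is a minimal sketch of that mapping. The helper name `suggested_cuda_arch` is hypothetical and not part of kmcuda, and the pattern assumes the message format quoted in this thread.

```python
import re

# Pattern for the diagnostic line quoted in this thread, e.g.
# "compute capability mismatch for device 0: wanted 6.1, have 6.0"
MISMATCH = re.compile(
    r"compute capability mismatch for device (\d+): "
    r"wanted (\d+)\.(\d+), have (\d+)\.(\d+)"
)


def suggested_cuda_arch(log_text):
    """Return {device_index: arch_string} for every mismatch line found.

    Hypothetical helper, not part of kmcuda: "have 6.0" becomes "60".
    """
    archs = {}
    for match in MISMATCH.finditer(log_text):
        device = int(match.group(1))
        have_major, have_minor = match.group(4), match.group(5)
        archs[device] = have_major + have_minor
    return archs


log = (
    "compute capability mismatch for device 0: wanted 6.1, have 6.0\n"
    "compute capability mismatch for device 1: wanted 6.1, have 6.0\n"
)
print(suggested_cuda_arch(log))
```

If all devices report the same value, that value is what would go into CUDA_ARCH for the build.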

In CMakeLists.txt I see the default setting for CUDA_ARCH:

if (NOT DEFINED CUDA_ARCH)
  set(CUDA_ARCH "61")
endif()

Also in CMakeLists.txt I see the nvcc sm flag being set:

set(CUDA_NVCC_FLAGS "${CUDA_NVCC_FLAGS} -arch sm_${CUDA_ARCH} -Xptxas=-v -D_MWAITXINTRIN_H_INCLUDED -D_FORCE_INLINES")
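Put together, the two CMake fragments amount to "use 61 unless CUDA_ARCH is given, then append it to sm_". The sketch below mirrors that logic in Python purely as an illustration. The function `nvcc_arch_flag` is a made-up name, it is not how kmcuda actually builds, and it assumes nvcc's usual `sm_XX` spelling of architecture names.

```python
# Default quoted from CMakeLists.txt in this thread.
DEFAULT_CUDA_ARCH = "61"


def nvcc_arch_flag(cuda_arch=None):
    """Mimic the CMake logic: fall back to the default, then build the flag.

    Hypothetical helper for illustration only.
    """
    if cuda_arch is None:
        cuda_arch = DEFAULT_CUDA_ARCH
    return "-arch sm_" + str(cuda_arch)


print(nvcc_arch_flag())      # no CUDA_ARCH given, default applies
print(nvcc_arch_flag("60"))  # CUDA_ARCH=60, e.g. for a Tesla P100
```

This is why a build without CUDA_ARCH targets 6.1 devices and fails the capability check on a 6.0 device.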

Here's the test that ran correctly after your fix.

[root@3cb67f3e222b kmcuda]# python
reassignments threshold: 100
transposing the samples...
performing kmeans++... done
too few clusters for this yinyang_t => Lloyd
iteration 1: 10000 reassignments
iteration 2: 0 reassignments
[[ 0.49675268 -1.504859 ]
 [ 0.4968266 2.497115 ]
 [ 2.4868565 0.49439764]
 [-1.5026922 0.5023965 ]]

vmarkovtsev commented 5 years ago

Happy that it worked! Sorry for not providing binary packages for all possible configurations: there are 5 active CUDA versions x 5 widespread device archs = 25 variants. One day we will.