NVIDIA / libnvidia-container

NVIDIA container runtime library
Apache License 2.0

cudaGetDeviceCount() call failed with error 804 #131

Closed: gemfield closed this issue 3 years ago

gemfield commented 3 years ago

The program is compiled and run in a Docker container (launched from the official NVIDIA image: nvidia/cuda:11.2.2-cudnn8-devel-ubuntu20.04).

Hardware & driver: RTX 2080 Ti

root@gemfield:~# nvidia-smi
Thu Apr  1 02:03:39 2021       
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 450.102.04   Driver Version: 450.102.04   CUDA Version: 11.2     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  GeForce RTX 208...  Off  | 00000000:19:00.0 Off |                  N/A |
| 27%   37C    P8     7W / 250W |      6MiB / 11019MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+

code:

root@gemfield:~# cat gemfield.cpp 
#include <stdio.h>
#include <cuda_runtime.h>
int main() {
  int gpuDeviceCount = 0;

  // Query the number of CUDA devices and print the raw status code (0 means cudaSuccess).
  cudaError_t cudaResultCode = cudaGetDeviceCount(&gpuDeviceCount);
  printf("\t error: %d\n",cudaResultCode);
  return 0;
}

compile and run:

root@gemfield:~# g++ -I/usr/local/cuda-11.2/targets/x86_64-linux/include/ gemfield.cpp -o gemfield -L/usr/local/cuda-11.2/targets/x86_64-linux/lib/ -lcudart
root@gemfield:~# ./gemfield 
     error: 804

Why do I get error 804? Thanks.

gemfield commented 3 years ago

The root cause has been found. Anyone who encounters this issue can have a look at https://zhuanlan.zhihu.com/p/361545761