/root/miniconda/envs/tortoise/lib/python3.9/site-packages/torch/cuda/__init__.py:141: UserWarning: CUDA initialization: Unexpected error from cudaGetDeviceCount(). Did you run some cuda functions before calling NumCudaDevices() that might have already set an error? Error 500: named symbol not found (Triggered internally at /opt/conda/conda-bld/pytorch_1711403380164/work/c10/cuda/CUDAFunctions.cpp:108.)
  return torch._C._cuda_getDeviceCount() > 0
False
Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "/root/miniconda/envs/tortoise/lib/python3.9/site-packages/torch/cuda/__init__.py", line 302, in _lazy_init
    torch._C._cuda_init()
RuntimeError: Unexpected error from cudaGetDeviceCount(). Did you run some cuda functions before calling NumCudaDevices() that might have already set an error? Error 500: named symbol not found
I'm not sure how to even troubleshoot this error.
(tortoise) root@9454180e9c47:/app# nvidia-smi
Tue May 28 20:50:37 2024
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 555.42.03              Driver Version: 555.85         CUDA Version: 12.5     |
|-----------------------------------------+------------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id          Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |           Memory-Usage | GPU-Util  Compute M. |
|                                         |                        |               MIG M. |
|=========================================+========================+======================|
|   0  NVIDIA GeForce RTX 3060        On  |   00000000:2A:00.0  On |                  N/A |
|  0%   44C    P5             18W /  170W |    1644MiB /  12288MiB |      6%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+
+-----------------------------------------------------------------------------------------+
| Processes:                                                                              |
|  GPU   GI   CI        PID   Type   Process name                              GPU Memory |
|        ID   ID                                                               Usage      |
|=========================================================================================|
|    0   N/A  N/A        36      G   /Xwayland                                 N/A        |
|    0   N/A  N/A        48      G   /Xwayland                                 N/A        |
+-----------------------------------------------------------------------------------------+
Docker File
Build command:
docker build . -t tts
Run command:
(base) root@9454180e9c47:/# cd app
(base) root@9454180e9c47:/app# conda activate tortoise
(tortoise) root@9454180e9c47:/app# python -c "import torch; print(torch.cuda.is_available());torch.zeros(1).cuda()"
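The one-liner above only shows that initialization fails. A slightly more verbose check can record which PyTorch build is installed and capture the initialization error without a traceback. This is only a rough sketch: the helper name `collect_cuda_diagnostics` is hypothetical, and it assumes nothing beyond `torch.version.cuda`, `torch.cuda.init()`, and `torch.cuda.device_count()`.

```python
import importlib
import shutil


def collect_cuda_diagnostics():
    """Gather basic facts about the PyTorch/CUDA setup without raising.

    Hypothetical helper for illustration; it is not part of PyTorch.
    """
    info = {}
    try:
        torch = importlib.import_module("torch")
    except ImportError as exc:
        # PyTorch is not installed in the active environment.
        info["torch_import_error"] = str(exc)
        return info

    info["torch_version"] = torch.__version__
    # CUDA version the wheel was built against; None for CPU-only builds.
    info["torch_built_for_cuda"] = torch.version.cuda
    try:
        # Force CUDA initialization now instead of on first use.
        torch.cuda.init()
        info["device_count"] = torch.cuda.device_count()
    except Exception as exc:
        # Keep the message (e.g. "Error 500: named symbol not found").
        info["cuda_init_error"] = str(exc)

    # Path to nvidia-smi if it is on PATH inside the container, else None.
    info["nvidia_smi_path"] = shutil.which("nvidia-smi")
    return info


if __name__ == "__main__":
    for key, value in collect_cuda_diagnostics().items():
        print(f"{key}: {value}")
```

Comparing `torch_built_for_cuda` with the "CUDA Version" reported by `nvidia-smi` is one way to see whether the installed build and the driver visible in the container line up.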