qdrant / fastembed

Fast, Accurate, Lightweight Python library to make State of the Art Embedding
https://qdrant.github.io/fastembed/
Apache License 2.0

[Bug/Model Request]: On Windows CUDA not working now #343

Open dibu28 opened 1 month ago

dibu28 commented 1 month ago

What happened?

On Windows 11 it was working with the CUDA provider, but it has stopped using CUDA and now falls back to the CPU. How can I fix this?

Here is the error log:

2024-09-16 17:25:54.9883111 [W:onnxruntime:Default, onnxruntime_pybind_state.cc:965 onnxruntime::python::CreateExecutionProviderInstance] Failed to create CUDAExecutionProvider. Require cuDNN 9.* and CUDA 12.*, and the latest MSVC runtime. Please install all dependencies as mentioned in the GPU requirements page (https://onnxruntime.ai/docs/execution-providers/CUDA-ExecutionProvider.html#requirements), make sure they're in the PATH, and that your GPU is supported.
Fetching 5 files: 100%|██████████| 5/5 [00:00<?, ?it/s]
2024-09-16 17:26:00.0995992 [E:onnxruntime:Default, provider_bridge_ort.cc:1992 onnxruntime::TryGetProviderInfo_CUDA] D:\a_work\1\s\onnxruntime\core\session\provider_bridge_ort.cc:1637 onnxruntime::ProviderLibrary::Get [ONNXRuntimeError] : 1 : FAIL : LoadLibrary failed with error 126 "" when trying to load "C:\Users\dibu2\miniconda3\Lib\site-packages\onnxruntime\capi\onnxruntime_providers_cuda.dll"

2024-09-16 17:26:00.1074415 [W:onnxruntime:Default, onnxruntime_pybind_state.cc:965 onnxruntime::python::CreateExecutionProviderInstance] Failed to create CUDAExecutionProvider. Require cuDNN 9.* and CUDA 12.*, and the latest MSVC runtime. Please install all dependencies as mentioned in the GPU requirements page (https://onnxruntime.ai/docs/execution-providers/CUDA-ExecutionProvider.html#requirements), make sure they're in the PATH, and that your GPU is supported.
Fetching 5 files: 100%|██████████| 5/5 [00:00<?, ?it/s]
2024-09-16 17:26:05.8699044 [E:onnxruntime:Default, provider_bridge_ort.cc:1992 onnxruntime::TryGetProviderInfo_CUDA] D:\a_work\1\s\onnxruntime\core\session\provider_bridge_ort.cc:1637 onnxruntime::ProviderLibrary::Get [ONNXRuntimeError] : 1 : FAIL : LoadLibrary failed with error 126 "" when trying to load "C:\Users\dibu2\miniconda3\Lib\site-packages\onnxruntime\capi\onnxruntime_providers_cuda.dll"

2024-09-16 17:26:05.8766607 [W:onnxruntime:Default, onnxruntime_pybind_state.cc:965 onnxruntime::python::CreateExecutionProviderInstance] Failed to create CUDAExecutionProvider. Require cuDNN 9.* and CUDA 12.*, and the latest MSVC runtime. Please install all dependencies as mentioned in the GPU requirements page (https://onnxruntime.ai/docs/execution-providers/CUDA-ExecutionProvider.html#requirements), make sure they're in the PATH, and that your GPU is supported.
Fetching 5 files: 100%|██████████| 5/5 [00:00<00:00, 4321.35it/s]
2024-09-16 17:26:08.2166429 [E:onnxruntime:Default, provider_bridge_ort.cc:1992 onnxruntime::TryGetProviderInfo_CUDA] D:\a_work\1\s\onnxruntime\core\session\provider_bridge_ort.cc:1637 onnxruntime::ProviderLibrary::Get [ONNXRuntimeError] : 1 : FAIL : LoadLibrary failed with error 126 "" when trying to load "C:\Users\dibu2\miniconda3\Lib\site-packages\onnxruntime\capi\onnxruntime_providers_cuda.dll"

What Python version are you on? e.g. python --version

Python 3.11

Version

0.2.7 (Latest)

What os are you seeing the problem on?

Windows

Relevant stack traces and/or logs

No response
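
As a side note, the same failure can be reproduced outside fastembed by asking onnxruntime directly which execution providers it can offer. A minimal sketch, assuming only that the onnxruntime-gpu package is installed:

import onnxruntime as ort

# If the CUDA build is installed but broken, 'CUDAExecutionProvider' may still be
# listed here while session creation later falls back to CPU with the warnings above.
print(ort.__version__)
print(ort.get_available_providers())  # e.g. ['CUDAExecutionProvider', 'CPUExecutionProvider']
print(ort.get_device())               # 'GPU' for the onnxruntime-gpu build, otherwise 'CPU'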

hh-space-invader commented 1 month ago

@dibu28 Can you tell me which CUDA and cuDNN versions you are using? Can you also make sure the CUDA environment variable is set? Start > Control Panel > System and Maintenance > System > in the left-hand pane, select Advanced system settings > click the Environment Variables button > under System variables, scroll to see the variables.
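
For reference, the same variables can be dumped from Python without going through the Control Panel. A minimal sketch; the variable names are the ones NVIDIA's Windows installers typically set, so treat them as assumptions:

import os

# CUDA toolkit installers on Windows usually set CUDA_PATH (plus a versioned
# CUDA_PATH_Vxx_y); cuDNN is often only discoverable through PATH.
for var in ("CUDA_PATH", "CUDNN_PATH"):
    print(var, "=", os.environ.get(var))

# PATH entries that look like CUDA or cuDNN install locations
for entry in os.environ.get("PATH", "").split(os.pathsep):
    if "cuda" in entry.lower() or "cudnn" in entry.lower():
        print("PATH entry:", entry)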

dibu28 commented 1 month ago

@hh-space-invader How do I find that out? Can you tell me which tools to run?

I've tried a few, and they gave me different results:

nvcc --version
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2023 NVIDIA Corporation
Built on Wed_Nov_22_10:30:42_Pacific_Standard_Time_2023
Cuda compilation tools, release 12.3, V12.3.107
Build cuda_12.3.r12.3/compiler.33567101_0

nvidia-smi
Thu Sep 19 13:48:51 2024
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 561.09                 Driver Version: 561.09         CUDA Version: 12.6     |

conda list cudatoolkit
# packages in environment at C:\Users\dibu28\miniconda3:
#
# Name                    Version                   Build  Channel
cudatoolkit               11.8.0               hd77b12b_0

cuda_version.py
CUDA: 12.1
cuDNN: 8.9.7

cuda_version.py contains this code:

import torch

# CUDA version PyTorch was built against
print('CUDA:', torch.version.cuda)

# torch.backends.cudnn.version() returns an integer such as 8907 for cuDNN 8.9.7
cudnn = torch.backends.cudnn.version()
cudnn_major = cudnn // 1000
cudnn = cudnn % 1000
cudnn_minor = cudnn // 100
cudnn_patch = cudnn % 100
print('cuDNN:', '.'.join([str(cudnn_major), str(cudnn_minor), str(cudnn_patch)]))

System variables: (screenshot attached)
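
Worth noting: torch.version.cuda and torch.backends.cudnn.version() report the CUDA and cuDNN builds bundled with PyTorch, which are not necessarily the libraries onnxruntime finds when it tries to load onnxruntime_providers_cuda.dll. A rough sketch to list the CUDA/cuDNN DLLs actually visible on PATH; the file name patterns are assumptions based on typical CUDA 12 / cuDNN 9 Windows installs:

import glob
import os

# LoadLibrary error 126 usually means one of these DLLs (or one of their own
# dependencies) could not be found in any PATH directory.
for entry in os.environ.get("PATH", "").split(os.pathsep):
    if not os.path.isdir(entry):
        continue
    for pattern in ("cudart64_*.dll", "cudnn64_*.dll", "cublas64_*.dll"):
        for dll in glob.glob(os.path.join(entry, pattern)):
            print(dll)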