microsoft / onnxruntime

ONNX Runtime: cross-platform, high performance ML inferencing and training accelerator
https://onnxruntime.ai
MIT License

Can't get GPU to work with ONNX Runtime 1.19, CUDA 12.6, cuDNN 9, RTX A4000 #21825

Closed sismith999 closed 2 months ago

sismith999 commented 2 months ago

Describe the issue

import onnxruntime as ort

providers = ['CUDAExecutionProvider']  # Specify the GPU provider
session = ort.InferenceSession(model_path, providers=providers)  # Create the ONNX Runtime InferenceSession with GPU

Reports error

['TensorrtExecutionProvider', 'CUDAExecutionProvider', 'CPUExecutionProvider']
2024-08-22 13:01:04.2808367 [E:onnxruntime:Default, provider_bridge_ort.cc:1992 onnxruntime::TryGetProviderInfo_CUDA] D:\a_work\1\s\onnxruntime\core\session\provider_bridge_ort.cc:1637 onnxruntime::ProviderLibrary::Get [ONNXRuntimeError] : 1 : FAIL : LoadLibrary failed with error 126 "" when trying to load "c:\Simon\Timpi\test\myenv\Lib\site-packages\onnxruntime\capi\onnxruntime_providers_cuda.dll"

2024-08-22 13:01:04.2871671 [W:onnxruntime:Default, onnxruntime_pybind_state.cc:965 onnxruntime::python::CreateExecutionProviderInstance] Failed to create CUDAExecutionProvider. Require cuDNN 9.* and CUDA 12.*, and the latest MSVC runtime. Please install all dependencies as mentioned in the GPU requirements page (https://onnxruntime.ai/docs/execution-providers/CUDA-ExecutionProvider.html#requirements), make sure they're in the PATH, and that your GPU is supported. ONNX Runtime is using the CPU (CPUExecutionProvider).
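The number in the "LoadLibrary failed with error" message is a Windows system error code. Code 126 is ERROR_MOD_NOT_FOUND, which is reported when the DLL itself or any DLL it depends on cannot be located, so the file named in the message may well exist. Below is a minimal sketch of pulling the code and DLL path out of such a log line. The function name, the regular expression, and the hint table are illustrative assumptions and are not part of ONNX Runtime.

```python
import re

# Two Windows system error codes that commonly show up with LoadLibrary.
# The hint wording is ours; the code numbers come from the Win32 API.
LOADLIBRARY_HINTS = {
    126: "ERROR_MOD_NOT_FOUND: the DLL or one of its dependencies was not found",
    193: "ERROR_BAD_EXE_FORMAT: not a valid image, e.g. a 32-bit/64-bit mismatch",
}

# Matches: LoadLibrary failed with error N ... when trying to load "path"
_PATTERN = re.compile(
    r'LoadLibrary failed with error (\d+).*?when trying to load "([^"]+)"'
)

def parse_loadlibrary_failure(log_line):
    """Return (error_code, dll_path, hint), or None if the line does not match."""
    match = _PATTERN.search(log_line)
    if match is None:
        return None
    code = int(match.group(1))
    hint = LOADLIBRARY_HINTS.get(code, "unknown code; see the Win32 system error codes")
    return code, match.group(2), hint

# Sample line assembled from the log in this issue.
line = (
    '[ONNXRuntimeError] : 1 : FAIL : LoadLibrary failed with error 126 "" '
    r'when trying to load "c:\Simon\Timpi\test\myenv\Lib\site-packages'
    r'\onnxruntime\capi\onnxruntime_providers_cuda.dll"'
)
print(parse_loadlibrary_failure(line))
```

In practice this means the next step is to look at the dependencies of onnxruntime_providers_cuda.dll rather than at the DLL itself.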

To reproduce

import onnxruntime as ort

providers = ['CUDAExecutionProvider']  # Specify the GPU provider
session = ort.InferenceSession(model_path, providers=providers)  # Create the ONNX Runtime InferenceSession with GPU
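Since the log shows ONNX Runtime falling back to the CPU with only a warning, it can help to check which providers the session actually ended up with. The following is a minimal sketch using `onnxruntime.get_available_providers()` and `InferenceSession.get_providers()`. The helper names `missing_providers` and `check_gpu_session` are hypothetical, and `model_path` is whatever model you are loading.

```python
def missing_providers(requested, active):
    """Return the requested providers that the session did not activate."""
    return [name for name in requested if name not in active]

def check_gpu_session(model_path, requested=("CUDAExecutionProvider",)):
    """Create a session and raise if any requested provider was not activated."""
    # Imported here so the helper above stays usable without onnxruntime installed.
    import onnxruntime as ort

    print("Available providers:", ort.get_available_providers())
    session = ort.InferenceSession(model_path, providers=list(requested))
    active = session.get_providers()
    print("Active providers:", active)
    missing = missing_providers(requested, active)
    if missing:
        raise RuntimeError(f"Providers not activated: {missing}")
    return session
```

Note that the available-provider list in this issue already included CUDAExecutionProvider even though loading it failed, so the list of active providers on the session is the more useful signal.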

DUMPBIN on DLL

cublasLt64_12.dll          Directory: C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v12.6\bin
cublas64_12.dll
cudnn64_9.dll
cufft64_11.dll
cudart64_12.dll
onnxruntime_providers_shared.dll
KERNEL32.dll
MSVCP140.dll
VCRUNTIME140.dll
VCRUNTIME140_1.dll
api-ms-win-crt-heap-l1-1-0.dll
api-ms-win-crt-math-l1-1-0.dll
api-ms-win-crt-runtime-l1-1-0.dll
api-ms-win-crt-stdio-l1-1-0.dll
api-ms-win-crt-string-l1-1-0.dll

cublasLt64_12.dll:  Found at C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v12.6\bin\cublasLt64_12.dll
cublas64_12.dll:    Found at C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v12.6\bin\cublas64_12.dll
cudnn64_9.dll:      Found at C:\Program Files\NVIDIA\CUDNN\v9.3\bin\11.8\cudnn64_9.dll
cufft64_11.dll:     Found at C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v12.6\bin\cufft64_11.dll
cudart64_12.dll:    Found at C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v12.6\bin\cudart64_12.dll
onnxruntime_providers_shared.dll:   Not found
KERNEL32.dll: Not found
MSVCP140.dll:       Found at C:\Program Files (x86)\NortonInstaller\{0C55C096-0F1D-4F28-AAA2-85EF591126E7}\NGC\A5E82D02\22.23.1.21\MSVCP140.dll
VCRUNTIME140.dll:   Found at C:\Program Files (x86)\NortonInstaller\{0C55C096-0F1D-4F28-AAA2-85EF591126E7}\NGC\A5E82D02\22.23.1.21\VCRUNTIME140.dll
VCRUNTIME140_1.dll: Not found
api-ms-win-crt-heap-l1-1-0.dll: Found at C:\Program Files\Azure Data Studio\resources\app\extensions\mssql\sqltoolsservice\Windows\4.0.1.1\api-ms-win-crt-heap-l1-1-0.dll
api-ms-win-crt-math-l1-1-0.dll: Found at C:\Program Files\Azure Data Studio\resources\app\extensions\mssql\sqltoolsservice\Windows\4.0.1.1\api-ms-win-crt-math-l1-1-0.dll
api-ms-win-crt-runtime-l1-1-0.dll: Found at C:\Program Files\Azure Data Studio\resources\app\extensions\mssql\sqltoolsservice\Windows\4.0.1.1\api-ms-win-crt-runtime-l1-1-0.dll
api-ms-win-crt-stdio-l1-1-0.dll: Found at C:\Program Files\Azure Data Studio\resources\app\extensions\mssql\sqltoolsservice\Windows\4.0.1.1\api-ms-win-crt-stdio-l1-1-0.dll
api-ms-win-crt-string-l1-1-0.dll: Found at C:\Program Files\Azure Data Studio\resources\app\extensions\mssql\sqltoolsservice\Windows\4.0.1.1\api-ms-win-crt-string-l1-1-0.dll
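Several of the runtime DLLs above resolve to copies shipped with unrelated applications, which suggests that more than one copy is reachable. A rough way to see every copy, in search order, is to walk the PATH entries yourself. This is only a sketch: `find_all_on_path` is a hypothetical helper, and PATH is just one part of the Windows DLL search order (the application directory and system directories are searched too), so treat the result as an approximation.

```python
import os

def find_all_on_path(dll_name, search_path=None, sep=os.pathsep):
    """List every copy of dll_name in the directories of a PATH-style string, in order."""
    if search_path is None:
        search_path = os.environ.get("PATH", "")
    wanted = dll_name.lower()  # Windows file names are case-insensitive
    hits = []
    for directory in search_path.split(sep):
        directory = directory.strip().strip('"')
        if not directory or not os.path.isdir(directory):
            continue
        try:
            entries = os.listdir(directory)
        except OSError:
            continue  # unreadable directory; skip it
        for entry in entries:
            if entry.lower() == wanted:
                hits.append(os.path.join(directory, entry))
    return hits

for name in ("MSVCP140.dll", "VCRUNTIME140.dll", "VCRUNTIME140_1.dll", "cudnn64_9.dll"):
    copies = find_all_on_path(name)
    print(f"{name}: {copies if copies else 'not on PATH'}")
```

If the first hit for a DLL sits in an unrelated application's folder, that copy may be older than what onnxruntime_providers_cuda.dll needs.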

Urgency

Urgent

Platform

Windows

OS Version

11

ONNX Runtime Installation

Released Package

ONNX Runtime Version or Commit ID

1.19

ONNX Runtime API

Python

Architecture

X64

Execution Provider

CUDA

Execution Provider Library Version

cuda 12.6

tianleiwu commented 2 months ago

Two are suspicious:

Please install latest MSVC runtime: https://learn.microsoft.com/en-us/cpp/windows/latest-supported-vc-redist?view=msvc-170

Also install cuDNN 9.3 for CUDA 12, for example from https://developer.download.nvidia.com/compute/cudnn/redist/cudnn/windows-x86_64/cudnn-windows-x86_64-9.3.0.75_cuda12-archive.zip. You can unzip the archive and add its bin directory to PATH.

sismith999 commented 2 months ago

Thanks for your reply. I checked the items you mentioned and removed some old directories, but I still can't get it to work. I downgraded to CUDA 12.4 and cuDNN 9.3.

I tested CUDA with PyTorch as below:

import torch
import sys
# import os

# # Paths to CUDA DLLs
# cuda_bin_path = r"C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v12.6\bin"
# cuda_libnvvp_path = r"C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v12.6\libnvvp"
# cuda_dnn = r"C:\Program Files\NVIDIA\CUDNN\v9.3\bin\12.6"    ## r"C:\Program Files\NVIDIA\CUDNN\v9.3\bin"
# venv = r"C:\Simon\Timpi\test\myenv\Lib\site-packages\onnxruntime\capi";

# # Add them to sys.path
# sys.path.append(cuda_bin_path)
# sys.path.append(cuda_libnvvp_path)
# sys.path.append(cuda_dnn)
# sys.path.append(venv)

# # Optional: Also set the PATH environment variable (for subprocesses)
# os.environ["PATH"] = cuda_bin_path + os.pathsep + cuda_libnvvp_path + os.pathsep + os.environ["PATH"] + os.pathsep + cuda_dnn + os.pathsep + venv

def test_cuda_and_cudnn():
    # Check if CUDA is available
    print(torch.version.cuda)
    print(torch.cuda.get_device_name(0))
    if torch.cuda.is_available():
        print("CUDA is available.")

        # Get the CUDA version
        print("CUDA version:", torch.version.cuda)

        # Get the cuDNN version
        print("cuDNN version:", torch.backends.cudnn.version())

        # Create a tensor on GPU
        x = torch.randn(3, 3).cuda()
        print("Tensor on GPU:", x)

        # Perform a simple operation
        y = x @ x.T
        print("Result of a simple matrix multiplication on GPU:", y)
    else:
        print("CUDA is not available. Please check your installation.")

if __name__ == "__main__":
    test_cuda_and_cudnn()

and this works... 

NVIDIA RTX A4000
CUDA is available.
CUDA version: 12.4
cuDNN version: 90100
Tensor on GPU: tensor([[ 0.7406,  0.3850,  0.5371],
        [-1.2015,  0.0742,  0.4274],
        [ 0.6611,  0.1410, -1.6860]], device='cuda:0')
Result of a simple matrix multiplication on GPU: tensor([[ 0.9851, -0.6317, -0.3616],
        [-0.6317,  1.6317, -1.5044],
        [-0.3616, -1.5044,  3.2995]], device='cuda:0')

Note: my DLLs are now:

cublasLt64_12.dll: Found at C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v12.4\bin\cublasLt64_12.dll
cublas64_12.dll: Found at C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v12.4\bin\cublas64_12.dll
cudnn64_9.dll: Found at C:\Program Files\NVIDIA\CUDNN\v9.3\bin\11.8\cudnn64_9.dll
cufft64_11.dll: Found at C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v12.4\bin\cufft64_11.dll
cudart64_12.dll: Found at C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v12.4\bin\cudart64_12.dll
onnxruntime_providers_shared.dll: Found at \Simon\Timpi\test\myenv\Lib\site-packages\onnxruntime\capi\onnxruntime_providers_shared.dll
KERNEL32.dll: Found at C:\WINDOWS\System32\kernel32.dll
MSVCP140.dll: Found at C:\Program Files\Blender Foundation\Blender 3.1\blender.crt\msvcp140.dll
VCRUNTIME140.dll: Found at C:\Program Files\Azure Data Studio\tools\vcruntime140.dll
VCRUNTIME140_1.dll: Found at C:\Program Files\Blender Foundation\Blender 3.1\blender.crt\vcruntime140_1.dll
api-ms-win-crt-heap-l1-1-0.dll: Found at C:\Program Files\Azure Data Studio\resources\app\extensions\mssql\sqltoolsservice\Windows\4.0.1.1\api-ms-win-crt-heap-l1-1-0.dll
api-ms-win-crt-math-l1-1-0.dll: Found at C:\Program Files\Azure Data Studio\resources\app\extensions\mssql\sqltoolsservice\Windows\4.0.1.1\api-ms-win-crt-math-l1-1-0.dll
api-ms-win-crt-runtime-l1-1-0.dll: Found at C:\Program Files\Azure Data Studio\resources\app\extensions\mssql\sqltoolsservice\Windows\4.0.1.1\api-ms-win-crt-runtime-l1-1-0.dll
api-ms-win-crt-stdio-l1-1-0.dll: Found at C:\Program Files\Azure Data Studio\resources\app\extensions\mssql\sqltoolsservice\Windows\4.0.1.1\api-ms-win-crt-stdio-l1-1-0.dll
api-ms-win-crt-string-l1-1-0.dll: Found at C:\Program Files\Azure Data Studio\resources\app\extensions\mssql\sqltoolsservice\Windows\4.0.1.1\api-ms-win-crt-string-l1-1-0.dll

note onnxruntime-gpu is 1.18:-   Downloading https://aiinfra.pkgs.visualstudio.com/2692857e-05ef-43b4-ba9c-ccf1c22c437c/_packaging/7982ae20-ed19-4a35-a362-a96ac99897b7/pypi/download/onnxruntime-gpu/1.18.1/onnxruntime_gpu-1.18.1-cp312-cp312-win_amd64.whl

error still

['TensorrtExecutionProvider', 'CUDAExecutionProvider', 'CPUExecutionProvider']
2024-08-22 18:50:44.3526054 [E:onnxruntime:Default, provider_bridge_ort.cc:1745 onnxruntime::TryGetProviderInfo_CUDA] D:\a_work\1\s\onnxruntime\core\session\provider_bridge_ort.cc:1426 onnxruntime::ProviderLibrary::Get [ONNXRuntimeError] : 1 : FAIL : LoadLibrary failed with error 126 "" when trying to load "c:\Simon\Timpi\test\myenv\Lib\site-packages\onnxruntime\capi\onnxruntime_providers_cuda.dll"
2024-08-22 18:50:44.3590433 [W:onnxruntime:Default, onnxruntime_pybind_state.cc:895 onnxruntime::python::CreateExecutionProviderInstance] Failed to create CUDAExecutionProvider. Please reference https://onnxruntime.ai/docs/execution-providers/CUDA-ExecutionProvider.html#requirementsto ensure all dependencies are met.

any advice is appreciated...

thanks Simon.


tianleiwu commented 2 months ago

Here is one way to set up a Python environment for onnxruntime-gpu with CUDA 12 and Python 3.12 on Windows:

(1) Install Anaconda: https://docs.anaconda.com/anaconda/install/windows/
(2) Run the following to create a conda environment:

conda create -n py312 python=3.12
conda activate py312
pip install msvc-runtime==14.40.33807
pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu124
pip install nvidia-cudnn-cu12==9.3.0.75 nvidia-cuda-runtime-cu12==12.5.82 nvidia-cufft-cu12==11.2.3.61
pip install onnxruntime-gpu==1.19.0
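After running these commands, it may be worth confirming that the active interpreter really sees the pinned versions, especially when several Python installations coexist. The sketch below uses the standard-library `importlib.metadata`. The `PINS` table simply repeats the versions from the commands above, and `compare_pins` is a hypothetical helper name.

```python
from importlib.metadata import PackageNotFoundError, version

# Versions copied from the pip commands above.
PINS = {
    "msvc-runtime": "14.40.33807",
    "nvidia-cudnn-cu12": "9.3.0.75",
    "nvidia-cuda-runtime-cu12": "12.5.82",
    "nvidia-cufft-cu12": "11.2.3.61",
    "onnxruntime-gpu": "1.19.0",
}

def compare_pins(pins):
    """Map each package name to (expected_version, installed_version_or_None)."""
    report = {}
    for name, expected in pins.items():
        try:
            installed = version(name)
        except PackageNotFoundError:
            installed = None  # not installed in this interpreter's environment
        report[name] = (expected, installed)
    return report

for name, (expected, installed) in compare_pins(PINS).items():
    status = "OK" if installed == expected else "MISMATCH"
    print(f"{status:8} {name}: expected {expected}, installed {installed}")
```

Run it with the same interpreter that will import onnxruntime, since each environment has its own set of installed packages.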

Then you can test your environment like the following:

import os
import ctypes
import subprocess
import sys

def print_loaded_libraries():
    import psutil
    pid = os.getpid()
    process = psutil.Process(pid)
    for dll in process.memory_maps():
        print(dll.path)

def install_package(package_name, version):
    print(f"Installing {package_name} {version}...")
    subprocess.check_call([sys.executable, "-m", "pip", "install", f"{package_name}=={version}"])
    print(f"{package_name} {version} has been installed successfully.")

# Optional for Anaconda. Installing it explicitly is recommended if you are not sure the system has the latest VC runtime.
install_package("msvc-runtime", "14.40.33807")

install_package("nvidia-cudnn-cu12", "9.3.0.75")
install_package("nvidia-cuda-runtime-cu12", "12.5.82")
install_package("nvidia-cufft-cu12", "11.2.3.61")
install_package("onnxruntime-gpu", "1.19.0")

# psutil is used in print_loaded_libraries()
install_package("psutil", "6.0.0")

# Get the path to the Conda environment
conda_env = os.environ.get("CONDA_PREFIX")
if conda_env is None:
    raise EnvironmentError("Conda environment is not active.")

# import sysconfig
# site_packages_directory = sysconfig.get_path('purelib')
site_packages_directory = os.path.join(conda_env, 'Lib', 'site-packages')

cudnn_dll_directory =  os.path.join(site_packages_directory, 'nvidia', 'cudnn', 'bin')
cuda_runtime_directory = os.path.join(site_packages_directory, 'nvidia', 'cuda_runtime', 'bin')
cufft_dll_directory = os.path.join(site_packages_directory, 'nvidia', 'cufft', 'bin')
cublas_dll_directory = os.path.join(site_packages_directory, 'nvidia', 'cublas', 'bin')
ort_dll_directory = os.path.join(site_packages_directory, 'onnxruntime', 'capi')
msvc_dll_directory = conda_env

dll_to_directory = {
    "cublasLt64_12.dll" : cublas_dll_directory,
    "cublas64_12.dll" : cublas_dll_directory,
    "cudnn64_9.dll" : cudnn_dll_directory,
    "cufft64_11.dll" : cufft_dll_directory,
    "cudart64_12.dll" : cuda_runtime_directory,
    "MSVCP140.dll" : conda_env,
    "VCRUNTIME140.dll" : conda_env,
    "VCRUNTIME140_1.dll" : conda_env,
    "onnxruntime_providers_cuda.dll" : ort_dll_directory,
}

unique_directories = set(dll_to_directory.values())
for directory in unique_directories:
    os.add_dll_directory(directory)

# Preload could avoid some issues related to PATH setting.
enable_preload = True

preloaded_dlls = []
if enable_preload:
    for filename, directory in dll_to_directory.items():
        dll_full_path = os.path.join(directory, filename)
        try:
            dll = ctypes.CDLL(dll_full_path)
            preloaded_dlls.append(dll)
            print(f"Successfully loaded DLL: {dll_full_path}")
        except OSError as e:
            print(f"Failed to load DLL: {dll_full_path}: {e}")
else:
    # The following can be used to test whether you have set PATH environment variable correctly.
    dependent_dlls = ["cublasLt64_12.dll", "cublas64_12.dll", "cudnn64_9.dll", "cufft64_11.dll", "cudart64_12.dll", "onnxruntime_providers_shared.dll", "MSVCP140.dll", "VCRUNTIME140.dll", "VCRUNTIME140_1.dll"]
    for dll_name in dependent_dlls:
        try:
            my_dll = ctypes.CDLL(dll_name)
            print(f"Successfully loaded the {dll_name}.")
        except OSError as e:
            print(f"Failed to load the {dll_name}: {e}")    

# Test import onnxruntime.
import onnxruntime

# Check the paths of the loaded DLLs to make sure the right ones are loaded.
print_loaded_libraries()

yuslepukhin commented 2 months ago

This is needed: onnxruntime_providers_shared.dll: Not found

jywu-msft commented 2 months ago

This is needed: onnxruntime_providers_shared.dll: Not found

The latest output has onnxruntime_providers_shared.dll: Found at \Simon\Timpi\test\myenv\Lib\site-packages\onnxruntime\capi\onnxruntime_providers_shared.dll, so I don't think that is the issue.

I think the version of cuDNN is probably still the issue:

cudnn64_9.dll: Found at C:\Program Files\NVIDIA\CUDNN\v9.3\bin\11.8\cudnn64_9.dll

It's finding a build of cuDNN 9 that is for CUDA 11.8, not CUDA 12.x.
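The mismatch described here is visible in the path itself, because the cuDNN install in this thread keeps its DLLs in a subfolder named after the CUDA version. A small sketch of checking that automatically follows. `cudnn_cuda_major` is a hypothetical helper, and it assumes the folder layout seen in this thread, where the parent folder of cudnn64_9.dll is a CUDA version such as 11.8 or 12.6.

```python
from pathlib import PureWindowsPath

def cudnn_cuda_major(dll_path):
    """Return the CUDA major version named by the DLL's parent folder, or None.

    Assumes the layout from this thread: the folder holding cudnn64_9.dll is
    named after a CUDA version (for example 11.8). Any other folder name,
    such as a plain bin directory, yields None.
    """
    parent = PureWindowsPath(dll_path).parent.name  # e.g. "11.8" or "bin"
    major = parent.split(".")[0]
    return int(major) if major.isdigit() else None

found = r"C:\Program Files\NVIDIA\CUDNN\v9.3\bin\11.8\cudnn64_9.dll"
major = cudnn_cuda_major(found)
if major is not None and major != 12:
    print(f"cuDNN was resolved from a CUDA {major}.x folder, not CUDA 12.x: {found}")
```

`PureWindowsPath` is used so the check parses Windows paths the same way on any operating system.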

sismith999 commented 2 months ago

Thanks to everyone who commented. You can close this now; I did finally get it to work. A few notes:

A) My environment was complicated, with many Python versions installed, so be careful when testing (my python3 was picking up the system-wide DLLs, while python was using my venv/conda).

B) I had to use these version combinations: CUDA 12.4 and cuDNN 9.3 are OK if the PATH is set as below:

cuda_bin_path = r"C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v12.4\bin"  # installed from the NVIDIA installer

cuda_dnn = r"D:\DOWN\CUDNN_help\cudnn-windows-x86_64-9.3.0.75_cuda12-archive\cudnn-windows-x86_64-9.3.0.75_cuda12-archive\bin"

Note: the NVIDIA installer for cuDNN 9.3 actually does suffice if you add its path carefully. The install seems to create subfolders under bin (11.8 and 12.6), so you have to specify the subfolder explicitly. (12.6 did seem OK as per the NVIDIA compatibility matrix, even though my CUDA was 12.4.) Use:

C:\Program Files\NVIDIA\CUDNN\v9.3\bin\12.6

not:

C:\Program Files\NVIDIA\CUDNN\v9.3\bin
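One way to apply the paths from this note inside a script is to put them at the front of PATH before onnxruntime is imported. This is only a sketch based on what the thread reports as working. `prepend_dirs` is a hypothetical helper, the two directories are the ones quoted above, and whether PATH is consulted at all can depend on how a given DLL is loaded.

```python
import os

def prepend_dirs(dirs, current, sep=os.pathsep):
    """Put dirs at the front of a PATH-style string, dropping later duplicates."""
    seen = set()
    ordered = []
    for entry in list(dirs) + current.split(sep):
        key = entry.rstrip("\\/").lower()  # compare case-insensitively, as Windows does
        if entry and key not in seen:
            seen.add(key)
            ordered.append(entry)
    return sep.join(ordered)

cuda_bin_path = r"C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v12.4\bin"
cudnn_bin_path = r"C:\Program Files\NVIDIA\CUDNN\v9.3\bin\12.6"

if os.name == "nt":
    # Do this before importing onnxruntime so its provider DLLs can resolve
    # their CUDA and cuDNN dependencies from these folders.
    os.environ["PATH"] = prepend_dirs(
        [cuda_bin_path, cudnn_bin_path], os.environ.get("PATH", "")
    )
```

Putting the folders first, rather than appending them, keeps an older copy elsewhere on PATH from being picked up ahead of the intended one.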