microsoft / onnxruntime

ONNX Runtime: cross-platform, high performance ML inferencing and training accelerator
https://onnxruntime.ai
MIT License

CUDAExecutionProvider Not Available / GPU Not Visible on NVIDIA T4 #7748

Closed: anshoomehra closed this 3 years ago

anshoomehra commented 3 years ago

Describe the bug
Do not see CUDAExecutionProvider or the GPU available from ONNX Runtime even though onnxruntime-gpu is installed.

Urgency
In a critical stage of the project, hence urgent.

System information

To Reproduce

import onnxruntime as ort

ort.get_device()
# 'CPU'

ort.get_available_providers()
# ['CPUExecutionProvider']

Forcing the code to use CUDAExecutionProvider results in the expected error, consistent with the output above:

import onnxruntime as ort

model_path = 'models/bart_large_cnn_fl/encoder/model.onnx'

providers = [
    ('CUDAExecutionProvider', {
        'device_id': 0,                             # GPU to use
        'arena_extend_strategy': 'kNextPowerOfTwo',
        'cuda_mem_limit': 2 * 1024 * 1024 * 1024,   # 2 GB ('gpu_mem_limit' in newer releases)
        'cudnn_conv_algo_search': 'EXHAUSTIVE',
        'do_copy_in_default_stream': True,
    }),
    'CPUExecutionProvider',  # fallback if CUDA is unavailable
]

session = ort.InferenceSession(model_path, providers=providers)

Error message: [screenshot]

Expected behavior
Expecting the device to be GPU and CUDAExecutionProvider to be available.
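A useful sanity check at this point (standard onnxruntime API; this snippet is illustrative and not from the original report): a session silently falls back to CPU when the CUDA provider fails to load, and get_providers() shows what was actually attached.

print(session.get_providers())
# On a working GPU setup this prints:
# ['CUDAExecutionProvider', 'CPUExecutionProvider']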

hariharans29 commented 3 years ago

Did you install both onnxruntime and onnxruntime_gpu on your machine? If so, can you try after uninstalling onnxruntime with pip uninstall onnxruntime?
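A minimal sketch of that check, using only the Python standard library (the name-matching logic is just an assumption about how the wheels are named):

from importlib.metadata import distributions

# List every installed distribution whose name mentions onnxruntime; having the
# CPU wheel (onnxruntime) and the GPU wheel (onnxruntime-gpu) installed side by
# side is a common cause of the CUDA provider being invisible.
names = {dist.metadata["Name"] for dist in distributions()}
print(sorted(n for n in names if n and "onnxruntime" in n.lower()))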

anshoomehra commented 3 years ago

@hariharans29 yes, I did uninstall and reinstall everything cleanly, still the same issue. Install sequence: onnx, onnxruntime, onnxruntime-gpu…

onnx==1.9.0
onnxruntime==1.7.0
onnxruntime-gpu==1.7.0

I also tried keeping just onnxruntime-gpu (uninstalling onnxruntime), and then the import fails.

hariharans29 commented 3 years ago

You don't need onnx/onnxruntime installed if you plan to use onnxruntime_gpu.

Can you uninstall everything and just install onnxruntime_gpu?

hariharans29 commented 3 years ago

If there is an import failure with just onnxruntime_gpu installed, at least it is on the right track. I suspect you are missing some dependencies listed here - https://www.onnxruntime.ai/docs/reference/execution-providers/CUDA-ExecutionProvider.html#requirements. Make sure these are installed before trying the import again.

anshoomehra commented 3 years ago

> If there is an import failure with just onnxruntime_gpu installed, at least it is on the right track. I suspect you are missing some dependencies listed here - https://www.onnxruntime.ai/docs/reference/execution-providers/CUDA-ExecutionProvider.html#requirements. Make sure these are installed before trying the import again.

@hariharans29 Yes, the import fails with just onnxruntime_gpu. Let me check the dependencies and report back if that solves the issue.

appreciate your time & prompt responses - thanks

anshoomehra commented 3 years ago

@hariharans29 I tried uninstalling onnxruntime and the imports start to fail. Should the import paths remain the same after the uninstall?

from onnxruntime import (
    ExecutionMode,
    GraphOptimizationLevel,
    InferenceSession,
    SessionOptions,
)

[screenshots: import error tracebacks]

Also, I tried looking at the dependencies; these do not look like Python libraries. Can you please guide me on how to verify their existence and versions?

[screenshot: list of required CUDA/cuDNN libraries]

anshoomehra commented 3 years ago

@hariharans29 It worked!! I believe the issue was the following:

I originally installed onnx, onnxruntime, and onnxruntime-gpu in that sequence and later uninstalled only onnxruntime, which created the import issues. This time I uninstalled onnx as well and installed just onnxruntime-gpu from scratch, and that made it work, so it seems onnx was conflicting with the libs...

Thank you so much for helping me get to this stage :-)

I see one issue with this new setup, though: one of the imports for the quantization libs still fails, shown below. Any suggestions on how to resolve this?

[screenshot: quantization import error]

hariharans29 commented 3 years ago

@yufenglee - would you know about these quantization libs?

@anshoomehra - The libraries listed above are the CUDA runtime and cuDNN libs. There is documentation on the web that should help you check for these. They are dependencies for our GPU builds.
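On Linux, one way to check for them is to try loading the shared libraries directly; a minimal sketch, assuming the CUDA 11.0 / cuDNN 8 library names that the onnxruntime-gpu 1.7 requirements page lists (adjust the names to your versions):

import ctypes

# Attempt to dlopen each CUDA/cuDNN shared library the GPU build depends on.
for lib in ("libcudart.so.11.0", "libcublas.so.11", "libcudnn.so.8"):
    try:
        ctypes.CDLL(lib)
        print(f"{lib}: found")
    except OSError:
        print(f"{lib}: NOT found")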

anshoomehra commented 3 years ago

@hariharans29 I am running into another error; not sure if you would suggest opening a new issue for this?

I am able to export the model successfully with 'CUDAExecutionProvider'; however, running inference on the GPU does not seem to load all variables onto the GPU, resulting in the error below. Are there any other steps I need, apart from switching the model and tokenizer devices to GPU, to make it run fully on the GPU?

[screenshots: runtime error during GPU inference]
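One relevant detail for running fully on the GPU (a hedged sketch, not the poster's code): numpy inputs always live in CPU memory, so ONNX Runtime copies them to the device on every run; IO binding keeps tensors on the GPU. The model path and the input name "input_ids" below are placeholders.

import numpy as np
import onnxruntime as ort

session = ort.InferenceSession("model.onnx", providers=["CUDAExecutionProvider", "CPUExecutionProvider"])
io_binding = session.io_binding()

# Put the input on GPU 0 once instead of copying from numpy on every run.
input_ids = np.ones((1, 128), dtype=np.int64)  # placeholder input
ort_value = ort.OrtValue.ortvalue_from_numpy(input_ids, "cuda", 0)
io_binding.bind_ortvalue_input("input_ids", ort_value)

# Let ORT allocate the output on the GPU as well.
io_binding.bind_output(session.get_outputs()[0].name, "cuda")

session.run_with_iobinding(io_binding)
outputs = io_binding.copy_outputs_to_cpu()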

hariharans29 commented 3 years ago

Please open a new issue with more details and a complete repro script

anshoomehra commented 3 years ago

@hariharans29 logged new issue as per your suggestion.

We can close this once @yufenglee resolves the quantization libraries import issue with onnxruntime-gpu.

Thanks again for all your help.

FrancescoSaverioZuppichini commented 2 years ago

I have the same issue, using the NVIDIA container nvcr.io/nvidia/pytorch:22.08-py3.

Installing onnxruntime-gpu==1.12 with pip fixed the issue!

[EDIT] It looks like it's random; it's not always working.

Ahmet0691 commented 1 year ago

from optimum.onnxruntime import ORTModelForQuestionAnswering
from transformers import pipeline, AutoTokenizer
import time
import torch

onnx_path = "C:\\Users\\Users\\PycharmProjects\\pythonProject10\\onnx"
model = ORTModelForQuestionAnswering.from_pretrained(onnx_path, file_name="model_quantized.onnx", device=0)
tokenizer = AutoTokenizer.from_pretrained(onnx_path)
nlp = pipeline("question-answering", model=model, tokenizer=tokenizer, device=0)

question = input("any questions : ")
with open('text.txt', 'r', encoding='utf8') as file:
    context = file.read()

start_time = time.time()
result = nlp(question=question, context=context)
end_time = time.time()

print(result)
print("Latency:", end_time - start_time)

Error message:

Traceback (most recent call last):
  File "C:\Users\Users\PycharmProjects\pythonProject10\quentiz.py", line 8, in <module>
    nlp = pipeline("question-answering", model=model, tokenizer=tokenizer, device=0)
  File "C:\Users\Users\.virtualenvs\pythonProject10\lib\site-packages\transformers\pipelines\__init__.py", line 903, in pipeline
    return pipeline_class(model=model, framework=framework, task=task, **kwargs)
  File "C:\Users\Users\.virtualenvs\pythonProject10\lib\site-packages\transformers\pipelines\question_answering.py", line 269, in __init__
    **kwargs,
  File "C:\Users\Users\.virtualenvs\pythonProject10\lib\site-packages\transformers\pipelines\base.py", line 780, in __init__
    self.model = self.model.to(self.device)
  File "C:\Users\Users\.virtualenvs\pythonProject10\lib\site-packages\optimum\onnxruntime\modeling_ort.py", line 269, in to
    validate_provider_availability(provider)  # raise error if the provider is not available
  File "C:\Users\Users\.virtualenvs\pythonProject10\lib\site-packages\optimum\onnxruntime\utils.py", line 197, in validate_provider_availability
    f"Asked to use {provider}, but onnxruntime-gpu package was not found. Make sure to install onnxruntime-gpu package instead of onnxruntime."
ImportError: Asked to use CUDAExecutionProvider, but onnxruntime-gpu package was not found. Make sure to install onnxruntime-gpu package instead of onnxruntime.

Process finished with exit code 1

I uninstalled onnxruntime and removed onnxruntime-gpu, then reinstalled, and tried these steps multiple times, but could not find a solution. Please help.

thawro commented 1 year ago

In my case the following helped:

  1. uninstall onnxruntime
  2. uninstall onnxruntime-gpu
  3. install optimum[onnxruntime-gpu]

more here
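A follow-up check along those lines (a hedged sketch; the directory and file names are placeholders, and the .model attribute is assumed to be the underlying InferenceSession, as in optimum's modeling_ort):

from optimum.onnxruntime import ORTModelForQuestionAnswering

model = ORTModelForQuestionAnswering.from_pretrained("onnx_dir", file_name="model_quantized.onnx")
model = model.to("cuda")  # raises if onnxruntime-gpu is not importable
print(model.model.get_providers())  # providers attached to the underlying session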

takahashi-shotaro-al commented 11 months ago

it works for me

  1. pip uninstall onnxruntime onnxruntime-gpu
  2. pip install onnxruntime-gpu

merveenoyan commented 10 months ago

I'm having the same error, and I have never installed onnxruntime or onnx, only onnxruntime-gpu. I also checked onnxruntime-gpu against CUDA compatibility. This is on the latest version of Colab.

import onnxruntime

providers = onnxruntime.get_available_providers()
print(providers)

#  ['TensorrtExecutionProvider', 'CUDAExecutionProvider', 'AzureExecutionProvider', 'CPUExecutionProvider']

and I didn't even notice that the session wasn't utilizing the GPU until benchmarking against CPU, so I did:

X_ortvalue = onnxruntime.OrtValue.ortvalue_from_numpy(input_data, 'cuda', 0)

and started getting:

RuntimeError: /onnxruntime_src/onnxruntime/core/session/provider_bridge_ort.cc:1554 onnxruntime::ProviderInfo_CUDA& onnxruntime::GetProviderInfo_CUDA() CUDA Provider not available, can't get interface for it
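A symptom like this usually means the wheel advertises the CUDA provider but fails to load it when a session or device OrtValue is created, often due to a CUDA/cuDNN version mismatch with the environment. A minimal diagnostic sketch (the model path is a placeholder):

import onnxruntime as ort

print(ort.__version__)
print(ort.get_device())  # 'GPU' for a GPU-enabled build of the wheel

# Creating a session surfaces the real load error and shows what attached.
sess = ort.InferenceSession("model.onnx", providers=["CUDAExecutionProvider", "CPUExecutionProvider"])
print(sess.get_providers())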

Do7and commented 8 months ago

> it works for me
>
>   1. pip uninstall onnxruntime onnxruntime-gpu
>   2. pip install onnxruntime-gpu

Works for me, thank you

JHW5981 commented 2 months ago

> it works for me
>
>   1. pip uninstall onnxruntime onnxruntime-gpu
>   2. pip install onnxruntime-gpu

I tried this, but it still fails with the error: "ValueError: Asked to use CUDAExecutionProvider as an ONNX Runtime execution provider, but the available execution providers are ['AzureExecutionProvider', 'CPUExecutionProvider']."
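Seeing only AzureExecutionProvider and CPUExecutionProvider suggests the CPU wheel, or leftovers of it, is still what gets imported. A minimal check (see the clean-up note in the next comment):

import onnxruntime

# If this path points into a stale or CPU-only install, remove it and
# reinstall onnxruntime-gpu.
print(onnxruntime.__file__)
print(onnxruntime.__version__)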

Asachanel commented 1 day ago

I had the same issue. I uninstalled onnx, onnxruntime, and onnxruntime-gpu, cleaned up the uninstall remnants inside site-packages, installed only onnxruntime-gpu, and it finally worked.