microsoft / onnxruntime

ONNX Runtime: cross-platform, high performance ML inferencing and training accelerator
https://onnxruntime.ai
MIT License

EP Error /onnxruntime_src/onnxruntime/core/providers/cuda/cuda_call.cc:123 #21435

Closed skinnynpale closed 2 months ago

skinnynpale commented 3 months ago

Describe the issue

```
EP Error /onnxruntime_src/onnxruntime/core/providers/cuda/cuda_call.cc:123
  std::conditional_t<THRW, void, onnxruntime::common::Status> onnxruntime::CudaCall(ERRTYPE, const char*, const char*, ERRTYPE, const char*, const char*, int)
  [with ERRTYPE = cudaError; bool THRW = true; std::conditional_t<THRW, void, onnxruntime::common::Status> = void]
/onnxruntime_src/onnxruntime/core/providers/cuda/cuda_call.cc:116
  std::conditional_t<THRW, void, onnxruntime::common::Status> onnxruntime::CudaCall(ERRTYPE, const char*, const char*, ERRTYPE, const char*, const char*, int)
  [with ERRTYPE = cudaError; bool THRW = true; std::conditional_t<THRW, void, onnxruntime::common::Status> = void]
CUDA failure 804: forward compatibility was attempted on non supported HW ; GPU=22179 ; hostname=9407c20fa6b6 ;
file=/onnxruntime_src/onnxruntime/core/providers/cuda/cuda_execution_provider.cc ; line=280 ; expr=cudaSetDevice(info.device_id);

when using ['CUDAExecutionProvider']
Falling back to ['CUDAExecutionProvider', 'CPUExecutionProvider'] and retrying.
```


```
2024-07-21 23:37:56.704 Uncaught app exception
Traceback (most recent call last):
  File "/root/sasha-ai-workflow/src/chat-service/venv/lib/python3.10/site-packages/onnxruntime/capi/onnxruntime_inference_collection.py", line 419, in __init__
    self._create_inference_session(providers, provider_options, disabled_optimizers)
  File "/root/sasha-ai-workflow/src/chat-service/venv/lib/python3.10/site-packages/onnxruntime/capi/onnxruntime_inference_collection.py", line 483, in _create_inference_session
    sess.initialize_session(providers, provider_options, disabled_optimizers)
RuntimeError: /onnxruntime_src/onnxruntime/core/providers/cuda/cuda_call.cc:123 std::conditional_t<THRW, void, onnxruntime::common::Status> onnxruntime::CudaCall(ERRTYPE, const char*, const char*, ERRTYPE, const char*, const char*, int) [with ERRTYPE = cudaError; bool THRW = true; std::conditional_t<THRW, void, onnxruntime::common::Status> = void]
/onnxruntime_src/onnxruntime/core/providers/cuda/cuda_call.cc:116 std::conditional_t<THRW, void, onnxruntime::common::Status> onnxruntime::CudaCall(ERRTYPE, const char*, const char*, ERRTYPE, const char*, const char*, int) [with ERRTYPE = cudaError; bool THRW = true; std::conditional_t<THRW, void, onnxruntime::common::Status> = void] CUDA failure 804: forward compatibility was attempted on non supported HW ; GPU=22179 ; hostname=9407c20fa6b6 ; file=/onnxruntime_src/onnxruntime/core/providers/cuda/cuda_execution_provider.cc ; line=280 ; expr=cudaSetDevice(info.device_id);
```

The above exception was the direct cause of the following exception:

```
Traceback (most recent call last):
  File "/root/sasha-ai-workflow/src/chat-service/venv/lib/python3.10/site-packages/streamlit/runtime/scriptrunner/script_runner.py", line 589, in _run_script
    exec(code, module.__dict__)
  File "/root/sasha-ai-workflow/src/chat-service/src/v1-streamlit.py", line 6, in <module>
    from EmbeddingCache import EmbeddingCache
  File "/root/sasha-ai-workflow/src/chat-service/src/EmbeddingCache.py", line 40, in <module>
    embedding_model = TextEmbedding(
  File "/root/sasha-ai-workflow/src/chat-service/venv/lib/python3.10/site-packages/fastembed/text/text_embedding.py", line 61, in __init__
    self.model = EMBEDDING_MODEL_TYPE(
  File "/root/sasha-ai-workflow/src/chat-service/venv/lib/python3.10/site-packages/fastembed/text/onnx_embedding.py", line 237, in __init__
    self.load_onnx_model(
  File "/root/sasha-ai-workflow/src/chat-service/venv/lib/python3.10/site-packages/fastembed/common/onnx_model.py", line 80, in load_onnx_model
    self.model = ort.InferenceSession(
  File "/root/sasha-ai-workflow/src/chat-service/venv/lib/python3.10/site-packages/onnxruntime/capi/onnxruntime_inference_collection.py", line 432, in __init__
    raise fallback_error from e
  File "/root/sasha-ai-workflow/src/chat-service/venv/lib/python3.10/site-packages/onnxruntime/capi/onnxruntime_inference_collection.py", line 427, in __init__
    self._create_inference_session(self._fallback_providers, None)
  File "/root/sasha-ai-workflow/src/chat-service/venv/lib/python3.10/site-packages/onnxruntime/capi/onnxruntime_inference_collection.py", line 483, in _create_inference_session
    sess.initialize_session(providers, provider_options, disabled_optimizers)
RuntimeError: /onnxruntime_src/onnxruntime/core/providers/cuda/cuda_call.cc:123 std::conditional_t<THRW, void, onnxruntime::common::Status> onnxruntime::CudaCall(ERRTYPE, const char*, const char*, ERRTYPE, const char*, const char*, int) [with ERRTYPE = cudaError; bool THRW = true; std::conditional_t<THRW, void, onnxruntime::common::Status> = void]
/onnxruntime_src/onnxruntime/core/providers/cuda/cuda_call.cc:116 std::conditional_t<THRW, void, onnxruntime::common::Status> onnxruntime::CudaCall(ERRTYPE, const char*, const char*, ERRTYPE, const char*, const char*, int) [with ERRTYPE = cudaError; bool THRW = true; std::conditional_t<THRW, void, onnxruntime::common::Status> = void] CUDA failure 804: forward compatibility was attempted on non supported HW ; GPU=22179 ; hostname=9407c20fa6b6 ; file=/onnxruntime_src/onnxruntime/core/providers/cuda/cuda_execution_provider.cc ; line=280 ; expr=cudaSetDevice(info.device_id);
```
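Note that the retry fails too, because the fallback list `['CUDAExecutionProvider', 'CPUExecutionProvider']` still contains the broken CUDA provider. Until the driver is fixed, one workaround is to only request providers that the installed package actually exposes. The sketch below uses a hypothetical helper `select_providers` (not part of onnxruntime or fastembed), intended to be fed the output of `onnxruntime.get_available_providers()`; it cannot detect a broken driver, only a missing provider:

```python
def select_providers(available, preferred=("CUDAExecutionProvider",)):
    """Build a provider list for ort.InferenceSession / TextEmbedding.

    `available` is the output of onnxruntime.get_available_providers();
    `preferred` providers are kept only if present, and CPU is always
    appended as a safe last resort.
    """
    chosen = [p for p in preferred if p in available]
    chosen.append("CPUExecutionProvider")  # CPU fallback always works
    return chosen
```

For example, on a box where only the CPU provider is available, `select_providers(["CPUExecutionProvider"])` returns `["CPUExecutionProvider"]`, so session creation never attempts `cudaSetDevice` at all.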

To reproduce

```
!pip install onnxruntime-gpu -i https://aiinfra.pkgs.visualstudio.com/PublicPackages/_packaging/onnxruntime-cuda-12/pypi/simple/ -qq
!pip install fastembed-gpu -qqq
```

```python
from typing import List

import numpy as np

from fastembed import TextEmbedding

embedding_model_gpu = TextEmbedding(
    model_name="BAAI/bge-small-en-v1.5",
    providers=["CUDAExecutionProvider"],
)
embedding_model_gpu.model.model.get_providers()
```

Urgency

No response

Platform

Linux

OS Version

Ubuntu 22.04

ONNX Runtime Installation

Built from Source

ONNX Runtime Version or Commit ID

1.18.1

ONNX Runtime API

Python

Architecture

X64

Execution Provider

CUDA

Execution Provider Library Version

CUDA 12.4

tianleiwu commented 3 months ago

For `CUDA failure 804: forward compatibility was attempted on non supported HW`, the recommendation is to update your CUDA driver. See https://forums.developer.nvidia.com/t/forward-compatibility-was-attempted-on-non-supported-hw/204254/6

Try installing driver 555 from https://developer.nvidia.com/cuda-downloads?target_os=Linux&target_arch=x86_64&Distribution=Ubuntu&target_version=22.04&target_type=deb_local
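Before reinstalling, it can help to check what driver is actually loaded, e.g. via `nvidia-smi --query-gpu=driver_version --format=csv,noheader`. As a rough guide, NVIDIA's compatibility tables list 525.60.13 as the minimum Linux driver for the CUDA 12.x series (verify against current NVIDIA documentation; newer 12.x toolkits recommend newer drivers such as the 555 branch). A small hypothetical helper to compare the reported version against that floor:

```python
def driver_supports_cuda12(driver_version: str) -> bool:
    """Return True if an NVIDIA driver version string (e.g. "555.42.06")
    meets the assumed minimum Linux driver for CUDA 12.x, 525.60.13."""
    parts = tuple(int(p) for p in driver_version.strip().split("."))
    minimum = (525, 60, 13)
    # Pad short version strings (e.g. "555.42") so tuple comparison is fair.
    parts += (0,) * (len(minimum) - len(parts))
    return parts >= minimum
```

If this returns False for the installed driver, the `CUDA failure 804` above is expected, and updating the driver as suggested is the fix.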