slliugit opened this issue 1 month ago (Open)
Failed to allocate memory for requested buffer of size 51380224
What is the size of your GPU memory?
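A quick way to answer that from Python, assuming a reasonably recent PyTorch (`torch.cuda.mem_get_info` wraps CUDA's `cudaMemGetInfo`):

```python
import torch

# Query free and total memory (in bytes) on the current CUDA device.
free, total = torch.cuda.mem_get_info()
print(f"GPU 0: {free / 1024**3:.2f} GiB free of {total / 1024**3:.2f} GiB total")
```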
```
Traceback (most recent call last):
  File "/home/chenghaonan/lsl/hallo2/scripts/inference_long.py", line 511, in <module>
    save_path = inference_process(command_line_args)
  File "/home/chenghaonan/lsl/hallo2/scripts/inference_long.py", line 212, in inference_process
    with ImageProcessor(img_size, face_analysis_model_path) as image_processor:
  File "/home/chenghaonan/lsl/hallo2/hallo/datasets/image_processor.py", line 100, in __init__
    self.face_analysis = FaceAnalysis(
  File "/home/chenghaonan/.conda/envs/lsl-hallo2/lib/python3.10/site-packages/insightface/app/face_analysis.py", line 31, in __init__
    model = model_zoo.get_model(onnx_file, **kwargs)
  File "/home/chenghaonan/.conda/envs/lsl-hallo2/lib/python3.10/site-packages/insightface/model_zoo/model_zoo.py", line 96, in get_model
    model = router.get_model(providers=providers, provider_options=provider_options)
  File "/home/chenghaonan/.conda/envs/lsl-hallo2/lib/python3.10/site-packages/insightface/model_zoo/model_zoo.py", line 40, in get_model
    session = PickableInferenceSession(self.onnx_file, **kwargs)
  File "/home/chenghaonan/.conda/envs/lsl-hallo2/lib/python3.10/site-packages/insightface/model_zoo/model_zoo.py", line 25, in __init__
    super().__init__(model_path, **kwargs)
  File "/home/chenghaonan/.conda/envs/lsl-hallo2/lib/python3.10/site-packages/onnxruntime/capi/onnxruntime_inference_collection.py", line 419, in __init__
    self._create_inference_session(providers, provider_options, disabled_optimizers)
  File "/home/chenghaonan/.conda/envs/lsl-hallo2/lib/python3.10/site-packages/onnxruntime/capi/onnxruntime_inference_collection.py", line 483, in _create_inference_session
    sess.initialize_session(providers, provider_options, disabled_optimizers)
onnxruntime.capi.onnxruntime_pybind11_state.RuntimeException: [ONNXRuntimeError] : 6 : RUNTIME_EXCEPTION : Exception during initialization: /onnxruntime_src/onnxruntime/core/framework/bfc_arena.cc:376 void onnxruntime::BFCArena::AllocateRawInternal(size_t, bool, onnxruntime::Stream*, bool, onnxruntime::WaitNotificationFn) Failed to allocate memory for requested buffer of size 51380224
```

The error appears when following the provided instructions: `python scripts/inference_long.py --config ./configs/inference/long.yaml`.
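One hedged workaround, not from the thread: the failing buffer is only ~49 MiB (51380224 bytes), so if the GPU is too full even for that, the insightface face-analysis models can be pinned to the CPU execution provider. In hallo2 this would mean adjusting the `FaceAnalysis(...)` call in `hallo/datasets/image_processor.py` (line 100 in the trace above); the `name` and `det_size` values below are illustrative defaults, not hallo2's actual arguments:

```python
from insightface.app import FaceAnalysis

# Sketch: force face analysis onto the CPU so onnxruntime does not
# try to allocate GPU memory for the detector at session creation.
app = FaceAnalysis(
    name="buffalo_l",                    # assumption: default insightface model pack
    providers=["CPUExecutionProvider"],  # skip CUDAExecutionProvider entirely
)
app.prepare(ctx_id=-1, det_size=(640, 640))  # ctx_id=-1 also selects CPU
```

Face detection on CPU is slower, but it only runs once per source image, so the cost during long-video inference should be negligible.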
Something strange is going on. I reinstalled onnxruntime-gpu as follows:

```
pip uninstall onnxruntime-gpu
pip install -i https://pypi.tuna.tsinghua.edu.cn/simple onnxruntime-gpu
```

That seems to have fixed the problem above, but it revealed a new one:

```
RuntimeError: GET was unable to find an engine to execute this computation
```

I tried adding the following line to inference_long.py:

```python
torch.backends.cudnn.enabled = False
```

after which it shows:

```
torch.cuda.OutOfMemoryError: CUDA out of memory. Tried to allocate 26.00 MiB. GPU 0 has a total capacity of 23.70 GiB of which 9.50 MiB is free. Process 2932 has 725.00 MiB memory in use. Process 591269 has 21.20 GiB memory in use. Including non-PyTorch memory, this process has 1.77 GiB memory in use. Of the allocated memory 630.15 MiB is allocated by PyTorch, and 35.85 MiB is reserved by PyTorch but unallocated. If reserved but unallocated memory is large try setting PYTORCH_CUDA_ALLOC_CONF=expandable_segments:True to avoid fragmentation.
```
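For the fragmentation hint at the end of that trace, `PYTORCH_CUDA_ALLOC_CONF` must be set before PyTorch makes its first CUDA allocation. A minimal sketch: either `export PYTORCH_CUDA_ALLOC_CONF=expandable_segments:True` in the shell before launching, or set it at the very top of the script:

```python
import os

# Must run before any CUDA allocation, i.e. at the very top of
# inference_long.py, before torch touches the GPU.
os.environ.setdefault("PYTORCH_CUDA_ALLOC_CONF", "expandable_segments:True")

import torch  # noqa: E402  (import after setting the allocator config)
```

Note this only helps with fragmentation; it cannot free memory held by other processes.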
It's weird. The inference process uses about 12-13 GB of VRAM, so 20+ GB should be enough for inference. 🤕
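Worth noting from the OOM trace above: process 591269 already holds 21.20 GiB on GPU 0 and process 2932 another 725 MiB, so only about 9.5 MiB is actually free, regardless of what hallo2 itself needs. A hedged sketch for listing per-process usage before launching, assuming the `nvidia-ml-py` (pynvml) package is installed:

```python
import pynvml

# Enumerate compute processes on GPU 0 and their memory usage.
pynvml.nvmlInit()
handle = pynvml.nvmlDeviceGetHandleByIndex(0)
for proc in pynvml.nvmlDeviceGetComputeRunningProcesses(handle):
    used_gib = proc.usedGpuMemory / 1024**3 if proc.usedGpuMemory else 0.0
    print(f"pid={proc.pid} uses {used_gib:.2f} GiB")
pynvml.nvmlShutdown()
```

If another process is holding the card, killing it (or selecting a different GPU via `CUDA_VISIBLE_DEVICES`) would resolve this particular OOM.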
Can you upload your test case (startup command, config file, input data, and logs)? I'll try to reproduce your setup.