Live works, single image works, video conversion doesn't

theSplund commented 2 months ago

CUDA 11.8 is installed:

nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2022 NVIDIA Corporation
Built on Wed_Sep_21_10:41:10_Pacific_Daylight_Time_2022
Cuda compilation tools, release 11.8, V11.8.89
Build cuda_11.8.r11.8/compiler.31833905_0

cudnn 9.2.0
Python 3.10.10

Specs: RTX2060S - 8GB; i5 9600K; 16GB RAM; installed on an SSD; Windows 10 64-bit

Single image works; Live works well (most impressed); but video conversion doesn't. I've set it to the default of all switches off, except audio (though I have tested it with other combinations). It creates a temp folder and subfolder and extracts the PNG images, but then I get a batch of errors starting with this:

[DLC.CORE] Creating temp resources...
[DLC.CORE] Extracting frames...
[DLC.FACE-SWAPPER] Progressing...
Processing:   0%| | 0/102 [00:00<?, ?frame/s, execution_providers=['CUDAExecutionProvider'], execution_threads=60, max_[ONNXRuntimeError] : 1 : FAIL : D:\a\_work\1\s\onnxruntime\core\providers\cuda\cuda_call.cc:121 onnxruntime::CudaCall D:\a\_work\1\s\onnxruntime\core\providers\cuda\cuda_call.cc:114 onnxruntime::CudaCall CUBLAS failure 3: CUBLAS_STATUS_ALLOC_FAILED ; GPU=0 ; hostname=DESKTOP-IH921FK ; file=D:\a\_work\1\s\onnxruntime\core\providers\cuda\cuda_stream_handle.cc ; line=50 ; expr=cublasCreate(&cublas_handle_);

with several repeated similar errors and finally ending with this:

Processing:  24%|▏| 24/102 [00:03<00:09,  7.94frame/s, execution_providers=['CUDAExecutionProvider'], execution_threads2024-08-17 11:19:16.2982638 [E:onnxruntime:, inference_session.cc:1645 onnxruntime::InferenceSession::Initialize::<lambda_eb486adf513608dcd45c034ea7ffb8e8>::operator ()] Exception during initialization: D:\a\_work\1\s\onnxruntime\core\providers\cuda\cuda_call.cc:121 onnxruntime::CudaCall D:\a\_work\1\s\onnxruntime\core\providers\cuda\cuda_call.cc:114 onnxruntime::CudaCall CUBLAS failure 3: CUBLAS_STATUS_ALLOC_FAILED ; GPU=0 ; hostname=DESKTOP-IH921FK ; file=D:\a\_work\1\s\onnxruntime\core\providers\cuda\cuda_execution_provider.cc ; line=164 ; expr=cublasCreate(&cublas_handle_);

[ONNXRuntimeError] : 6 : RUNTIME_EXCEPTION : Exception during initialization: D:\a\_work\1\s\onnxruntime\core\providers\cuda\cuda_call.cc:121 onnxruntime::CudaCall D:\a\_work\1\s\onnxruntime\core\providers\cuda\cuda_call.cc:114 onnxruntime::CudaCall CUBLAS failure 3: CUBLAS_STATUS_ALLOC_FAILED ; GPU=0 ; hostname=DESKTOP-IH921FK ; file=D:\a\_work\1\s\onnxruntime\core\providers\cuda\cuda_execution_provider.cc ; line=164 ; expr=cublasCreate(&cublas_handle_);

Processing:  25%|▏| 25/102 [00:03<00:11,  6.43frame/s, execution_providers=['CUDAExecutionProvider'], execution_threads
Exception in Tkinter callback
Traceback (most recent call last):
  File "T:\Deep-Live-Cam-cuda\python\lib\tkinter\__init__.py", line 1921, in __call__
    return self.func(*args)
  File "T:\Deep-Live-Cam-cuda\python\lib\site-packages\customtkinter\windows\widgets\ctk_button.py", line 554, in _clicked
    self._command()
  File "T:\Deep-Live-Cam-cuda\modules\ui.py", line 95, in <lambda>
    start_button = ctk.CTkButton(root, text='Start', cursor='hand2', command=lambda: select_output_path(start))
  File "T:\Deep-Live-Cam-cuda\modules\ui.py", line 192, in select_output_path
    start()
  File "T:\Deep-Live-Cam-cuda\modules\core.py", line 200, in start
    frame_processor.process_video(modules.globals.source_path, temp_frame_paths)
  File "T:\Deep-Live-Cam-cuda\modules\processors\frame\face_swapper.py", line 86, in process_video
    modules.processors.frame.core.process_video(source_path, temp_frame_paths, process_frames)
  File "T:\Deep-Live-Cam-cuda\modules\processors\frame\core.py", line 72, in process_video
    multi_process_frame(source_path, frame_paths, process_frames, progress)
  File "T:\Deep-Live-Cam-cuda\modules\processors\frame\core.py", line 64, in multi_process_frame
    future.result()
  File "T:\Deep-Live-Cam-cuda\python\lib\concurrent\futures\_base.py", line 458, in result
    return self.__get_result()
  File "T:\Deep-Live-Cam-cuda\python\lib\concurrent\futures\_base.py", line 403, in __get_result
    raise self._exception
  File "T:\Deep-Live-Cam-cuda\python\lib\concurrent\futures\thread.py", line 58, in run
    result = self.fn(*self.args, **self.kwargs)
  File "T:\Deep-Live-Cam-cuda\modules\processors\frame\face_swapper.py", line 65, in process_frames
    source_face = get_one_face(cv2.imread(source_path))
  File "T:\Deep-Live-Cam-cuda\modules\face_analyser.py", line 20, in get_one_face
    face = get_face_analyser().get(frame)
  File "T:\Deep-Live-Cam-cuda\python\lib\site-packages\insightface\app\face_analysis.py", line 75, in get
    model.get(img, face)
  File "T:\Deep-Live-Cam-cuda\python\lib\site-packages\insightface\model_zoo\arcface_onnx.py", line 67, in get
    face.embedding = self.get_feat(aimg).flatten()
  File "T:\Deep-Live-Cam-cuda\python\lib\site-packages\insightface\model_zoo\arcface_onnx.py", line 84, in get_feat
    net_out = self.session.run(self.output_names, {self.input_name: blob})[0]
  File "T:\Deep-Live-Cam-cuda\python\lib\site-packages\onnxruntime\capi\onnxruntime_inference_collection.py", line 217, in run
    return self._sess.run(output_names, input_feed, run_options)
onnxruntime.capi.onnxruntime_pybind11_state.Fail: [ONNXRuntimeError] : 1 : FAIL : D:\a\_work\1\s\onnxruntime\core\providers\cuda\cuda_call.cc:121 onnxruntime::CudaCall D:\a\_work\1\s\onnxruntime\core\providers\cuda\cuda_call.cc:114 onnxruntime::CudaCall CUBLAS failure 3: CUBLAS_STATUS_ALLOC_FAILED ; GPU=0 ; hostname=DESKTOP-IH921FK ; file=D:\a\_work\1\s\onnxruntime\core\providers\cuda\cuda_stream_handle.cc ; line=50 ; expr=cublasCreate(&cublas_handle_);

The above was with a 3-second 1280x720 MP4, but I also created a half-second, 20-frame, 360x240 MP4 (using ffmpeg), which failed after 50%. I also tried a 50-frame video, which got as far as 73% before failing. Any idea why this is happening?

Pascal2708 commented 2 months ago

Similar problem for me. I reinstalled everything today and everything works with pictures.

With videos, however, the GUI hangs and shows no progress. In PowerShell everything runs, but at 100% it aborts without any message and the GUI closes. The preview works.

Ryzen 3700X 32GB DDR4 RTX 3080 TI

D2OKAY commented 1 month ago

I had to debug this a bit, and luckily I was able to find the cause.

In the repo, go to modules > utilities.py, where you will find the create_video function. I had to change it a bit to figure out why the video was not being created. Chances are that your ffmpeg doesn't have the necessary encoder installed (for me it was that).

In a terminal, run:

ffmpeg -encoders | grep

This will show which encoders are installed. For this repo, the encoders are:

(default) libx264 <-- default encoder
libx265, libvpx-vp9 <-- these are also set as 'choices'; I'm not sure how the code selects between them.

I didn't have the default encoder in my ffmpeg (I used pip install ffmpeg), so I had to install it another way. Try: (macOS) brew install ffmpeg, or (conda) conda install -c conda-forge ffmpeg
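For anyone without grep (for example on a Windows command prompt), the same check can be sketched in Python. This is a minimal sketch, assuming ffmpeg is on PATH and that `ffmpeg -encoders` prints its usual table after a `------` separator line; `parse_encoders` and `installed_encoders` are hypothetical helper names, not part of the repo.

```python
import shutil
import subprocess

# Encoder names mentioned in this thread.
WANTED = ("libx264", "libx265", "libvpx-vp9")


def parse_encoders(listing: str) -> set[str]:
    """Extract encoder names from the text printed by `ffmpeg -encoders`."""
    names = set()
    in_table = False
    for line in listing.splitlines():
        stripped = line.strip()
        if stripped.startswith("------"):
            in_table = True  # the encoder table starts after this separator
            continue
        parts = stripped.split()
        if in_table and len(parts) >= 2:
            names.add(parts[1])  # column 0 is the flags field, column 1 the name
    return names


def installed_encoders() -> set[str]:
    """Run ffmpeg (assumed to be on PATH) and return its encoder names."""
    if shutil.which("ffmpeg") is None:
        raise RuntimeError("ffmpeg not found on PATH")
    result = subprocess.run(
        ["ffmpeg", "-hide_banner", "-encoders"],
        capture_output=True, text=True, check=True,
    )
    return parse_encoders(result.stdout)
```

Calling `installed_encoders()` and checking each name in `WANTED` against the returned set shows which of the three encoders is missing, if any.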

theSplund commented 1 month ago

I guess that, as you've used the terms 'terminal' and 'grep', this is Linux-oriented? I couldn't get it to run in Linux yet, only Win10, but I did check 'ffmpeg -encoders' in a command window (grep wasn't available) and these were amongst those reported:

V....D libx264 libx264 H.264 / AVC / MPEG-4 AVC / MPEG-4 part 10 (codec h264)
V....D libx265 libx265 H.265 / HEVC (codec hevc)
V....D libvpx-vp9 libvpx VP9 (codec vp9)

so that might not be the fix. The fact that the application appears to successfully convert the movie into .png files, but then fails partway through replacing faces in the .pngs, leads me to suspect that a codec issue isn't at the heart of it (TBH I suspect a memory issue, but it's a guess). Thanks anyway.
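One way to test the memory guess is to watch GPU memory while the conversion runs. The following is a minimal sketch, assuming `nvidia-smi` is on PATH; `parse_memory` and `gpu_memory` are hypothetical helper names used only for illustration.

```python
import subprocess


def parse_memory(csv_text: str) -> list[tuple[int, int]]:
    """Parse 'used, total' pairs in MiB, one GPU per line."""
    pairs = []
    for line in csv_text.strip().splitlines():
        used, total = (int(field.strip()) for field in line.split(","))
        pairs.append((used, total))
    return pairs


def gpu_memory() -> list[tuple[int, int]]:
    """Query nvidia-smi (assumed to be on PATH) for used and total VRAM."""
    out = subprocess.run(
        [
            "nvidia-smi",
            "--query-gpu=memory.used,memory.total",
            "--format=csv,noheader,nounits",
        ],
        capture_output=True, text=True, check=True,
    ).stdout
    return parse_memory(out)
```

Printing `gpu_memory()` once a second in a separate window during a video run would show whether usage climbs towards the card's total just before the CUBLAS_STATUS_ALLOC_FAILED errors appear.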

theSplund commented 1 month ago

Well, I got bored with this and decided to save some SSD space, and so I moved it to an HDD. A week or so later I tried it again, and it works with movies! Very strange. Closing it

Lxtharia commented 1 day ago

I'm facing the same issue on the stable release (Arch Linux, cuda 11.8.0-1, Python 3.10). Live works, single image works, but processing a video does not.

This is the output I'm getting when calling:

python run.py --execution-provider cuda --source face.jpg --target ~/Videos/vid.mp4 --output ~/Desktop/out.mp4 --keep-audio --keep-fps


``` Applied providers: ['CUDAExecutionProvider', 'CPUExecutionProvider'], with options: {'CPUExecutionProvider': {}, 'CUDAExecutionProvider': {'tunable_op_max_tuning_duration_ms': '0', 'enable_skip_layer_norm_strict_mode': '0', 'tunable_op_tuning_enable': '0', 'device_id': '0', 'has_user_compute_stream': '0', 'gpu_mem_limit': '18446744073709551615', 'gpu_external_alloc': '0', 'gpu_external_free': '0', 'gpu_external_empty_cache': '0', 'cudnn_conv_algo_search': 'EXHAUSTIVE', 'cudnn_conv1d_pad_to_nc1d': '0', 'arena_extend_strategy': 'kNextPowerOfTwo', 'do_copy_in_default_stream': '1', 'enable_cuda_graph': '0', 'cudnn_conv_use_max_workspace': '1', 'tunable_op_enable': '0'}} find model: /home/lin/.insightface/models/buffalo_l/1k3d68.onnx landmark_3d_68 ['None', 3, 192, 192] 0.0 1.0 Applied providers: ['CUDAExecutionProvider', 'CPUExecutionProvider'], with options: {'CPUExecutionProvider': {}, 'CUDAExecutionProvider': {'tunable_op_max_tuning_duration_ms': '0', 'enable_skip_layer_norm_strict_mode': '0', 'tunable_op_tuning_enable': '0', 'device_id': '0', 'has_user_compute_stream': '0', 'gpu_mem_limit': '18446744073709551615', 'gpu_external_alloc': '0', 'gpu_external_free': '0', 'gpu_external_empty_cache': '0', 'cudnn_conv_algo_search': 'EXHAUSTIVE', 'cudnn_conv1d_pad_to_nc1d': '0', 'arena_extend_strategy': 'kNextPowerOfTwo', 'do_copy_in_default_stream': '1', 'enable_cuda_graph': '0', 'cudnn_conv_use_max_workspace': '1', 'tunable_op_enable': '0'}} find model: /home/lin/.insightface/models/buffalo_l/2d106det.onnx landmark_2d_106 ['None', 3, 192, 192] 0.0 1.0 Applied providers: ['CUDAExecutionProvider', 'CPUExecutionProvider'], with options: {'CPUExecutionProvider': {}, 'CUDAExecutionProvider': {'tunable_op_max_tuning_duration_ms': '0', 'enable_skip_layer_norm_strict_mode': '0', 'tunable_op_tuning_enable': '0', 'device_id': '0', 'has_user_compute_stream': '0', 'gpu_mem_limit': '18446744073709551615', 'gpu_external_alloc': '0', 'gpu_external_free': '0', 
'gpu_external_empty_cache': '0', 'cudnn_conv_algo_search': 'EXHAUSTIVE', 'cudnn_conv1d_pad_to_nc1d': '0', 'arena_extend_strategy': 'kNextPowerOfTwo', 'do_copy_in_default_stream': '1', 'enable_cuda_graph': '0', 'cudnn_conv_use_max_workspace': '1', 'tunable_op_enable': '0'}} find model: /home/lin/.insightface/models/buffalo_l/det_10g.onnx detection [1, 3, '?', '?'] 127.5 128.0 Applied providers: ['CUDAExecutionProvider', 'CPUExecutionProvider'], with options: {'CPUExecutionProvider': {}, 'CUDAExecutionProvider': {'tunable_op_max_tuning_duration_ms': '0', 'enable_skip_layer_norm_strict_mode': '0', 'tunable_op_tuning_enable': '0', 'device_id': '0', 'has_user_compute_stream': '0', 'gpu_mem_limit': '18446744073709551615', 'gpu_external_alloc': '0', 'gpu_external_free': '0', 'gpu_external_empty_cache': '0', 'cudnn_conv_algo_search': 'EXHAUSTIVE', 'cudnn_conv1d_pad_to_nc1d': '0', 'arena_extend_strategy': 'kNextPowerOfTwo', 'do_copy_in_default_stream': '1', 'enable_cuda_graph': '0', 'cudnn_conv_use_max_workspace': '1', 'tunable_op_enable': '0'}} find model: /home/lin/.insightface/models/buffalo_l/genderage.onnx genderage ['None', 3, 96, 96] 0.0 1.0 Applied providers: ['CUDAExecutionProvider', 'CPUExecutionProvider'], with options: {'CPUExecutionProvider': {}, 'CUDAExecutionProvider': {'tunable_op_max_tuning_duration_ms': '0', 'enable_skip_layer_norm_strict_mode': '0', 'tunable_op_tuning_enable': '0', 'device_id': '0', 'has_user_compute_stream': '0', 'gpu_mem_limit': '18446744073709551615', 'gpu_external_alloc': '0', 'gpu_external_free': '0', 'gpu_external_empty_cache': '0', 'cudnn_conv_algo_search': 'EXHAUSTIVE', 'cudnn_conv1d_pad_to_nc1d': '0', 'arena_extend_strategy': 'kNextPowerOfTwo', 'do_copy_in_default_stream': '1', 'enable_cuda_graph': '0', 'cudnn_conv_use_max_workspace': '1', 'tunable_op_enable': '0'}} find model: /home/lin/.insightface/models/buffalo_l/w600k_r50.onnx recognition ['None', 3, 112, 112] 127.5 127.5 set det-size: (640, 640) [DLC.CORE] Processing... 
[DLC.CORE] Creating temp resources... [DLC.CORE] Extracting frames... [DLC.FACE-SWAPPER] Progressing... Processing: 0%| | 0/153 [00:00 onnxruntime::CudaCall(ERRTYPE, const char*, const char*, ERRTYPE, const char*, const char*, int) [with ERRTYPE = cublasStatus_t; bool THRW = true; std::conditional_t = void] /onnxruntime_src/onnxruntime/core/providers/cuda/cuda_call.cc:114 std::conditional_t onnxruntime::CudaCall(ERRTYPE, const char*, const char*, ERRTYPE, const char*, const char*, int) [with ERRTYPE = cublasStatus_t; bool THRW = true; std::conditional_t = void] CUBLAS failure 3: CUBLAS_STATUS_ALLOC_FAILED ; GPU=0 ; hostname=Vanillarch ; file=/onnxruntime_src/onnxruntime/core/providers/cuda/cuda_execution_provider.cc ; line=168 ; expr=cublasCreate(&cublas_handle_); 2024-10-29 03:22:54.851053085 [E:onnxruntime:, sequential_executor.cc:514 ExecuteKernel] Non-zero status code returned while running Gemm node. Name:'fc1' Status Message: /onnxruntime_src/onnxruntime/core/providers/cuda/cuda_call.cc:121 std::conditional_t onnxruntime::CudaCall(ERRTYPE, const char*, const char*, ERRTYPE, const char*, const char*, int) [with ERRTYPE = cublasStatus_t; bool THRW = true; std::conditional_t = void] /onnxruntime_src/onnxruntime/core/providers/cuda/cuda_call.cc:114 std::conditional_t onnxruntime::CudaCall(ERRTYPE, const char*, const char*, ERRTYPE, const char*, const char*, int) [with ERRTYPE = cublasStatus_t; bool THRW = true; std::conditional_t = void] CUBLAS failure 3: CUBLAS_STATUS_ALLOC_FAILED ; GPU=0 ; hostname=Vanillarch ; file=/onnxruntime_src/onnxruntime/core/providers/cuda/cuda_execution_provider.cc ; line=168 ; expr=cublasCreate(&cublas_handle_); 2024-10-29 03:22:54.858262199 [E:onnxruntime:, sequential_executor.cc:514 ExecuteKernel] Non-zero status code returned while running Gemm node. 
Name:'fc1' Status Message: /onnxruntime_src/onnxruntime/core/providers/cuda/cuda_call.cc:121 std::conditional_t onnxruntime::CudaCall(ERRTYPE, const char*, const char*, ERRTYPE, const char*, const char*, int) [with ERRTYPE = cublasStatus_t; bool THRW = true; std::conditional_t = void] /onnxruntime_src/onnxruntime/core/providers/cuda/cuda_call.cc:114 std::conditional_t onnxruntime::CudaCall(ERRTYPE, const char*, const char*, ERRTYPE, const char*, const char*, int) [with ERRTYPE = cublasStatus_t; bool THRW = true; std::conditional_t = void] CUBLAS failure 3: CUBLAS_STATUS_ALLOC_FAILED ; GPU=0 ; hostname=Vanillarch ; file=/onnxruntime_src/onnxruntime/core/providers/cuda/cuda_execution_provider.cc ; line=168 ; expr=cublasCreate(&cublas_handle_); 2024-10-29 03:22:55.222706892 [E:onnxruntime:, sequential_executor.cc:514 ExecuteKernel] Non-zero status code returned while running Gemm node. Name:'fc1' Status Message: /onnxruntime_src/onnxruntime/core/providers/cuda/cuda_call.cc:121 std::conditional_t onnxruntime::CudaCall(ERRTYPE, const char*, const char*, ERRTYPE, const char*, const char*, int) [with ERRTYPE = cublasStatus_t; bool THRW = true; std::conditional_t = void] /onnxruntime_src/onnxruntime/core/providers/cuda/cuda_call.cc:114 std::conditional_t onnxruntime::CudaCall(ERRTYPE, const char*, const char*, ERRTYPE, const char*, const char*, int) [with ERRTYPE = cublasStatus_t; bool THRW = true; std::conditional_t = void] CUBLAS failure 3: CUBLAS_STATUS_ALLOC_FAILED ; GPU=0 ; hostname=Vanillarch ; file=/onnxruntime_src/onnxruntime/core/providers/cuda/cuda_execution_provider.cc ; line=168 ; expr=cublasCreate(&cublas_handle_); [ONNXRuntimeError] : 1 : FAIL : /onnxruntime_src/onnxruntime/core/providers/cuda/cuda_call.cc:121 std::conditional_t onnxruntime::CudaCall(ERRTYPE, const char*, const char*, ERRTYPE, const char*, const char*, int) [with ERRTYPE = cublasStatus_t; bool THRW = true; std::conditional_t = void] 
/onnxruntime_src/onnxruntime/core/providers/cuda/cuda_call.cc:114 std::conditional_t onnxruntime::CudaCall(ERRTYPE, const char*, const char*, ERRTYPE, const char*, const char*, int) [with ERRTYPE = cublasStatus_t; bool THRW = true; std::conditional_t = void] CUBLAS failure 3: CUBLAS_STATUS_ALLOC_FAILED ; GPU=0 ; hostname=Vanillarch ; file=/onnxruntime_src/onnxruntime/core/providers/cuda/cuda_stream_handle.cc ; line=51 ; expr=cublasCreate(&cublas_handle_); Processing: 5%|████▉ | 8/153 [00:02<00:34, 4.19frame/s, execution_providers=['CUDAExecutionProvider'], execution_threads=8, max_memory=16][ONNXRuntimeError] : 6 : RUNTIME_EXCEPTION : Non-zero status code returned while running Gemm node. Name:'fc1' Status Message: /onnxruntime_src/onnxruntime/core/providers/cuda/cuda_call.cc:121 std::conditional_t onnxruntime::CudaCall(ERRTYPE, const char*, const char*, ERRTYPE, const char*, const char*, int) [with ERRTYPE = cublasStatus_t; bool THRW = true; std::conditional_t = void] /onnxruntime_src/onnxruntime/core/providers/cuda/cuda_call.cc:114 std::conditional_t onnxruntime::CudaCall(ERRTYPE, const char*, const char*, ERRTYPE, const char*, const char*, int) [with ERRTYPE = cublasStatus_t; bool THRW = true; std::conditional_t = void] CUBLAS failure 3: CUBLAS_STATUS_ALLOC_FAILED ; GPU=0 ; hostname=Vanillarch ; file=/onnxruntime_src/onnxruntime/core/providers/cuda/cuda_execution_provider.cc ; line=168 ; expr=cublasCreate(&cublas_handle_); 2024-10-29 03:22:55.388265396 [E:onnxruntime:, sequential_executor.cc:514 ExecuteKernel] Non-zero status code returned while running Conv node. Name:'Conv_46' Status Message: /onnxruntime_src/onnxruntime/core/framework/bfc_arena.cc:376 void* onnxruntime::BFCArena::AllocateRawInternal(size_t, bool, onnxruntime::Stream*, bool, onnxruntime::WaitNotificationFn) Failed to allocate memory for requested buffer of size 18884864 [ONNXRuntimeError] : 6 : RUNTIME_EXCEPTION : Non-zero status code returned while running Conv node. 
Name:'Conv_46' Status Message: /onnxruntime_src/onnxruntime/core/framework/bfc_arena.cc:376 void* onnxruntime::BFCArena::AllocateRawInternal(size_t, bool, onnxruntime::Stream*, bool, onnxruntime::WaitNotificationFn) Failed to allocate memory for requested buffer of size 18884864 # ... and so on ... Processing: 80%|█████████████████████████████████████████████████████████████████████████▉ | 123/153 [00:08<00:01, 18.32frame/s, execution_providers=['CUDAExecutionProvider'], execution_threads=8, max_memory=16]2024-10-29 03:19:19.275439003 [E:onnxruntime:, sequential_executor.cc:514 ExecuteKernel] Non-zero status code returned while running Conv node. Name:'Conv_107' Status Message: /onnxruntime_src/onnxruntime/core/framework/bfc_arena.cc:376 void* onnxruntime::BFCArena::AllocateRawInternal(size_t, bool, onnxruntime::Stream*, bool, onnxruntime::WaitNotificationFn) Failed to allocate memory for requested buffer of size 136723103642368 [ONNXRuntimeError] : 6 : RUNTIME_EXCEPTION : Non-zero status code returned while running Conv node. Name:'Conv_107' Status Message: /onnxruntime_src/onnxruntime/core/framework/bfc_arena.cc:376 void* onnxruntime::BFCArena::AllocateRawInternal(size_t, bool, onnxruntime::Stream*, bool, onnxruntime::WaitNotificationFn) Failed to allocate memory for requested buffer of size 136723103642368 2024-10-29 03:19:19.294210553 [E:onnxruntime:, sequential_executor.cc:514 ExecuteKernel] Non-zero status code returned while running Conv node. Name:'Conv_107' Status Message: /onnxruntime_src/onnxruntime/core/framework/bfc_arena.cc:376 void* onnxruntime::BFCArena::AllocateRawInternal(size_t, bool, onnxruntime::Stream*, bool, onnxruntime::WaitNotificationFn) Failed to allocate memory for requested buffer of size 136723103642368 [ONNXRuntimeError] : 6 : RUNTIME_EXCEPTION : Non-zero status code returned while running Conv node. 
Name:'Conv_107' Status Message: /onnxruntime_src/onnxruntime/core/framework/bfc_arena.cc:376 void* onnxruntime::BFCArena::AllocateRawInternal(size_t, bool, onnxruntime::Stream*, bool, onnxruntime::WaitNotificationFn) Failed to allocate memory for requested buffer of size 136723103642368 2024-10-29 03:19:19.308650151 [E:onnxruntime:, sequential_executor.cc:514 ExecuteKernel] Non-zero status code returned while running Gemm node. Name:'fc1' Status Message: /onnxruntime_src/onnxruntime/core/providers/cuda/cuda_call.cc:121 std::conditional_t onnxruntime::CudaCall(ERRTYPE, const char*, const char*, ERRTYPE, const char*, const char*, int) [with ERRTYPE = cublasStatus_t; bool THRW = true; std::conditional_t = void] /onnxruntime_src/onnxruntime/core/providers/cuda/cuda_call.cc:114 std::conditional_t onnxruntime::CudaCall(ERRTYPE, const char*, const char*, ERRTYPE, const char*, const char*, int) [with ERRTYPE = cublasStatus_t; bool THRW = true; std::conditional_t = void] CUBLAS failure 3: CUBLAS_STATUS_ALLOC_FAILED ; GPU=0 ; hostname=Vanillarch ; file=/onnxruntime_src/onnxruntime/core/providers/cuda/cuda_execution_provider.cc ; line=168 ; expr=cublasCreate(&cublas_handle_); 2024-10-29 03:19:19.349116559 [E:onnxruntime:, sequential_executor.cc:514 ExecuteKernel] Non-zero status code returned while running Gemm node. 
Name:'fullyconnected0' Status Message: /onnxruntime_src/onnxruntime/core/providers/cuda/cuda_call.cc:121 std::conditional_t onnxruntime::CudaCall(ERRTYPE, const char*, const char*, ERRTYPE, const char*, const char*, int) [with ERRTYPE = cublasStatus_t; bool THRW = true; std::conditional_t = void] /onnxruntime_src/onnxruntime/core/providers/cuda/cuda_call.cc:114 std::conditional_t onnxruntime::CudaCall(ERRTYPE, const char*, const char*, ERRTYPE, const char*, const char*, int) [with ERRTYPE = cublasStatus_t; bool THRW = true; std::conditional_t = void] CUBLAS failure 3: CUBLAS_STATUS_ALLOC_FAILED ; GPU=0 ; hostname=Vanillarch ; file=/onnxruntime_src/onnxruntime/core/providers/cuda/cuda_execution_provider.cc ; line=168 ; expr=cublasCreate(&cublas_handle_); [ONNXRuntimeError] : 6 : RUNTIME_EXCEPTION : Non-zero status code returned while running Gemm node. Name:'fullyconnected0' Status Message: /onnxruntime_src/onnxruntime/core/providers/cuda/cuda_call.cc:121 std::conditional_t onnxruntime::CudaCall(ERRTYPE, const char*, const char*, ERRTYPE, const char*, const char*, int) [with ERRTYPE = cublasStatus_t; bool THRW = true; std::conditional_t = void] /onnxruntime_src/onnxruntime/core/providers/cuda/cuda_call.cc:114 std::conditional_t onnxruntime::CudaCall(ERRTYPE, const char*, const char*, ERRTYPE, const char*, const char*, int) [with ERRTYPE = cublasStatus_t; bool THRW = true; std::conditional_t = void] CUBLAS failure 3: CUBLAS_STATUS_ALLOC_FAILED ; GPU=0 ; hostname=Vanillarch ; file=/onnxruntime_src/onnxruntime/core/providers/cuda/cuda_execution_provider.cc ; line=168 ; expr=cublasCreate(&cublas_handle_); Processing: 82%|███████████████████████████████████████████████████████████████████████████▊ | 126/153 [00:08<00:01, 19.09frame/s, execution_providers=['CUDAExecutionProvider'], execution_threads=8, max_memory=16]2024-10-29 03:19:19.414915318 [E:onnxruntime:, sequential_executor.cc:514 ExecuteKernel] Non-zero status code returned while running Conv node. 
Name:'Conv_107' Status Message: /onnxruntime_src/onnxruntime/core/framework/bfc_arena.cc:376 void* onnxruntime::BFCArena::AllocateRawInternal(size_t, bool, onnxruntime::Stream*, bool, onnxruntime::WaitNotificationFn) Failed to allocate memory for requested buffer of size 136723103642368 [ONNXRuntimeError] : 6 : RUNTIME_EXCEPTION : Non-zero status code returned while running Conv node. Name:'Conv_107' Status Message: /onnxruntime_src/onnxruntime/core/framework/bfc_arena.cc:376 void* onnxruntime::BFCArena::AllocateRawInternal(size_t, bool, onnxruntime::Stream*, bool, onnxruntime::WaitNotificationFn) Failed to allocate memory for requested buffer of size 136723103642368 2024-10-29 03:19:19.450319818 [E:onnxruntime:, sequential_executor.cc:514 ExecuteKernel] Non-zero status code returned while running Conv node. Name:'Conv_107' Status Message: /onnxruntime_src/onnxruntime/core/framework/bfc_arena.cc:376 void* onnxruntime::BFCArena::AllocateRawInternal(size_t, bool, onnxruntime::Stream*, bool, onnxruntime::WaitNotificationFn) Failed to allocate memory for requested buffer of size 136723103642368 [ONNXRuntimeError] : 6 : RUNTIME_EXCEPTION : Non-zero status code returned while running Conv node. Name:'Conv_107' Status Message: /onnxruntime_src/onnxruntime/core/framework/bfc_arena.cc:376 void* onnxruntime::BFCArena::AllocateRawInternal(size_t, bool, onnxruntime::Stream*, bool, onnxruntime::WaitNotificationFn) Failed to allocate memory for requested buffer of size 136723103642368 2024-10-29 03:19:19.535050928 [E:onnxruntime:, sequential_executor.cc:514 ExecuteKernel] Non-zero status code returned while running Conv node. 
Name:'Conv_107' Status Message: /onnxruntime_src/onnxruntime/core/framework/bfc_arena.cc:376 void* onnxruntime::BFCArena::AllocateRawInternal(size_t, bool, onnxruntime::Stream*, bool, onnxruntime::WaitNotificationFn) Failed to allocate memory for requested buffer of size 136723103642368 [ONNXRuntimeError] : 6 : RUNTIME_EXCEPTION : Non-zero status code returned while running Conv node. Name:'Conv_107' Status Message: /onnxruntime_src/onnxruntime/core/framework/bfc_arena.cc:376 void* onnxruntime::BFCArena::AllocateRawInternal(size_t, bool, onnxruntime::Stream*, bool, onnxruntime::WaitNotificationFn) Failed to allocate memory for requested buffer of size 136723103642368 Processing: 84%|█████████████████████████████████████████████████████████████████████████████▌ | 129/153 [00:09<00:01, 18.18frame/s, execution_providers=['CUDAExecutionProvider'], execution_threads=8, max_memory=16]2024-10-29 03:19:19.553050835 [E:onnxruntime:, sequential_executor.cc:514 ExecuteKernel] Non-zero status code returned while running Conv node. Name:'Conv_107' Status Message: /onnxruntime_src/onnxruntime/core/framework/bfc_arena.cc:376 void* onnxruntime::BFCArena::AllocateRawInternal(size_t, bool, onnxruntime::Stream*, bool, onnxruntime::WaitNotificationFn) Failed to allocate memory for requested buffer of size 136723103642368 [ONNXRuntimeError] : 6 : RUNTIME_EXCEPTION : Non-zero status code returned while running Conv node. Name:'Conv_107' Status Message: /onnxruntime_src/onnxruntime/core/framework/bfc_arena.cc:376 void* onnxruntime::BFCArena::AllocateRawInternal(size_t, bool, onnxruntime::Stream*, bool, onnxruntime::WaitNotificationFn) Failed to allocate memory for requested buffer of size 136723103642368 2024-10-29 03:19:19.622111305 [E:onnxruntime:, sequential_executor.cc:514 ExecuteKernel] Non-zero status code returned while running Conv node. 
Name:'Conv_107' Status Message: /onnxruntime_src/onnxruntime/core/framework/bfc_arena.cc:376 void* onnxruntime::BFCArena::AllocateRawInternal(size_t, bool, onnxruntime::Stream*, bool, onnxruntime::WaitNotificationFn) Failed to allocate memory for requested buffer of size 136723103642368 [ONNXRuntimeError] : 6 : RUNTIME_EXCEPTION : Non-zero status code returned while running Conv node. Name:'Conv_107' Status Message: /onnxruntime_src/onnxruntime/core/framework/bfc_arena.cc:376 void* onnxruntime::BFCArena::AllocateRawInternal(size_t, bool, onnxruntime::Stream*, bool, onnxruntime::WaitNotificationFn) Failed to allocate memory for requested buffer of size 136723103642368 2024-10-29 03:19:19.641810931 [E:onnxruntime:, sequential_executor.cc:514 ExecuteKernel] Non-zero status code returned while running Conv node. Name:'Conv_107' Status Message: /onnxruntime_src/onnxruntime/core/framework/bfc_arena.cc:376 void* onnxruntime::BFCArena::AllocateRawInternal(size_t, bool, onnxruntime::Stream*, bool, onnxruntime::WaitNotificationFn) Failed to allocate memory for requested buffer of size 136723103642368 [ONNXRuntimeError] : 6 : RUNTIME_EXCEPTION : Non-zero status code returned while running Conv node. Name:'Conv_107' Status Message: /onnxruntime_src/onnxruntime/core/framework/bfc_arena.cc:376 void* onnxruntime::BFCArena::AllocateRawInternal(size_t, bool, onnxruntime::Stream*, bool, onnxruntime::WaitNotificationFn) Failed to allocate memory for requested buffer of size 136723103642368 2024-10-29 03:19:19.642229023 [E:onnxruntime:, sequential_executor.cc:514 ExecuteKernel] Non-zero status code returned while running Conv node. 
Name:'Conv_107' Status Message: /onnxruntime_src/onnxruntime/core/framework/bfc_arena.cc:376 void* onnxruntime::BFCArena::AllocateRawInternal(size_t, bool, onnxruntime::Stream*, bool, onnxruntime::WaitNotificationFn) Failed to allocate memory for requested buffer of size 136723103642368
Processing: 86%|███████████████████████████████████████████████████████████████████████████████▎ | 132/153 [00:09<00:01, 20.29frame/s, execution_providers=['CUDAExecutionProvider'], execution_threads=8, max_memory=16]
[ONNXRuntimeError] : 6 : RUNTIME_EXCEPTION : Non-zero status code returned while running Conv node. Name:'Conv_107' Status Message: /onnxruntime_src/onnxruntime/core/framework/bfc_arena.cc:376 void* onnxruntime::BFCArena::AllocateRawInternal(size_t, bool, onnxruntime::Stream*, bool, onnxruntime::WaitNotificationFn) Failed to allocate memory for requested buffer of size 136723103642368
Processing: 87%|███████████████████████████████████████████████████████████████████████████████▉ | 133/153 [00:09<00:01, 14.56frame/s, execution_providers=['CUDAExecutionProvider'], execution_threads=8, max_memory=16]
Traceback (most recent call last):
  File "/home/lin/Desktop/Deep-Live-Cam/run.py", line 6, in <module>
    core.run()
  File "/home/lin/Desktop/Deep-Live-Cam/modules/core.py", line 252, in run
    start()
  File "/home/lin/Desktop/Deep-Live-Cam/modules/core.py", line 209, in start
    frame_processor.process_video(modules.globals.source_path, temp_frame_paths)
  File "/home/lin/Desktop/Deep-Live-Cam/modules/processors/frame/face_swapper.py", line 174, in process_video
    modules.processors.frame.core.process_video(source_path, temp_frame_paths, process_frames)
  File "/home/lin/Desktop/Deep-Live-Cam/modules/processors/frame/core.py", line 73, in process_video
    multi_process_frame(source_path, frame_paths, process_frames, progress)
  File "/home/lin/Desktop/Deep-Live-Cam/modules/processors/frame/core.py", line 65, in multi_process_frame
    future.result()
  File "/home/lin/.pyenv/versions/3.10.15/lib/python3.10/concurrent/futures/_base.py", line 451, in result
    return self.__get_result()
  File "/home/lin/.pyenv/versions/3.10.15/lib/python3.10/concurrent/futures/_base.py", line 403, in __get_result
    raise self._exception
  File "/home/lin/.pyenv/versions/3.10.15/lib/python3.10/concurrent/futures/thread.py", line 58, in run
    result = self.fn(*self.args, **self.kwargs)
  File "/home/lin/Desktop/Deep-Live-Cam/modules/processors/frame/face_swapper.py", line 133, in process_frames
    source_face = get_one_face(cv2.imread(source_path))
  File "/home/lin/Desktop/Deep-Live-Cam/modules/face_analyser.py", line 28, in get_one_face
    face = get_face_analyser().get(frame)
  File "/home/lin/.pyenv/versions/3.10.15/lib/python3.10/site-packages/insightface/app/face_analysis.py", line 59, in get
    bboxes, kpss = self.det_model.detect(img,
  File "/home/lin/.pyenv/versions/3.10.15/lib/python3.10/site-packages/insightface/model_zoo/retinaface.py", line 224, in detect
    scores_list, bboxes_list, kpss_list = self.forward(det_img, self.det_thresh)
  File "/home/lin/.pyenv/versions/3.10.15/lib/python3.10/site-packages/insightface/model_zoo/retinaface.py", line 152, in forward
    net_outs = self.session.run(self.output_names, {self.input_name : blob})
  File "/home/lin/.pyenv/versions/3.10.15/lib/python3.10/site-packages/onnxruntime/capi/onnxruntime_inference_collection.py", line 220, in run
    return self._sess.run(output_names, input_feed, run_options)
onnxruntime.capi.onnxruntime_pybind11_state.Fail: [ONNXRuntimeError] : 1 : FAIL : /onnxruntime_src/onnxruntime/core/providers/cuda/cuda_call.cc:121 std::conditional_t onnxruntime::CudaCall(ERRTYPE, const char*, const char*, ERRTYPE, const char*, const char*, int) [with ERRTYPE = cudnnStatus_t; bool THRW = true; std::conditional_t = void] /onnxruntime_src/onnxruntime/core/providers/cuda/cuda_call.cc:114 std::conditional_t onnxruntime::CudaCall(ERRTYPE, const char*, const char*, ERRTYPE, const char*, const char*, int) [with ERRTYPE = cudnnStatus_t; bool THRW = true; std::conditional_t = void] CUDNN failure 4: CUDNN_STATUS_INTERNAL_ERROR ; GPU=0 ; hostname=Vanillarch ; file=/onnxruntime_src/onnxruntime/core/providers/cuda/cuda_stream_handle.cc ; line=53 ; expr=cudnnCreate(&cudnn_handle_);
```

So I'd like to ask for this issue to be reopened :)
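The provider options printed in the log show `gpu_mem_limit` at 18446744073709551615 (effectively unlimited) and `arena_extend_strategy` set to `kNextPowerOfTwo`. Below is a minimal sketch of building an ONNX Runtime `providers` list with a capped arena. The helper name `cuda_providers` and the 2 GiB figure are assumptions for illustration, and nothing in this thread confirms that this resolves the error: the limit applies to ONNX Runtime's own memory arena, not to the `cublasCreate`/`cudnnCreate` calls that fail in the log.

```python
def cuda_providers(mem_limit_gib: float, device_id: int = 0) -> list:
    """Build an ONNX Runtime `providers` list with a capped CUDA arena."""
    cuda_options = {
        "device_id": device_id,
        # Arena size limit in bytes (the log above shows it uncapped).
        "gpu_mem_limit": int(mem_limit_gib * 1024 ** 3),
        # Grow the arena by the requested amount instead of powers of two.
        "arena_extend_strategy": "kSameAsRequested",
    }
    # Fall back to the CPU provider when CUDA is unavailable.
    return [("CUDAExecutionProvider", cuda_options), "CPUExecutionProvider"]
```

The list could be passed as `onnxruntime.InferenceSession(model_path, providers=cuda_providers(2))`; where Deep-Live-Cam or insightface would need changing to accept it is not covered here.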

Lxtharia commented 14 hours ago

Apparently it works when setting the execution threads to 4
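The tracebacks in this thread show frames being dispatched through a `concurrent.futures` thread pool, and the failing expressions are `cublasCreate` and `cudnnCreate`, so fewer worker threads plausibly means fewer GPU handles and buffers alive at once. The sketch below only illustrates how `max_workers` bounds concurrency; `run_frames` and `process_frame` are hypothetical stand-ins, not the repo's functions.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


def run_frames(frame_paths, process_frame, max_workers=4):
    """Process frames with at most `max_workers` running at once.

    Returns the highest number of simultaneously active workers observed.
    """
    lock = threading.Lock()
    active = 0
    peak = 0

    def tracked(path):
        nonlocal active, peak
        with lock:
            active += 1
            peak = max(peak, active)
        try:
            process_frame(path)
        finally:
            with lock:
                active -= 1

    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = [pool.submit(tracked, p) for p in frame_paths]
        for future in futures:
            future.result()  # re-raises any exception from a worker thread
    return peak
```

On the command line this would presumably mean adding something like `--execution-threads 4` to the `python run.py` call; the flag name is an assumption based on the `execution_threads=8` shown in the progress bar.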

hacksider / Deep-Live-Cam

Live works, single image works, video conversion doesn't #383