**mbac** opened 6 days ago
Try downgrading CUDA to 12.1. I use torch 2.5.1+cu121 alongside onnxruntime-gpu 1.20.1 and everything works fine, with no errors.
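If it helps, the pairing rule I'm going by (onnxruntime-gpu releases since 1.19.0 target CUDA 12.x, earlier ones target CUDA 11.x — my reading of Microsoft's release notes, so treat it as an assumption) can be sanity-checked offline with a small stdlib-only sketch; both function names are made up for illustration:

```python
# Illustrative sketch of the torch/onnxruntime-gpu CUDA pairing rule
# discussed in this thread. The version cutoffs are assumptions.

def cuda_tag(torch_version: str) -> str:
    """Extract the CUDA tag from a torch version string, e.g. '2.5.1+cu121' -> '12.1'."""
    _, _, local = torch_version.partition("+")
    if not local.startswith("cu"):
        return "cpu"
    digits = local[2:]                     # '121' or '124'
    return f"{digits[:-1]}.{digits[-1]}"   # '12.1', '12.4'

def ort_supports(ort_version: str, cuda: str) -> bool:
    """True if this onnxruntime-gpu release is expected to run on this CUDA."""
    major, minor, *_ = (int(p) for p in ort_version.split("."))
    if cuda.startswith("12."):
        return (major, minor) >= (1, 19)   # CUDA 12.x wheels since 1.19.0 (assumed)
    return cuda.startswith("11.")          # older wheels targeted CUDA 11.x (assumed)

print(cuda_tag("2.5.1+cu124"))         # 12.4
print(ort_supports("1.20.1", "12.4"))  # True
print(ort_supports("1.17.1", "12.4"))  # False
```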
Thanks for the info, but downgrading is a really problematic option, both because of the risk of even more conflicts with other libraries and because the cloud provider I'm invested in only offers 12.4 images, which means I'd have to run Docker on top of the installation.
After running several updates, I got to this point (note the slightly different error message)… Can you make anything else out of it?
# ComfyUI Error Report
## Error Details
- **Node ID:** 3585
- **Node Type:** ReActorFaceSwap
- **Exception Type:** onnxruntime.capi.onnxruntime_pybind11_state.Fail
- **Exception Message:** [ONNXRuntimeError] : 1 : FAIL : Non-zero status code returned while running Conv node. Name:'Conv_0' Status Message: CUDNN_FE failure 7: GRAPH_EXECUTION_FAILED ; GPU=0 ; hostname=eedcce05-9d6c-4f8f-8246-9d7b38a3f200 ; file=/onnxruntime_src/onnxruntime/core/providers/cuda/nn/conv.cc ; line=483 ; expr=s_.cudnn_fe_graph->execute(cudnn_handle, s_.variant_pack, ws.get());
## Stack Trace
  File "/workspace/ComfyUI/execution.py", line 323, in execute
    output_data, output_ui, has_subgraph = get_output_data(obj, input_data_all, execution_block_cb=execution_block_cb, pre_execute_cb=pre_execute_cb)
  File "/workspace/ComfyUI/execution.py", line 198, in get_output_data
    return_values = _map_node_over_list(obj, input_data_all, obj.FUNCTION, allow_interrupt=True, execution_block_cb=execution_block_cb, pre_execute_cb=pre_execute_cb)
  File "/workspace/ComfyUI/execution.py", line 169, in _map_node_over_list
    process_inputs(input_dict, i)
  File "/workspace/ComfyUI/execution.py", line 158, in process_inputs
    results.append(getattr(obj, func)(**inputs))
  File "/workspace/ComfyUI/custom_nodes/comfyui-reactor-node/nodes.py", line 353, in execute
    script.process(
  File "/workspace/ComfyUI/custom_nodes/comfyui-reactor-node/scripts/reactor_faceswap.py", line 101, in process
    result = swap_face(
  File "/workspace/ComfyUI/custom_nodes/comfyui-reactor-node/scripts/reactor_swapper.py", line 275, in swap_face
    source_faces = analyze_faces(source_img)
  File "/workspace/ComfyUI/custom_nodes/comfyui-reactor-node/scripts/reactor_swapper.py", line 181, in analyze_faces
    faces = face_analyser.get(img_data)
  File "/workspace/ComfyUI/vcomfy/lib/python3.10/site-packages/insightface/app/face_analysis.py", line 59, in get
    bboxes, kpss = self.det_model.detect(img,
  File "/workspace/ComfyUI/vcomfy/lib/python3.10/site-packages/insightface/model_zoo/retinaface.py", line 224, in detect
    scores_list, bboxes_list, kpss_list = self.forward(det_img, self.det_thresh)
  File "/workspace/ComfyUI/vcomfy/lib/python3.10/site-packages/insightface/model_zoo/retinaface.py", line 152, in forward
    net_outs = self.session.run(self.output_names, {self.input_name : blob})
  File "/workspace/ComfyUI/vcomfy/lib/python3.10/site-packages/onnxruntime/capi/onnxruntime_inference_collection.py", line 266, in run
    return self._sess.run(output_names, input_feed, run_options)
## System Information
- **ComfyUI Version:** v0.3.5-11-g20a560eb
- **Arguments:** /workspace/ComfyUI/main.py --output-directory /workspace/output/comfyui
- **OS:** posix
- **Python Version:** 3.10.12 (main, Nov 6 2024, 20:22:13) [GCC 11.4.0]
- **Embedded Python:** false
- **PyTorch Version:** 2.5.1+cu124
## Devices
- **Name:** cuda:0 NVIDIA GeForce RTX 4090 : cudaMallocAsync
- **Type:** cuda
- **VRAM Total:** 25386352640
- **VRAM Free:** 4413828794
- **Torch VRAM Total:** 19662897152
- **Torch VRAM Free:** 73051834
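Since that `CUDNN_FE failure 7: GRAPH_EXECUTION_FAILED` often points at a mismatch between the cuDNN the torch wheel bundles and the one the ORT CUDA provider was built against (my guess, not confirmed), a useful first step is listing the exact wheels in the venv. A stdlib-only sketch; the `nvidia-*` package names are assumptions for a CUDA 12.x torch install:

```python
# List the wheel versions most relevant to this mismatch.
# Package names are assumptions for a cu12 torch install.
from importlib.metadata import version, PackageNotFoundError

PACKAGES = ["torch", "onnxruntime-gpu", "nvidia-cudnn-cu12", "nvidia-cublas-cu12"]

def report(packages):
    """Return 'name: version' lines, flagging anything that is not installed."""
    lines = []
    for name in packages:
        try:
            lines.append(f"{name}: {version(name)}")
        except PackageNotFoundError:
            lines.append(f"{name}: not installed")
    return lines

print("\n".join(report(PACKAGES)))
```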
Thanks!
What happened?

Hi,

AFAIK the main issue with CUDA 12.x is older versions of `onnxruntime-gpu`, so when I got one of those error messages I went searching and found that the latest stable version available for my system (Ubuntu Linux on an AMD CPU, Nvidia A6000 GPU) is 1.20.1. According to Microsoft, every release since `onnxruntime-gpu` 1.19.0 should work with any CUDA 12.x version that they know of, and I'm on 1.20.1… and yet… and so on, and so forth…

My status is:

The DWPose nodes for Comfy (can't recall the package name), which are usually super picky about the `onnxruntime-gpu` version, report that GPU acceleration is detected just fine.

Can anyone help please?
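For context, nodes like that typically "detect GPU acceleration" by asking ORT for its available execution providers and preferring CUDA over CPU. A minimal sketch of that fallback logic; `pick_provider` is my own illustrative helper, not actual ReActor/DWPose code:

```python
# Sketch of the usual execution-provider fallback; pick_provider is an
# illustrative helper, not code from any of the nodes mentioned above.
def pick_provider(available):
    """Prefer the CUDA provider when onnxruntime reports it, else fall back to CPU."""
    order = ["CUDAExecutionProvider", "CPUExecutionProvider"]
    return next(p for p in order if p in available)

try:
    import onnxruntime as ort
    available = ort.get_available_providers()  # real onnxruntime API
except ImportError:
    available = ["CPUExecutionProvider"]       # ort not installed in this env
print(pick_provider(available))
```

Note that a provider being *available* only means the EP library loaded; as this thread shows, the session can still fail at run time inside cuDNN.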
Steps to reproduce the problem

1. torch 2.5.1, CUDA 12.4, cuDNN 9.2.1
2. Install the `comfyui-reactor-node` package and run its `install.py`, with `onnxruntime-gpu` installed first, currently version 1.20.1 (the installer sees the correct versions of `torch` & CUDA).

Appreciate any help you guys can give!
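One workaround I've seen suggested for cuDNN mismatches (unverified for this exact error) is making sure the ORT CUDA provider loads the same cuDNN that the torch wheel pulled in, e.g. by putting the `nvidia-cudnn-cu12` wheel's `lib` directory on `LD_LIBRARY_PATH` before launching. A stdlib sketch to locate it; the `site-packages/nvidia/cudnn/lib` layout is an assumption:

```python
# Locate the cuDNN shared-library directory shipped in the nvidia-cudnn-cu12
# wheel (assumed layout: site-packages/nvidia/cudnn/lib), so it can be
# prepended to LD_LIBRARY_PATH before onnxruntime loads its CUDA provider.
import importlib.util
import os

def cudnn_lib_dir():
    """Return the bundled cuDNN lib dir, or None if the wheel isn't installed."""
    try:
        spec = importlib.util.find_spec("nvidia.cudnn")
    except ModuleNotFoundError:    # the 'nvidia' namespace package is absent
        return None
    if spec is None or not spec.submodule_search_locations:
        return None
    return os.path.join(list(spec.submodule_search_locations)[0], "lib")

print(cudnn_lib_dir())  # a path under site-packages, or None
```

If it prints a path, something like `export LD_LIBRARY_PATH="<that path>:$LD_LIBRARY_PATH"` before starting ComfyUI would be the thing to try — again, just a guess at the failure mode, not a confirmed fix.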
Sysinfo
Relevant console log
System Information
- **ComfyUI Version:** v0.3.4-1-g839ed33
- **Arguments:** main.py
- **OS:** posix
- **Python Version:** 3.10.12 (main, Nov 6 2024, 20:22:13) [GCC 11.4.0]
- **Embedded Python:** false
- **PyTorch Version:** 2.5.1+cu124

Devices
- **Name:** cuda:0 NVIDIA RTX 6000 Ada Generation : cudaMallocAsync