Closed antithing closed 2 years ago
I'm not very familiar with the situation on Windows, so you can also ignore that option. Try defining the USE_CUDA macro directly in ort_defs.h and then recompile:
// Originally:
#ifdef ENABLE_ONNXRUNTIME_CUDA
# define USE_CUDA
#endif
// Change it to:
# define USE_CUDA
Hi! Thanks for getting back to me. This gives the same result: around 7 seconds per frame, and no GPU usage at all. Is there anything else I can try?
This doc may help 👇
Hi, I am trying to run the Deeplab model for segmentation on Windows.
I have tried the prebuilt ORT binaries, and also built ORT from source (including CUDA and cuDNN). ENABLE_ONNXRUNTIME_CUDA is set.
I run the lite_deeplabv3_resnet101 example on ten images in a loop, and it takes about 7 seconds per frame on my machine (RTX 3090 GPU).
Looking at the task manager, the GPU is not used at all.
Am I missing something? What can I do to use CUDA with your code?
Thank you!