Closed mnurilmi closed 2 years ago
It's solved. Most likely, on the first run onnxruntime needs to allocate memory, and those steps take a lot of time; after that warmup, the CUDA execution provider is faster than the CPU one on large batches. Now I want to enable mixed precision; it may give shorter inference times.
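The warmup effect described above is easy to account for when benchmarking: discard the first few calls, where one-time allocation and initialization dominate, and time only steady-state runs. Below is a minimal, hedged sketch using only the standard library; `workload` is a hypothetical stand-in for a session's `run()` call (with a real `onnxruntime.InferenceSession`, `fn` would be something like `lambda: sess.run(None, inputs)`).

```python
import time

def benchmark(fn, warmup=3, runs=10):
    """Average the runtime of fn, discarding initial warmup calls
    where one-time allocation/initialization costs dominate."""
    for _ in range(warmup):
        fn()  # warmup: first calls may allocate memory, init kernels, etc.
    times = []
    for _ in range(runs):
        start = time.perf_counter()
        fn()
        times.append(time.perf_counter() - start)
    return sum(times) / len(times)

# Hypothetical stand-in workload with a one-time setup cost, simulating
# the first CUDA inference call allocating device memory.
_cache = {}
def workload():
    if "buf" not in _cache:
        _cache["buf"] = [0] * 10**6  # one-time "allocation"
    return sum(_cache["buf"])

avg = benchmark(workload)
print(f"steady-state average: {avg:.6f}s")
```

Comparing the steady-state averages of a session created with `["CUDAExecutionProvider"]` against one created with `["CPUExecutionProvider"]` gives a fairer picture than timing a single cold call.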
Hi, I want to ask about running an ONNX model in onnxruntime. I've tried the ONNX export of the MGN model and the results are surprising: when I use the CPUExecutionProvider, it is faster than the CUDAExecutionProvider. In theory it should be the other way around. Can you explain this? Maybe there are some steps I don't know about. Thanks for your attention; I hope this issue gets a response.