Closed: aofengdaxia closed this issue 1 month ago
CUDA 11.8, onnxruntime 1.18.1, cuDNN 8.9.5
This is an onnxruntime bug. To confirm it on the GPU, run inference on the same wav 10 times and record the inference time of each run. You should see:
1st: long time, 2nd: short time, 3rd: short time, ...
Every new wav input counts as a 1st run again, so it is slow too.
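The repeated-inference check described above can be sketched roughly as follows. This is a minimal illustration, not the project's own benchmark: `time_runs`, `summarize`, and `make_onnx_runner` are hypothetical helper names, and the model path and input feeds are placeholders you would replace with your own. It only assumes the standard `onnxruntime.InferenceSession(..., providers=[...])` and `session.run(None, feeds)` calls.

```python
import time
from typing import Callable, Dict, List


def time_runs(run: Callable[[], object], repeats: int = 10) -> List[float]:
    """Call `run` `repeats` times and return each wall-clock duration in ms."""
    durations_ms = []
    for _ in range(repeats):
        start = time.perf_counter()
        run()
        durations_ms.append((time.perf_counter() - start) * 1000.0)
    return durations_ms


def summarize(durations_ms: List[float]) -> Dict[str, float]:
    """Separate the first (warm-up) call from the later, steady-state calls."""
    rest = durations_ms[1:]
    return {
        "first_ms": durations_ms[0],
        "steady_mean_ms": sum(rest) / len(rest) if rest else float("nan"),
    }


def make_onnx_runner(model_path: str, feeds: dict, provider: str) -> Callable[[], object]:
    """Build a zero-argument callable that runs one inference.

    Assumption: `model_path` and `feeds` stand in for your own model and inputs.
    """
    import onnxruntime as ort  # imported lazily so the timing helpers work without it

    session = ort.InferenceSession(model_path, providers=[provider])
    return lambda: session.run(None, feeds)
```

For example, `summarize(time_runs(make_onnx_runner("model.onnx", feeds, "CUDAExecutionProvider")))` reports the first call separately from the rest. Running the same call with `"CPUExecutionProvider"` gives a comparison that is not dominated by the slow first GPU run.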
Notice: In order to resolve issues more efficiently, please raise issues following the template and fill in the details.
🐛 Bug
When I use onnxruntime-gpu for inference, it is slower than on CPU.
To Reproduce
Steps to reproduce the behavior (always include the command you ran):
GPU inference takes 2800 ms, but CPU inference takes only 128 ms.
Code sample
Expected behavior
Environment