WongKinYiu / yolov7

Implementation of paper - YOLOv7: Trainable bag-of-freebies sets new state-of-the-art for real-time object detectors

ONNX inference is slower than pt #859

Open jiaqi71 opened 1 year ago

jiaqi71 commented 1 year ago

I exported yolov7.pt to ONNX, but inference with the ONNX model takes about twice as long as with yolov7.pt. Environment: PyTorch 1.9.1 + CUDA 10.2, Linux.

- CUDA yolov7.pt inference time: 0.012 s
- CPU yolov7.pt inference time: 0.131 s
- CPU yolov7.onnx inference time: 0.267 s

The accuracy of CUDA yolov7.pt and CPU yolov7.onnx is the same; CPU yolov7.pt is slightly different.

On my Mac, though, the inference times look as expected:

- yolov7.pt inference time: 0.537 s
- yolov7.onnx inference time: 0.243 s

I cannot figure out what is wrong with my export. Could anyone give me an answer?
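For reference, a minimal sketch of how such a CPU timing comparison can be made, assuming the yolov7 repo is on `sys.path` (the checkpoint pickle references the repo's model classes) and using the file names and 640×640 input size from this thread:

```python
import time
import numpy as np
import torch
import onnxruntime as ort

PT_PATH = "yolov7.pt"      # paths are assumptions; substitute your own files
ONNX_PATH = "yolov7.onnx"

dummy = np.random.rand(1, 3, 640, 640).astype(np.float32)

# PyTorch CPU inference. yolov7 checkpoints store the module under the
# "model" key in half precision, so convert to float for CPU.
model = torch.load(PT_PATH, map_location="cpu")["model"].float().eval()
with torch.no_grad():
    x = torch.from_numpy(dummy)
    model(x)  # warm-up
    t0 = time.perf_counter()
    model(x)
    print(f"pt   CPU: {time.perf_counter() - t0:.3f} s")

# ONNX Runtime CPU inference.
sess = ort.InferenceSession(ONNX_PATH, providers=["CPUExecutionProvider"])
inp = sess.get_inputs()[0].name
sess.run(None, {inp: dummy})  # warm-up
t0 = time.perf_counter()
sess.run(None, {inp: dummy})
print(f"onnx CPU: {time.perf_counter() - t0:.3f} s")
```

Averaging over many runs rather than timing a single forward pass gives more stable numbers, but the single-shot version above matches how the figures in this thread read.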

chmjelek commented 1 year ago

I have the same problem. I'm using YOLOR 🚀 v0.1-115-g072f76c, torch 1.12.1+cu102, CPU.

| Model | .pt inference time | .onnx inference time |
|---|---|---|
| YOLOv7 | 235 ms | 366 ms |
| YOLOv7-D6 | 1265 ms | 2441 ms |

Shouldn't ONNX on the CPU be faster?
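Not necessarily with default settings: ONNX Runtime's CPU throughput depends heavily on how the session is configured. A sketch of the options that commonly matter (the thread count here is an assumption for an 8-core machine, not a value from this repo):

```python
import onnxruntime as ort

so = ort.SessionOptions()
# Intra-op parallelism: the default can be conservative depending on the
# onnxruntime build, so pinning it to the physical core count often helps.
so.intra_op_num_threads = 8  # assumption: 8 physical cores
# Apply all available graph-level optimizations before running.
so.graph_optimization_level = ort.GraphOptimizationLevel.ORT_ENABLE_ALL

sess = ort.InferenceSession(
    "yolov7.onnx",
    sess_options=so,
    providers=["CPUExecutionProvider"],
)
```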

jersonal commented 1 year ago

Do you know where to find the inference code for an ONNX model? Would you mind sharing it with me?

chmjelek commented 1 year ago

Check out the Export section of the README and the "Pytorch to ONNX with NMS (and inference)" Colab notebook.
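As a starting point, here is a minimal inference sketch with onnxruntime. It assumes a model exported with end-to-end NMS, where each output row is `[batch_id, x0, y0, x1, y1, class_id, score]` (as in the repo's ONNX inference example); proper letterbox preprocessing is omitted for brevity, and the image path is a placeholder:

```python
import cv2
import numpy as np
import onnxruntime as ort

sess = ort.InferenceSession("yolov7.onnx", providers=["CPUExecutionProvider"])

img = cv2.imread("image.jpg")          # BGR, HWC
img = cv2.resize(img, (640, 640))      # plain resize; use letterbox in practice
x = img[:, :, ::-1].transpose(2, 0, 1) # BGR -> RGB, HWC -> CHW
x = np.ascontiguousarray(x, dtype=np.float32) / 255.0
x = x[None]                            # add batch dimension

outputs = sess.run(None, {sess.get_inputs()[0].name: x})
for det in outputs[0]:
    batch_id, x0, y0, x1, y1, cls_id, score = det
    print(int(cls_id), float(score), (x0, y0, x1, y1))
```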