
Inference speed: Swintransformer torch vs onnxruntime-gpu #13550


shileims commented 1 year ago

Describe the issue

I benchmarked a Swin Transformer model with PyTorch and onnxruntime-gpu, and found that onnxruntime-gpu has no speed advantage over inference with the PyTorch model.

To reproduce

  1. Get the PyTorch model: https://huggingface.co/docs/transformers/model_doc/swin
  2. Convert the PyTorch model to ONNX: https://pytorch.org/tutorials/advanced/super_resolution_with_onnxruntime.html
  3. Run inference and compare the speed of PyTorch and ONNX (onnxruntime-gpu); see the sketch below.
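
For reference, here is a minimal sketch of how the comparison can be set up. The checkpoint name `microsoft/swin-tiny-patch4-window7-224`, the 224×224 input shape, and opset 13 are assumptions; substitute your own. Note that CUDA kernels launch asynchronously, so a fair timing needs warm-up iterations and `torch.cuda.synchronize()` around the timed loop; skipping these is a common reason the two runtimes appear to run at the same speed.

```python
import time
import torch
import onnxruntime as ort
from transformers import SwinModel

# Load the PyTorch Swin model; return_dict=False makes the forward pass
# return a plain tuple, which torch.onnx.export can trace.
# (Checkpoint name is an assumption; any Swin checkpoint works the same way.)
model = SwinModel.from_pretrained(
    "microsoft/swin-tiny-patch4-window7-224", return_dict=False
).eval().cuda()
dummy = torch.randn(1, 3, 224, 224, device="cuda")

# Export to ONNX, following the tutorial linked in step 2.
torch.onnx.export(
    model, dummy, "swin.onnx",
    input_names=["pixel_values"],
    output_names=["last_hidden_state"],
    opset_version=13,
    dynamic_axes={"pixel_values": {0: "batch"}},
)

# ONNX Runtime session on the CUDA execution provider.
sess = ort.InferenceSession("swin.onnx", providers=["CUDAExecutionProvider"])

def bench_torch(iters=100):
    with torch.no_grad():
        for _ in range(10):           # warm-up
            model(dummy)
        torch.cuda.synchronize()      # GPU work is async; sync before timing
        start = time.perf_counter()
        for _ in range(iters):
            model(dummy)
        torch.cuda.synchronize()      # and sync again before reading the clock
    return (time.perf_counter() - start) / iters

def bench_ort(iters=100):
    feed = {"pixel_values": dummy.cpu().numpy()}
    for _ in range(10):               # warm-up (first runs include kernel setup)
        sess.run(None, feed)
    start = time.perf_counter()
    for _ in range(iters):
        sess.run(None, feed)
    return (time.perf_counter() - start) / iters

print(f"torch: {bench_torch() * 1e3:.2f} ms/iter")
print(f"ort:   {bench_ort() * 1e3:.2f} ms/iter")
```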

Urgency

Yes

Platform

Linux

OS Version

18.04

ONNX Runtime Installation

Released Package

ONNX Runtime Version or Commit ID

1.13.1

ONNX Runtime API

Python

Architecture

Other / Unknown

Execution Provider

CUDA

Execution Provider Library Version

No response

Chaoran-F commented 1 year ago

I hit problems with a MatMul node and a cuBLAS failure (status 14) when I use onnxruntime with CUDA to run inference on swin-t.onnx, but with the CPU execution provider there is no error. Do you have any idea?
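
For reference, this is roughly how I compare the two providers, with verbose logging enabled so the log shows which node triggers the cuBLAS failure (status 14 appears to correspond to CUBLAS_STATUS_INTERNAL_ERROR in the cuBLAS headers). The file name swin-t.onnx and the input name/shape are from my setup; substitute yours.

```python
import numpy as np
import onnxruntime as ort

# Dummy input; the name "pixel_values" and the 224x224 shape are assumptions.
x = np.random.randn(1, 3, 224, 224).astype(np.float32)

# Run the same model once per execution provider to isolate the failure.
for providers in (["CPUExecutionProvider"], ["CUDAExecutionProvider"]):
    so = ort.SessionOptions()
    so.log_severity_level = 0  # 0 = verbose; logs show which node/kernel fails
    sess = ort.InferenceSession("swin-t.onnx", sess_options=so,
                                providers=providers)
    try:
        out = sess.run(None, {"pixel_values": x})
        print(providers[0], "ok, output shape:", out[0].shape)
    except Exception as e:
        print(providers[0], "failed:", e)
```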