microsoft / onnxruntime

ONNX Runtime: cross-platform, high performance ML inferencing and training accelerator
https://onnxruntime.ai
MIT License

mT5 convert to ONNX and GPU inference problems #15042

Open shiqingzhangCSU opened 1 year ago

shiqingzhangCSU commented 1 year ago

Describe the issue

I am testing model inference for mT5. Using convert_generation.py, I converted mT5 to an ONNX beam-search model and got the output shown in the attached screenshot: the parity check for mT5 reports a maximum difference on the order of 1e-2, which is a bit high for text generation.
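The parity figure above is the maximum element-wise difference between the original PyTorch outputs and the exported ONNX model's outputs. A minimal sketch of such a check (the logits arrays here are illustrative placeholders, not real model outputs):

```python
import numpy as np

# Hypothetical logits from the PyTorch model and the exported ONNX model.
torch_logits = np.array([0.101, -2.500, 3.140])
onnx_logits = np.array([0.115, -2.492, 3.128])

# Parity = maximum absolute element-wise difference between the two outputs.
parity = np.max(np.abs(torch_logits - onnx_logits))
print(f"max diff: {parity:.3f}")  # on the order of 1e-2, as reported in the issue
```

A parity near float32 rounding error (1e-5 to 1e-6) is typical for a clean export; 1e-2 suggests a real discrepancy in the exported graph rather than accumulated rounding.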

To reproduce

My mT5 model: https://huggingface.co/ClueAI/ChatYuan-large-v1, with transformers==4.25.1 and Python 3.7.
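The conversion described above uses ONNX Runtime's convert_generation.py script. A hedged sketch of such an invocation (the flag values and output path are assumptions for illustration, not the reporter's exact command):

```shell
# Convert an mT5 model to a single ONNX model with a built-in BeamSearch op.
# Model ID and output name are illustrative; adjust to your setup.
python -m onnxruntime.transformers.convert_generation \
    -m ClueAI/ChatYuan-large-v1 \
    --model_type mt5 \
    --output ./chatyuan_beamsearch.onnx \
    --use_gpu
```

The script runs a parity test after conversion, which is where the 1e-2 figure in the report comes from.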

Urgency

No response

Platform

Linux

OS Version

7.4

ONNX Runtime Installation

Released Package

ONNX Runtime Version or Commit ID

1.14.0

ONNX Runtime API

Python

Architecture

X64

Execution Provider

CUDA

Execution Provider Library Version

CUDA 11.3

wangyems commented 1 year ago

hi @shiqingzhangCSU, this is a known issue. We'll take a look.

shiqingzhangCSU commented 1 year ago

Thanks!

shiqingzhangCSU commented 1 year ago

> hi @shiqingzhangCSU, this is a known issue. We'll take a look.

Is there any progress, please?

wangyems commented 1 year ago

There's no update yet.

ZavierXing commented 1 year ago

There is a diff between the t5_decoder.py code and the corresponding code in transformers; it needs to be modified (see the attached screenshot).

shiqingzhangCSU commented 1 year ago

> There is a diff between the t5_decoder.py code and the corresponding code in transformers; it needs to be modified.

Did you get it working? Is modifying these parts all that's needed?