huggingface / transformers

🤗 Transformers: State-of-the-art Machine Learning for Pytorch, TensorFlow, and JAX.
https://huggingface.co/transformers
Apache License 2.0

Onnx Runtime Errors With LongT5 #18243

Open reelmath opened 2 years ago

reelmath commented 2 years ago

System Info

Who can help?

@stancld @echarlaix @LysandreJik

Information

Tasks

Reproduction

LongT5 with TGlobal attention can't run sequences longer than global_block_size * 2 through the exported ONNX model. This is because during model tracing the Python comparison num_globals > 0 is evaluated once and baked in as False. I originally posted the error in Optimum (https://github.com/huggingface/optimum/issues/285), but @echarlaix asked me to open an issue here because the error concerns the ONNX export.
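The tracing behavior described above can be shown with a minimal sketch (hypothetical toy function, not the actual LongT5 code): a Python comparison derived from the example input's shape is evaluated once at trace time, so the recorded graph keeps that branch forever.

```python
import torch

def f(x):
    # Toy stand-in for LongT5's global-block logic: derive a Python int
    # from the input shape and branch on it in plain Python.
    num_globals = x.shape[1] // 16
    if num_globals > 0:       # evaluated once at trace time, not per call
        return x + 1
    return x - 1

# Trace with a short example input, so num_globals == 0 here.
traced = torch.jit.trace(f, torch.zeros(1, 8))

# Even for a longer input (num_globals would be 4), the traced graph
# still follows the False branch recorded during tracing.
out = traced(torch.zeros(1, 64))
print(torch.equal(out, torch.zeros(1, 64) - 1))  # True: wrong branch baked in
```

The same constant-folding happens during `torch.onnx.export`, which traces the model, so shape-dependent Python control flow has to be rewritten with tensor ops (or scripted) to survive export.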

Code to reproduce is below:

!pip install transformers
!pip install transformers[onnx]
!python -m pip install git+https://github.com/huggingface/optimum.git
!python -m pip install "git+https://github.com/huggingface/optimum.git#egg=optimum[onnxruntime]"
!pip install datasets
from optimum.onnxruntime import ORTModelForSeq2SeqLM

model = ORTModelForSeq2SeqLM.from_pretrained("longt5-tglobal-base", from_transformers=True)
from transformers import AutoTokenizer, pipeline

tokenizer = AutoTokenizer.from_pretrained('google/long-t5-tglobal-base')

onnx_summarization = pipeline("summarization", model=model, tokenizer=tokenizer)

text = "..."  # any input longer than 32 tokens (global_block_size * 2 with the default config)
pred = onnx_summarization(text)
RuntimeException: [ONNXRuntimeError] : 6 : RUNTIME_EXCEPTION : Non-zero status code returned while running LessOrEqual node. Name:'LessOrEqual_648' Status Message: /onnxruntime_src/onnxruntime/core/providers/cpu/math/element_wise_ops.h:603 onnxruntime::Broadcaster::Broadcaster(gsl::span, gsl::span) largest <= 1 was false. Can broadcast 0 by 0 or 1. 16 is invalid.
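The broadcast failure in the LessOrEqual node can be mimicked in plain NumPy, which follows the same broadcasting rules as ONNX Runtime: a dimension that was baked to size 0 at export time cannot broadcast against the real runtime size of 16.

```python
import numpy as np

a = np.zeros((0,))   # dimension traced/folded to 0 at export time
b = np.zeros((16,))  # actual size at runtime (default global_block_size)

try:
    a <= b           # the same comparison the LessOrEqual node performs
except ValueError as e:
    # 0 vs 16: neither equal nor 1, so broadcasting is invalid
    print("broadcast failed:", e)
```

This matches the runtime message "Can broadcast 0 by 0 or 1. 16 is invalid."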

Expected behavior

The exported model should handle sequence lengths well beyond global_block_size * 2 with the default global block size, without error.

LysandreJik commented 2 years ago

Hey @reelmath, thanks for opening an issue, it seems you and @echarlaix managed to find the source of the problem.

We unfortunately don't have a lot of bandwidth to dive into that code ourselves, so I'll add an onnx tag and a Good second issue tag so that experienced users know this is an issue they could fix. If you'd like to try your hand at it, please go ahead!

yhl48 commented 2 years ago

Hi, I would like to work on this if it has not been assigned to anyone, but could take some time if that is ok?

patrickvonplaten commented 2 years ago

Hey @yhl48, this would be great indeed :-)

JuheonChu commented 2 years ago

Hello @reelmath, I tried to reproduce the error with my own setup and ran into the same errors as you.


yhl48 commented 1 year ago

It looks like the pretrained model is not available anymore?

Upon running the following line

model = ORTModelForSeq2SeqLM.from_pretrained("longt5-tglobal-base", from_transformers=True)

The following error was raised

OSError: longt5-tglobal-base is not a local folder and is not a valid model identifier listed on 'https://huggingface.co/models'
If this is a private repository, make sure to pass a token having permission to this repo with `use_auth_token` or log in with `huggingface-cli login` and pass `use_auth_token=True`.
stancld commented 1 year ago

@yhl48 I think you need to use the google/long-t5-tglobal-base model name

yhl48 commented 1 year ago

Thanks @stancld!

Has this issue been resolved? I can no longer replicate the error.