chainyo / transformers-pipeline-onnx

How to export Hugging Face's 🤗 NLP Transformers models to ONNX and use the exported model with the appropriate Transformers pipeline.

GPT2 text generation pipeline #1

Closed C00reNUT closed 1 year ago

C00reNUT commented 2 years ago

Hello, thank you for this tutorial. I have tried to modify the code to use the text generation pipeline with the GPT-2 model. The problem is that vanilla PyTorch performs better than the ONNX-optimized model. This holds on my home setup and also on Colab Pro with T4 and P100 GPUs.

[image: benchmark results comparing PyTorch and ONNX generation times]
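For reference, here is a minimal sketch of the kind of PyTorch-vs-ONNX comparison described above, using Hugging Face's optimum library (not the exact benchmark behind the screenshot; the model name, prompt, and generation length are illustrative, and `export=True` assumes a recent optimum version):

```python
# A minimal sketch of benchmarking vanilla PyTorch against an ONNX export
# of GPT-2 via Hugging Face's optimum library. Not the code from the
# thread; prompt, token count, and model name are illustrative.
import time

from optimum.onnxruntime import ORTModelForCausalLM
from transformers import AutoModelForCausalLM, AutoTokenizer, pipeline

model_id = "gpt2"
tokenizer = AutoTokenizer.from_pretrained(model_id)

# Vanilla PyTorch pipeline.
pt_model = AutoModelForCausalLM.from_pretrained(model_id)
pt_pipe = pipeline("text-generation", model=pt_model, tokenizer=tokenizer)

# ONNX Runtime pipeline; export=True converts the checkpoint on the fly
# (older optimum versions used from_transformers=True instead).
ort_model = ORTModelForCausalLM.from_pretrained(model_id, export=True)
ort_pipe = pipeline("text-generation", model=ort_model, tokenizer=tokenizer)

prompt = "My name is Philipp and I"
for name, pipe in [("pytorch", pt_pipe), ("onnx", ort_pipe)]:
    start = time.perf_counter()
    pipe(prompt, max_new_tokens=50, do_sample=False)
    print(f"{name}: {time.perf_counter() - start:.3f}s")
```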

I have also tried the text generation pipeline in the https://github.com/AlekseyKorshuk/optimum-transformers library, but the results are similar: the ONNX model is still slower.

Do you have any idea what could be the problem?

chainyo commented 2 years ago

Do you have any idea what could be the problem?

Hello, thanks for the feedback!

I have already read about that kind of issue; it could come from the ONNX Runtime library, which doesn't optimize well for GPT models.

Check this issue, which is related to T5, but I think it also applies to GPT models: https://github.com/microsoft/onnxruntime/issues/6835

Also take a look at this: https://github.com/microsoft/onnxruntime/issues/11293
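To illustrate the usual explanation behind reports like those: if the export does not include past key/value state, an autoregressive generation loop re-runs the full model over the whole sequence for every new token, so per-token cost grows with sequence length, while PyTorch's `generate` caches past state. A rough sketch of such a naive loop (assumptions: a plain `gpt2.onnx` export whose inputs are `input_ids` and `attention_mask` and whose first output is the logits; actual names vary by export tool):

```python
# A rough sketch (not from the thread) of a naive ONNX generation loop.
# Assumes a GPT-2 export "gpt2.onnx" with inputs input_ids/attention_mask
# and logits as the first output; names vary by export tool.
import numpy as np
import onnxruntime as ort
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
session = ort.InferenceSession("gpt2.onnx", providers=["CPUExecutionProvider"])

input_ids = tokenizer("My name is Philipp and I", return_tensors="np").input_ids

for _ in range(20):
    # No cached past state: the whole prefix is re-encoded on every step,
    # so each new token costs more than the last.
    attention_mask = np.ones_like(input_ids)
    logits = session.run(
        None, {"input_ids": input_ids, "attention_mask": attention_mask}
    )[0]
    next_id = logits[:, -1, :].argmax(axis=-1).reshape(-1, 1)
    input_ids = np.concatenate([input_ids, next_id], axis=-1)

print(tokenizer.decode(input_ids[0]))
```

Exporting the model with past key/value support (for example via ONNX Runtime's own GPT-2 conversion tooling under `onnxruntime.transformers`, or optimum's cached decoder exports) avoids this recomputation.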