VikasOjha666 opened this issue 2 years ago (status: Open)
Will add to our backlog.
Hello,
After using Hugging Face Optimum, I found that seq2seq support will soon be possible (https://github.com/huggingface/optimum/pull/199). It works great without optimization, but to optimize T5 models we would need an onnxruntime/transformers/onnx_model_XXX.py, and the T5 one is missing.
(More details on their forum: https://discuss.huggingface.co/t/optimum-t5-for-inference/16695/5)
Do you have any status update on that? I'd be able to spend some time on it if needed.
Thanks in advance, have a great day!
@Ierezell, optimization of the T5 model is planned (likely in the 1.13 release). Contributions are welcome.
Did this happen? I'm still seeing this message:
KeyError: "ONNX Runtime doesn't support the graph optimization of t5 yet. Only ['bert', 'gpt2', 'bart'] are supported. If you want to support t5 please propose a PR or open up an issue in ONNX Runtime:https://github.com/microsoft/onnxruntime."
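For context, that KeyError comes from a model-type lookup inside `onnxruntime.transformers.optimizer`. A minimal sketch of that dispatch (with placeholder class names, not the actual onnxruntime source) illustrates why an unregistered type fails:

```python
# Hypothetical, simplified sketch of the model_type dispatch in
# onnxruntime.transformers.optimizer. The class names below are placeholders
# standing in for the per-model onnx_model_XXX.py fusion modules.
SUPPORTED_FUSION_OPTIMIZERS = {
    "bert": "OnnxModelBert",  # placeholder for onnx_model_bert.py
    "gpt2": "OnnxModelGpt2",  # placeholder for onnx_model_gpt2.py
    "bart": "OnnxModelBart",  # placeholder for onnx_model_bart.py
}

def lookup_fusion_optimizer(model_type: str) -> str:
    """Return the fusion-optimizer name for a model type; raise KeyError
    for unsupported types, as seen above for t5."""
    if model_type not in SUPPORTED_FUSION_OPTIMIZERS:
        raise KeyError(
            f"ONNX Runtime doesn't support the graph optimization of {model_type} yet. "
            f"Only {list(SUPPORTED_FUSION_OPTIMIZERS)} are supported."
        )
    return SUPPORTED_FUSION_OPTIMIZERS[model_type]
```

Adding T5 support means registering a T5 fusion module in that table, which is what this issue is requesting.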
@wangyems, could you give an update on the T5 optimizations?
Can anyone tell me the progress of the inclusion of T5 into ORTOptimizer / ORTQuantizer?
The T5 optimizer is complete. Try the following to generate an optimized fp16 model:
python -m onnxruntime.transformers.models.t5.convert_to_onnx -m t5-small --output ./onnx -o --use_gpu -p fp16
You can also try beam search optimization with T5:
python -m onnxruntime.transformers.convert_generation -m t5-small --model_type t5 --output t5_small_beam_search.onnx --use_gpu --past_present_share_buffer --use_decoder_masked_attention
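Once the beam-search model is exported, it can be run with a plain `onnxruntime.InferenceSession`. A hedged sketch of building the input feed is below; the input names follow the `convert_generation` convention I've seen, but they may differ by version, so verify them against your exported graph (e.g. with Netron) before relying on this:

```python
import numpy as np

def build_beam_search_inputs(token_ids, num_beams=4, max_length=64):
    """Assemble an input feed for a beam-search ONNX model exported by
    convert_generation. Input names are assumptions based on the exporter's
    convention -- check session.get_inputs() on your own model."""
    return {
        "input_ids": np.asarray([token_ids], dtype=np.int32),       # batch of 1
        "max_length": np.asarray([max_length], dtype=np.int32),
        "min_length": np.asarray([1], dtype=np.int32),
        "num_beams": np.asarray([num_beams], dtype=np.int32),
        "num_return_sequences": np.asarray([1], dtype=np.int32),
        "length_penalty": np.asarray([1.0], dtype=np.float32),
        "repetition_penalty": np.asarray([1.0], dtype=np.float32),
    }
```

Usage would then be roughly `session.run(None, build_beam_search_inputs(ids))` on an `InferenceSession` opened over t5_small_beam_search.onnx, with the first output being the generated token sequences.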
Is your feature request related to a problem? Please describe.
No, it's not a problem but a feature request.
Describe the solution you'd like
As of now, layer-fusion-based optimization is available for BERT, GPT-2, BART, etc., but not for T5. It would be good to implement it for T5 as well, since the T5 model is getting quite popular.