Matthieu-Tinycoaching opened this issue 1 year ago
Hi @Matthieu-Tinycoaching, thanks for the report. The error message is a bit misleading; it actually means that this architecture should be added in https://github.com/huggingface/optimum/blob/4ea4baa77f8030a83157c0e6abbd750e61ad45da/optimum/onnxruntime/utils.py#L97. I can fix this shortly!
@Matthieu-Tinycoaching @fxmarty Does fp16 optimization work for flan-t5 on GPU?
@fxmarty It seems that this issue has been reported as fixed in the onnxruntime repo: https://github.com/microsoft/onnxruntime/issues/14886
Is anything further required to enable optimization for Flan-T5-Large?
Feature request
Would it be possible to add GPU graph optimizations for the Flan-T5-Large model? (Also requested at https://github.com/microsoft/onnxruntime/issues/14886.)
After exporting the model to ONNX and trying to optimize it with ORTOptimizer as below:
I got the following error message:
Motivation
Optimize the performance (latency/throughput) of the Flan-T5-Large model.
Your contribution
I could beta-test the solution.