huggingface / optimum

🚀 Accelerate training and inference of 🤗 Transformers and 🤗 Diffusers with easy to use hardware optimization tools
https://huggingface.co/docs/optimum/main/
Apache License 2.0

add support for nllb #354

Open hust-kevin opened 2 years ago

hust-kevin commented 2 years ago

Feature request

Add support for converting NLLB to the ONNX format.

Motivation

I want to convert NLLB to ONNX. I use ORTModelForSeq2SeqLM, but get the following error:

```
onnxruntime.capi.onnxruntime_pybind11_state.Fail: [ONNXRuntimeError] : 1 : FAIL : Deserialize tensor onnx::MatMul_3635 failed. tensorprotoutils.cc:637 TensorProtoToTensor External initializer: onnx::MatMul_3635 offset: 0 size to read: 16777216 given file_length: 4194304 are out of bounds or can not be read in full.
```
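For context, the error appears to involve ONNX external data: models above the 2 GB protobuf limit store their weights in separate files, and here ONNX Runtime tries to read past the end of one of them (size to read: 16777216 vs. file_length: 4194304). Below is a minimal sketch of the kind of export that triggers this; the issue does not say which NLLB checkpoint was used, so `facebook/nllb-200-distilled-600M` is assumed for illustration.

```python
# Minimal reproduction sketch. The exact checkpoint is not named in the issue;
# facebook/nllb-200-distilled-600M is assumed here.
from transformers import AutoTokenizer
from optimum.onnxruntime import ORTModelForSeq2SeqLM

model_id = "facebook/nllb-200-distilled-600M"  # assumed checkpoint

tokenizer = AutoTokenizer.from_pretrained(model_id)
# Exports the PyTorch model to ONNX on the fly. On older optimum releases
# the flag is `from_transformers=True` instead of `export=True`.
model = ORTModelForSeq2SeqLM.from_pretrained(model_id, export=True)
model.save_pretrained("nllb-onnx")
```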

Your contribution

None

caffeinetoomuch commented 2 years ago

I was getting a similar error when trying to load google/long-t5-tglobal-xl with ORTModelForSeq2SeqLM. In my experience, these errors show up with larger models.

nickchomey commented 1 year ago

@hust-kevin did you ever succeed with this?

fxmarty commented 1 year ago

@nickchomey Do you have the same issue? I can try and have a look shortly.

nickchomey commented 1 year ago

@fxmarty I actually haven't tried yet. I was just browsing for info on using HF Optimum to convert NLLB to ONNX, then optimize and quantize it. After that I'd like to compare its performance with CTranslate2 (NLLB-200 with CTranslate2) and hopefully eliminate a dependency.

I assume you have everything already set up to do an easy check, so it would be much appreciated if you could do so!
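For anyone landing here with the same plan, here is a minimal sketch of the optimize-and-quantize step nickchomey describes, assuming the export above succeeded and was saved to `nllb-onnx`; the directory names and the quantization config are illustrative, not from the thread, and the per-file names follow optimum's seq2seq export layout.

```python
# Dynamic int8 quantization of an exported seq2seq model with Optimum.
# Assumes the export produced encoder_model.onnx, decoder_model.onnx and
# decoder_with_past_model.onnx in "nllb-onnx" (optimum's seq2seq layout).
from optimum.onnxruntime import ORTQuantizer
from optimum.onnxruntime.configuration import AutoQuantizationConfig

# Dynamic quantization config for AVX512-VNNI CPUs; pick one matching your CPU.
qconfig = AutoQuantizationConfig.avx512_vnni(is_static=False, per_channel=False)

for file_name in (
    "encoder_model.onnx",
    "decoder_model.onnx",
    "decoder_with_past_model.onnx",
):
    quantizer = ORTQuantizer.from_pretrained("nllb-onnx", file_name=file_name)
    quantizer.quantize(save_dir="nllb-onnx-quantized", quantization_config=qconfig)
```

Loading the quantized directory back with ORTModelForSeq2SeqLM.from_pretrained should then allow a like-for-like latency comparison against CTranslate2.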