ELS-RD / transformer-deploy

Efficient, scalable and enterprise-grade CPU/GPU inference server for 🤗 Hugging Face transformer models 🚀
https://els-rd.github.io/transformer-deploy/
Apache License 2.0

[Question] Converting models over 2GB #138

Closed: wkkautas closed this issue 2 years ago

wkkautas commented 2 years ago

Thank you for the excellent work!

When I use the convert_model command in ghcr.io/els-rd/transformer-deploy:0.5.1 to convert a model larger than 2 GB, the following error occurs:

[09/13/2022-08:29:13] [TRT] [E] onnx2trt_utils.cpp:741: Size mismatch when importing initializer: roberta.embeddings.word_embeddings.weight. Expected size: 0 , actual size: 1024008192

Editing the following line to set load_external_data=True resolves the error, but is that the right workaround?

https://github.com/ELS-RD/transformer-deploy/blob/cc781dbe925cccdc309e0a96501dc20b979b4627/src/transformer_deploy/backends/pytorch_utils.py#L168

Thanks,

ayoub-louati commented 2 years ago

@wkkautas Yes, that is the right approach for large models (> 2 GB): load_external_data should be set to True. Thank you for reporting this issue.
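
For context, here is a minimal sketch of what the flag does, assuming the line in question calls onnx.load (the thread does not quote it). ONNX serializes models with protobuf, which caps a single serialized message at 2 GB, so larger models keep their weights in separate files next to the .onnx file. With load_external_data=False only the graph structure is read and the weight tensors stay empty, which may be why the parser reports "Expected size: 0". The helper names below are hypothetical and are not part of transformer-deploy.

```python
# Hypothetical helpers for illustration; this is not transformer-deploy code.

# protobuf cannot serialize a single message larger than 2**31 - 1 bytes
PROTOBUF_LIMIT_BYTES = 2**31 - 1


def needs_external_data(total_weight_bytes: int) -> bool:
    """Return True when the weights cannot fit in a single protobuf file."""
    return total_weight_bytes > PROTOBUF_LIMIT_BYTES


def load_onnx_with_weights(model_path: str):
    """Load an ONNX model together with its external weight files."""
    import onnx  # imported lazily so the size check works without onnx installed

    # load_external_data=True also reads the weight files stored next to model_path
    return onnx.load(model_path, load_external_data=True)
```

The external weight files must stay in the same directory as the .onnx file for the load to succeed.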

ayoub-louati commented 2 years ago

@wkkautas Could you please share the name of the model you were testing when you got this error?

wkkautas commented 2 years ago

Thank you for the confirmation and fix! I was using xlm-roberta-large.
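
The "actual size" in the error log is consistent with that model. A quick check, assuming the published xlm-roberta-large configuration (a vocabulary of 250002 tokens and a hidden size of 1024) and float32 weights; none of these figures are stated in the thread:

```python
# Assumed values from the public xlm-roberta-large configuration, not from this thread.
VOCAB_SIZE = 250002
HIDDEN_SIZE = 1024
BYTES_PER_FLOAT32 = 4


def embedding_nbytes(vocab_size: int, hidden_size: int, bytes_per_value: int) -> int:
    """Size in bytes of a dense (vocab_size x hidden_size) embedding matrix."""
    return vocab_size * hidden_size * bytes_per_value


word_embeddings_bytes = embedding_nbytes(VOCAB_SIZE, HIDDEN_SIZE, BYTES_PER_FLOAT32)
print(word_embeddings_bytes)  # 1024008192, matching "actual size" in the error log
```

The word embedding matrix alone is about 1 GB, so it is the full model rather than this single tensor that crosses the 2 GB limit.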