huggingface / optimum

🚀 Accelerate training and inference of 🤗 Transformers and 🤗 Diffusers with easy to use hardware optimization tools
https://huggingface.co/docs/optimum/main/
Apache License 2.0

model.ByteSize() is negative when converting microsoft/phi2 model #1642

Open guotuofeng opened 10 months ago

guotuofeng commented 10 months ago

System Info

* optimum: 1.16.1
* Windows amd64
* Python 3.8.18
* onnxruntime nightly build
* onnx 1.15.0
* protobuf 3.20.3
* torch 2.1.2

Who can help?

No response

Reproduction (minimal, reproducible, runnable)

Run the following command:

python -m optimum.exporters.onnx -m microsoft/phi-2 --library-name transformers .

The following error is raised: [image]

I added a one-line print to graph_transformations.py: [image]

Expected behavior

Should we add the following checks? [image]

I'm just not sure why model.ByteSize() returns -1765569341.
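One possible reading of the negative value, offered here as an assumption rather than a confirmed diagnosis, is that the true size was truncated to a signed 32-bit integer. The sketch below reinterprets the reported number as unsigned 32-bit and lists sizes that would wrap to the same value. The helper name `as_int32` is made up for illustration.

```python
def as_int32(value: int) -> int:
    """Reinterpret the low 32 bits of `value` as a signed two's-complement int."""
    value &= 0xFFFFFFFF
    return value - (1 << 32) if value >= (1 << 31) else value


reported = -1765569341           # value printed in the issue
low32 = reported & 0xFFFFFFFF    # same bits read as unsigned: 2529397955

# Any true size of the form low32 + k * 2**32 would wrap to the reported value.
candidates = [low32 + k * 2**32 for k in range(3)]

for size in candidates:
    print(size, "bytes wraps to", as_int32(size))
```

Under this assumption the smallest candidate is 2529397955 bytes (about 2.36 GiB), which is already above the 2 GiB that a signed 32-bit size can represent.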

fxmarty commented 10 months ago

Thank you, related to https://github.com/huggingface/optimum/issues/1044 & https://github.com/microsoft/Olive/blob/697948c2a1f7fe938609e1c97060d17f255c322e/olive/passes/onnx/optimum_merging.py#L44-L49

This is a bug in ModelProto.ByteSize() on Windows only.

As a workaround, can you try: python -m optimum.exporters.onnx -m microsoft/phi-2 --library-name transformers . --no-post-process

It would be great if you could open an issue at https://github.com/onnx/onnx, sharing the ONNX model there along with a small reproduction like:

import onnx

# Placeholder: path to the exported ONNX model
model_path = "model.onnx"
model = onnx.load(model_path)

print(model.ByteSize())

guotuofeng commented 10 months ago

Thanks for the info. I just created the issue in the ONNX repo.

xadupre commented 10 months ago

How do you use ByteSize()? Maybe we can implement a function that returns the result you build with it. I don't think protobuf will update its API, since it is not meant to support models bigger than 2 GB.
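As an illustration of what such a function could look like, here is a minimal sketch of a size check that treats a negative ByteSize() as "too large" instead of trusting it. The helper name `exceeds_protobuf_limit` and the constant `INT32_MAX` are hypothetical. This is not the check from the screenshot above, and it is not code from optimum or onnx.

```python
from types import SimpleNamespace

# Assumption: the usable limit is the largest signed 32-bit value.
INT32_MAX = 2**31 - 1


def exceeds_protobuf_limit(model) -> bool:
    """Return True if `model` looks too large to serialize as one protobuf.

    A negative ByteSize() is treated as an overflowed size, i.e. too large.
    `model` only needs a ByteSize() method, such as an onnx ModelProto.
    """
    size = model.ByteSize()
    return size < 0 or size >= INT32_MAX


# Stand-ins for real models, so the sketch runs without loading phi-2.
overflowed = SimpleNamespace(ByteSize=lambda: -1765569341)
small = SimpleNamespace(ByteSize=lambda: 10_000_000)

print(exceeds_protobuf_limit(overflowed))
print(exceeds_protobuf_limit(small))
```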

The other option is to export the model with external weights enabled. A new API, https://onnx.ai/onnx/api/model_container.html, was introduced to make it easier to build such a model with external weights without serializing the weights. That would be the direction I would recommend.