ELS-RD / transformer-deploy

Efficient, scalable and enterprise-grade CPU/GPU inference server for 🤗 Hugging Face transformer models 🚀
https://els-rd.github.io/transformer-deploy/
Apache License 2.0
1.65k stars 150 forks

Optimising ONNX Graph either takes too long or doesn't seem to work #109

Open accountForIssues opened 2 years ago

accountForIssues commented 2 years ago

Using the GPT2 Notebook, I am trying to convert a gpt2 model to an optimised ONNX graph and I'm stuck at what seems to be random behaviour.

The export to ONNX works fine. However, while optimising the ONNX graph, I usually see warnings like `WARNING:symbolic_shape_infer:Cannot determine if Reshape_560_o0__d1 - sequence < 0` repeated over and over until I have to stop the kernel.
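This warning comes from ONNX Runtime's symbolic shape inference pass, which cannot prove that a symbolic dimension expression (here `Reshape_560_o0__d1 - sequence`) is non-negative. One way to debug it outside the notebook is to run the pass standalone; its `--auto_merge` flag, which merges symbolic dimensions the pass cannot otherwise relate, often resolves these warnings. A sketch, assuming the exported model sits at `model.onnx` (path hypothetical):

```shell
# Run ONNX Runtime's symbolic shape inference on its own, with verbose
# logging, to see where the pass gets stuck. --auto_merge lets it merge
# symbolic dims it cannot prove equal, which often silences these warnings.
python -m onnxruntime.tools.symbolic_shape_infer \
    --input model.onnx \
    --output model_shaped.onnx \
    --auto_merge \
    --verbose 1
```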

It did work once or twice (in the same environment) and it took about 30 seconds so I have no idea what changed.

I barely even changed the code. I'm just following the notebook.

What does the warning mean, and how can I get back to a stable optimisation?

pommedeterresautee commented 2 years ago

Which version of PyTorch are you using?

accountForIssues commented 2 years ago

I've tried with both 1.11 and 1.12.

Is there a recommended way or guide for setting up an environment to use this library? Maybe there's a package conflict somewhere that I'm overlooking.
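When comparing environments like this, a quick first step is to dump the versions of the likely culprits side by side. A minimal sketch using only the standard library (the package list is an assumption mirroring the dependencies discussed in this thread; adjust it to your own setup):

```python
from importlib.metadata import version, PackageNotFoundError

# Packages most likely to affect ONNX graph optimisation (assumed list
# based on this thread; edit to match your environment).
SUSPECTS = [
    "torch", "onnx", "onnxruntime-gpu", "onnx-graphsurgeon",
    "onnxconverter-common", "tf2onnx", "pytorch-quantization",
]

def report_versions(packages):
    """Return {package: version string or None} for quick comparison."""
    found = {}
    for name in packages:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            found[name] = None  # not installed in this environment
    return found

if __name__ == "__main__":
    for name, ver in report_versions(SUSPECTS).items():
        print(f"{name:25} {ver or 'not installed'}")
```

Running this in both a working and a broken environment and diffing the output narrows down which upgrade introduced the problem.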

pommedeterresautee commented 2 years ago

I just reran the notebook and had no issue. I imagine it's a dependency version problem. The ones I would check are those related to ONNX and PyTorch, as they are the only two things involved in the ONNX graph.

```
❯ pip list | grep onnx
onnx                      1.12.0
onnx-graphsurgeon         0.3.19
onnxconverter-common      1.9.0
onnxruntime-gpu           1.12.0
onnxruntime-tools         1.7.0
tf2onnx                   1.11.1
❯ pip list | grep torch
pytorch-quantization      2.1.2
torch                     1.11.0+cu113
```

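To reproduce this environment, one option is to pin these exact versions in a fresh virtual environment. A sketch under stated assumptions: the version numbers are copied from the `pip list` output above, the CUDA 11.3 PyTorch index URL is inferred from the `+cu113` suffix, and `pytorch-quantization` may require NVIDIA's package index depending on the version:

```shell
# Fresh virtual environment pinned to the maintainer's reported versions.
python -m venv .venv && . .venv/bin/activate

# torch build inferred from the "+cu113" suffix in the pip list above.
pip install torch==1.11.0+cu113 \
    --extra-index-url https://download.pytorch.org/whl/cu113

# ONNX-related packages, pinned to the versions shown above.
pip install onnx==1.12.0 onnxruntime-gpu==1.12.0 \
    onnx-graphsurgeon==0.3.19 onnxconverter-common==1.9.0 \
    tf2onnx==1.11.1 onnxruntime-tools==1.7.0
```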
accountForIssues commented 2 years ago

Maybe solved.

I created a new Docker image based on the latest CUDA runtime and installed each package manually.

I can confirm that the latest torch triggers this issue. That said, I remember getting this error with an older torch image as well, so I suspect another package could also be contributing.

In any case, I will keep testing to see if it breaks again. Hopefully you'll come across this as well when you update the Docker image and can solve it :)