openvinotoolkit / openvino

OpenVINO™ is an open-source toolkit for optimizing and deploying AI inference
https://docs.openvino.ai
Apache License 2.0

Inquiry About OpenVINO Conversion Best Practices #25934

Closed QuyLe-Minh closed 2 weeks ago

QuyLe-Minh commented 1 month ago

Hi,

I wanted to express my gratitude for your excellent work. Thanks to your contributions, my model can now run in real-time with just a few additional lines of code. However, I have a question about how this particular setup works.

Suppose I have a "super model" that consists of multiple components. What is the best practice for converting it to OpenVINO? Should I convert each component separately, or can I convert the entire "super model" as a whole?

Additionally, I've noticed that when I convert the "super model" and then run inference, sometimes OpenVINO performs slower than using just the CPU. However, when I use ONNX as an intermediate step, the performance improves significantly. Could you provide some insights into this?

Best regards,

andrei-kochin commented 1 month ago

@QuyLe-Minh thank you for reaching out to the OpenVINO team!

Unfortunately it is very hard to help, as the details are hidden behind a high level of abstraction.

If I understand you correctly, in general for pipeline models we recommend using https://github.com/huggingface/optimum-intel to convert the model; it can then be inferred using the https://github.com/openvinotoolkit/openvino.genai pipelines, or you can continue with Optimum.

If you could be more specific, I could give you better advice.

Best regards, Andrei

QuyLe-Minh commented 1 month ago

Hi,

Thanks for getting back to me. I've developed a large model in PyTorch that includes multiple sub-models, such as image enhancement, landmark detection, headpose detection, optical flow, and so on. When the main model is called, each sub-model is executed sequentially, with the output of one serving as the input to the next.

Could you provide advice on how to convert this model? I referred to the links you provided, but they don't seem to address my specific needs.

Thanks!!

eaidova commented 1 month ago

@QuyLe-Minh what do you mean by "when I use ONNX as an intermediate step, the performance improves significantly"? Do you convert the model to ONNX and then to OpenVINO (or pass the ONNX model to OpenVINO), or do you use some different runtime for that? Which commands do you use to export the model to ONNX and to OpenVINO? Do you specify dynamic axes during conversion to ONNX? If not, I guess the main difference between the model converted to OpenVINO directly from PyTorch and the one converted via ONNX is dynamic vs. static input shapes. You can try to specify the desired input shapes using the `input` parameter of `convert_model` to get an OpenVINO model with static shapes, which may improve inference performance.

andrei-kochin commented 4 weeks ago

@QuyLe-Minh any updates for us on the above query?

QuyLe-Minh commented 4 weeks ago

Oh, sorry, I almost forgot to reply. I converted the model to ONNX and then used that ONNX file to convert it to OpenVINO; in other words, I passed the ONNX model to OpenVINO, and it significantly reduced the runtime. I also specified the desired input shapes in all cases, and this approach seems to give the best performance.

As for the large model I mentioned earlier, I believe I've managed it well through some experiments :)