I've added support for 5 quantized models here: config.
Can you provide more details about the failure cases of the quantized model?
Anyway, you're welcome to contribute!
I tested with resnet50 quantized to int8 by onnxruntime static quantization. It shows that QLinearLeakyRelu/QLinearAdd/QGemm are not supported yet.
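For context, the setup can be reproduced with something like the sketch below. The model paths, input name, and random calibration data are placeholders; a real `CalibrationDataReader` should feed representative samples. Note that the `QOperator` format is what produces the fused `QLinear*` operators mentioned above, while the default QDQ format inserts `QuantizeLinear`/`DequantizeLinear` pairs instead.

```python
import numpy as np
from onnxruntime.quantization import (
    CalibrationDataReader,
    QuantFormat,
    QuantType,
    quantize_static,
)

class RandomDataReader(CalibrationDataReader):
    """Minimal calibration reader feeding random batches (placeholder data)."""

    def __init__(self, input_name="data", num_batches=8):
        self.input_name = input_name
        self.batches = (
            np.random.rand(1, 3, 224, 224).astype(np.float32)
            for _ in range(num_batches)
        )

    def get_next(self):
        batch = next(self.batches, None)
        return None if batch is None else {self.input_name: batch}

quantize_static(
    "resnet50.onnx",        # float32 input model (placeholder path)
    "resnet50_int8.onnx",   # int8 output model (placeholder path)
    RandomDataReader(),
    quant_format=QuantFormat.QOperator,  # emits QLinearAdd/QGemm/... contrib ops
    activation_type=QuantType.QInt8,
    weight_type=QuantType.QInt8,
)
```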
Hi, I've checked: QLinearAdd, QGemm, and QLinearLeakyRelu are onnxruntime contrib_ops, which means they are not part of the standard ONNX operator set. A model containing these operators can only be run by onnxruntime, which goes against onnx_tool's design.
That said, you are welcome to contribute these ops, but I'd ask that you group them under onnxruntime: create a new ort_node.py file or a separate node registry for ORT nodes, since these op names can conflict across different inference engines. A sketch of such a registry follows below.
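As a rough illustration of the idea, a domain-scoped registry keeps engine-specific ops from colliding with standard ones. This is a hypothetical sketch only; the names here (`NODE_REGISTRY`, `register_node`, `QLinearAddNode`) are illustrative and not onnx_tool's actual API.

```python
from collections import defaultdict

# Hypothetical registry keyed by operator domain: domain -> {op_type: class}.
NODE_REGISTRY = defaultdict(dict)

def register_node(op_type, domain="ai.onnx"):
    """Register a node class under a specific domain (illustrative helper)."""
    def wrapper(cls):
        NODE_REGISTRY[domain][op_type] = cls
        return cls
    return wrapper

# ORT contrib ops would live in their own module (e.g. ort_node.py) and
# register under "com.microsoft", so this "QLinearAdd" cannot collide with
# an op of the same name from another inference engine.
@register_node("QLinearAdd", domain="com.microsoft")
class QLinearAddNode:
    def profile(self, inputs):
        ...
```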
Thanks. These operators actually belong to the 'com.microsoft' domain, which does not exist in the ONNX spec. I will think about how to handle them.
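For what it's worth, one way to detect such operators up front is to scan the domains of a model's nodes with the onnx package (the model path below is a placeholder). Contrib ops show up under "com.microsoft", while standard ops use the empty/default domain:

```python
import onnx

# Collect the (domain, op_type) pairs the model uses; an empty domain
# string means the default "ai.onnx" domain.
model = onnx.load("resnet50_int8.onnx")  # placeholder path
used = {(node.domain or "ai.onnx", node.op_type) for node in model.graph.node}
for domain, op_type in sorted(used):
    print(f"{domain}: {op_type}")
```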
I find this project very useful and have integrated it into my daily work. I wonder if you have any plans to support quantized ONNX models (quantized by onnxruntime)? If not, I'd be happy to contribute.