I've added support for 5 quantized models here: config.
Can you provide more details about the failure cases of the quantized model?
Anyway, you're welcome to contribute!
I tested with resnet50 quantized to int8 by onnxruntime static quantization. It shows that QLinearLeakyRelu/QLinearAdd/QGemm are not supported yet.
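For context, the setup can be reproduced with something like the sketch below. The model paths, input name, and random calibration data are placeholders; a real `CalibrationDataReader` should feed representative samples. Note that the `QOperator` format is what produces the fused `QLinear*` operators mentioned above, while the default QDQ format inserts `QuantizeLinear`/`DequantizeLinear` pairs instead.

```python
import numpy as np
from onnxruntime.quantization import (
    CalibrationDataReader,
    QuantFormat,
    QuantType,
    quantize_static,
)

class RandomDataReader(CalibrationDataReader):
    """Minimal calibration reader feeding random batches (placeholder data)."""

    def __init__(self, input_name="data", num_batches=8):
        self.input_name = input_name
        self.batches = (
            np.random.rand(1, 3, 224, 224).astype(np.float32)
            for _ in range(num_batches)
        )

    def get_next(self):
        batch = next(self.batches, None)
        return None if batch is None else {self.input_name: batch}

quantize_static(
    "resnet50.onnx",        # float32 input model (placeholder path)
    "resnet50_int8.onnx",   # int8 output model (placeholder path)
    RandomDataReader(),
    quant_format=QuantFormat.QOperator,  # emits QLinearAdd/QGemm/... contrib ops
    activation_type=QuantType.QInt8,
    weight_type=QuantType.QInt8,
)
```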
Hi, I've checked: QLinearAdd, QGemm, and QLinearLeakyRelu are onnxruntime contrib_ops, which means they are not part of the standard ONNX operator set. A model containing these operators can only be run by onnxruntime, which goes against onnx_tool's design.
That said, you are welcome to contribute these ops, but I'd ask that you group them under onnxruntime: create a new ort_node.py file or a separate node registry for ORT nodes, since these op names can conflict across different inference engines. A sketch of such a registry follows below.
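As a rough illustration of the idea, a domain-scoped registry keeps engine-specific ops from colliding with standard ones. This is a hypothetical sketch only; the names here (`NODE_REGISTRY`, `register_node`, `QLinearAddNode`) are illustrative and not onnx_tool's actual API.

```python
from collections import defaultdict

# Hypothetical registry keyed by operator domain: domain -> {op_type: class}.
NODE_REGISTRY = defaultdict(dict)

def register_node(op_type, domain="ai.onnx"):
    """Register a node class under a specific domain (illustrative helper)."""
    def wrapper(cls):
        NODE_REGISTRY[domain][op_type] = cls
        return cls
    return wrapper

# ORT contrib ops would live in their own module (e.g. ort_node.py) and
# register under "com.microsoft", so this "QLinearAdd" cannot collide with
# an op of the same name from another inference engine.
@register_node("QLinearAdd", domain="com.microsoft")
class QLinearAddNode:
    def profile(self, inputs):
        ...
```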
Thanks. These operators actually belong to the 'com.microsoft' domain, which does not exist in the ONNX spec. I will think about how to handle them.
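For what it's worth, one way to detect such operators up front is to scan the domains of a model's nodes with the onnx package (the model path below is a placeholder). Contrib ops show up under "com.microsoft", while standard ops use the empty/default domain:

```python
import onnx

# Collect the (domain, op_type) pairs the model uses; an empty domain
# string means the default "ai.onnx" domain.
model = onnx.load("resnet50_int8.onnx")  # placeholder path
used = {(node.domain or "ai.onnx", node.op_type) for node in model.graph.node}
for domain, op_type in sorted(used):
    print(f"{domain}: {op_type}")
```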
I find this project very useful and have integrated it into my daily work. I wonder if you have any plans to support quantized ONNX models (quantized by onnxruntime)? If not, I'd be happy to contribute.