onnx / onnx-mlir

Representation and Reference Lowering of ONNX Models in MLIR Compiler Infrastructure
Apache License 2.0

Fail to convert a simple network with QuantizeLinear/DequantizeLinear #246

Open zhangliliang opened 4 years ago

zhangliliang commented 4 years ago

I tried to convert a simple network containing QuantizeLinear/DequantizeLinear, but the conversion fails.

The network is very simple: it takes a single tensor as input. Its definition can be found in the PyTorch ONNX tests (https://github.com/pytorch/pytorch/blob/master/test/onnx/model_defs/op_test.py).

class FakeQuantNet(nn.Module):
    def __init__(self):
        super(FakeQuantNet, self).__init__()
        self.fake_quant = torch.quantization.FakeQuantize()
        self.fake_quant.disable_observer()

    def forward(self, x):
        output = self.fake_quant(x)
        return output
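For context, FakeQuantize is exported to ONNX as a QuantizeLinear node followed by a DequantizeLinear node: the input is rounded onto a uint8 grid and then mapped back to float. A small sketch of that round-trip in plain Python (the scale and zero_point values here are illustrative defaults, not values read from the exported model):

```python
def fake_quantize(x, scale=0.1, zero_point=128, qmin=0, qmax=255):
    """Round-trip x through a uint8 grid: QuantizeLinear then DequantizeLinear.

    scale/zero_point are illustrative, not taken from the model.
    """
    q = min(max(round(x / scale) + zero_point, qmin), qmax)  # QuantizeLinear
    return (q - zero_point) * scale                          # DequantizeLinear

print(fake_quantize(0.26))    # snaps to the nearest grid point, ~0.3
print(fake_quantize(100.0))   # clamps at qmax, ~12.7
```

This is why the exported graph carries uint8 tensors even though the model's inputs and outputs are float.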

The ONNX model can be generated via

    torch.onnx.export(toC(FakeQuantNet()), toC(x), "fake_quant_net.onnx",
                      export_params=True, opset_version=10,
                      input_names=['input'], output_names=['output'])

and its visualization in Netron is shown in the attached screenshot.

The ONNX file is attached as fake_quant_net.zip.

The command for converting is

    ./onnx-mlir --EmitONNXBasic /home/zhangll/liliang-learning-home/pytorch/test/onnx/fake_quant_net.onnx

and the error message is

Failed to import ONNX TensorProto due to unsupported data types.
UNREACHABLE executed at /build/onnx-mlir/src/Builder/FrontendDialectHelper.cpp:188!
Aborted (core dumped)
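For anyone hitting the same abort: the importer fails because QuantizeLinear introduces uint8 (and possibly int8) tensors, element types the frontend did not handle at the time. A dependency-free way to decode the `TensorProto.data_type` codes you would see when inspecting such a model (a sketch using the standard ONNX enum values; verify against your onnx version):

```python
# Standard ONNX TensorProto data-type codes (subset, from onnx.proto).
ONNX_DTYPES = {
    1: "FLOAT", 2: "UINT8", 3: "INT8", 4: "UINT16", 5: "INT16",
    6: "INT32", 7: "INT64", 8: "STRING", 9: "BOOL", 10: "FLOAT16",
    11: "DOUBLE", 12: "UINT32", 13: "UINT64",
}

def dtype_name(code):
    """Map a TensorProto.data_type code to its ONNX name."""
    return ONNX_DTYPES.get(code, "UNKNOWN({})".format(code))

# QuantizeLinear's default output element type and zero_point are uint8
# (code 2) -- the kind of tensor the importer rejected here.
print(dtype_name(2))   # UINT8
```

If the `onnx` Python package is installed, the same names are available via `onnx.TensorProto.DataType.Name(code)`.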

Could someone suggest how to solve this?

tjingrant commented 4 years ago

@zhangliliang thanks for checking us out. We do not support any quantization-related facilities yet, although this is something we hope to cover eventually. We are also open to collaboration if you are interested!

zhangliliang commented 4 years ago

@tjingrant Thanks for your reply.

Could you share the roadmap for onnx-mlir?

tjingrant commented 4 years ago

@zhangliliang the objective of this project is to provide a production-ready and research-friendly infrastructure for anyone interested in building their own deep-learning-oriented compiler software stack.

The immediate implication is that we are trying to provide compiler/low-level IR definitions for as many ONNX operators as possible. Meanwhile, we are also actively engineering and researching a few systematic, generic optimization methods:

muzafferkal commented 1 year ago

Hi, has there been any change in the support for the QuantizeLinear and DequantizeLinear operators? I'm generating a quantized ONNX model and would like to convert it to MLIR.