jinchenglee opened this issue 2 months ago
I think the likely source of the error is the use of a custom operation whose output is unshaped.
%455 = "onnx.Custom"(%454, %262, %261, %265, %266, %267, %264, %263) {domain_name = "com.microsoft", function_name = "QLinearAdd", onnx_node_name = "Gemm_260_Add_quant"} : (tensor<1x1000xui8>, tensor<f32>, tensor<ui8>, tensor<1000xui8>, tensor<f32>, tensor<ui8>, tensor<f32>, tensor<ui8>) -> tensor<*xui8>
onnx-mlir needs to know the output shape. In this specific case, since everything is static, I suspect that filling in the output shape would let compilation proceed further down the pipeline. We don't have support for QLinearAdd, as it is not an official ONNX op but a Microsoft extension, so the compiler would choke later when attempting to lower that operation.
The best path would be to propose that ONNX add this operation to the standard; then we could implement it as part of the spec.
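In the meantime, a minimal workaround sketch (not an official recipe, and untested with onnx-mlir): patch the model so the custom node's output carries an explicit static shape in `value_info`. The node name `Gemm_260_Add_quant` and the 1x1000 uint8 shape are taken from the IR snippet above; both would need adjusting for other models.

```python
# Sketch: annotate the QLinearAdd output with a static shape so shape
# inference has something to propagate instead of tensor<*xui8>.
import onnx
from onnx import TensorProto, helper

model = onnx.load("shufflenet-v2-12-int8.onnx")
graph = model.graph

# Locate the offending node by the onnx_node_name shown in the IR above.
node = next(n for n in graph.node if n.name == "Gemm_260_Add_quant")

# Declare its output as uint8 of shape [1, 1000], matching the shapes
# visible in the IR snippet (an assumption for any other model).
graph.value_info.append(
    helper.make_tensor_value_info(node.output[0], TensorProto.UINT8, [1, 1000])
)

onnx.save(model, "shufflenet-v2-12-int8-shaped.onnx")
```

Whether the importer actually picks up `value_info` for a custom op is not verified here; even if it does, lowering QLinearAdd would still fail later, as noted above.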
Thank you for the explanation. Since custom operators are officially supported in the ONNX spec, wouldn't it be good practice to add some level of support for them in onnx-mlir? Of course, it won't be able to generate any runnable code, since the actual function of the custom op is unknown. But if the output shape is already there (a static model), at least some of the passes should still work?
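For what it's worth, a quick illustrative check with the onnx Python package (file name taken from the command line below) of whether the model already records static shapes for its intermediate values:

```python
# Illustrative check: which intermediate values carry recorded shapes?
# Standard ONNX shape inference skips com.microsoft ops, so values
# downstream of the custom node may stay unshaped.
import onnx
from onnx import shape_inference

model = shape_inference.infer_shapes(onnx.load("shufflenet-v2-12-int8.onnx"))
for vi in model.graph.value_info:
    # dim_value is 0 when a dimension is symbolic (dim_param) or absent.
    dims = [d.dim_value for d in vi.type.tensor_type.shape.dim]
    print(vi.name, dims)
```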
I know quantized models aren't supported yet. I would like to confirm: is this symptom due to the Dequantize/Quantize operators?
The model I'm testing is ShuffleNet-v2-int8 from here.
Command line:
onnx-mlir --mlir-pass-statistics --mlir-print-ir-after-all --EmitLLVMIR ~/shufflenet-v2-12-int8.onnx
Error: