iree-org / iree

A retargetable MLIR-based machine learning compiler and runtime toolkit.
http://iree.dev/
Apache License 2.0

[runtime]: Numeric error due to QuantizeLinear/DequantizeLinear #18200

Open pdhirajkumarprasad opened 1 month ago

pdhirajkumarprasad commented 1 month ago

What happened?

For the given MLIR:

module {
  func.func @main(%arg0: !torch.vtensor<[1,3,513,513],f32>, %arg1: !torch.vtensor<[32,3,3,3],si8>, %arg2: !torch.vtensor<[32,1,3,3],si8>) -> !torch.vtensor<[1,32,257,257],f32> attributes {torch.onnx_meta.ir_version = 10 : si64, torch.onnx_meta.opset_version = 21 : si64, torch.onnx_meta.producer_name = "", torch.onnx_meta.producer_version = ""} {
    %none = torch.constant.none
    %0 = torch.operator "onnx.Constant"() {torch.onnx.value = dense<0> : tensor<si8>} : () -> !torch.vtensor<[],si8> 
    %1 = torch.operator "onnx.Constant"() {torch.onnx.value = dense<3.125000e-02> : tensor<f32>} : () -> !torch.vtensor<[],f32> 
    %2 = torch.operator "onnx.Constant"() {torch.onnx.value = dense<5.000000e-01> : tensor<f32>} : () -> !torch.vtensor<[],f32> 
    %3 = torch.operator "onnx.Constant"() {torch.onnx.value = dense<6.250000e-02> : tensor<f32>} : () -> !torch.vtensor<[],f32> 
    %4 = torch.operator "onnx.QuantizeLinear"(%arg0, %3, %0) : (!torch.vtensor<[1,3,513,513],f32>, !torch.vtensor<[],f32>, !torch.vtensor<[],si8>) -> !torch.vtensor<[1,3,513,513],si8> 
    %5 = torch.operator "onnx.DequantizeLinear"(%4, %3, %0) : (!torch.vtensor<[1,3,513,513],si8>, !torch.vtensor<[],f32>, !torch.vtensor<[],si8>) -> !torch.vtensor<[1,3,513,513],f32> 
    %6 = torch.operator "onnx.DequantizeLinear"(%arg1, %1, %0) : (!torch.vtensor<[32,3,3,3],si8>, !torch.vtensor<[],f32>, !torch.vtensor<[],si8>) -> !torch.vtensor<[32,3,3,3],f32> 
    %7 = torch.operator "onnx.Conv"(%5, %6) {torch.onnx.group = 1 : si64, torch.onnx.kernel_shape = [3 : si64, 3 : si64], torch.onnx.pads = [1 : si64, 1 : si64, 1 : si64, 1 : si64], torch.onnx.strides = [2 : si64, 2 : si64]} : (!torch.vtensor<[1,3,513,513],f32>, !torch.vtensor<[32,3,3,3],f32>) -> !torch.vtensor<[1,32,257,257],f32> 
    %8 = torch.operator "onnx.QuantizeLinear"(%7, %3, %0) : (!torch.vtensor<[1,32,257,257],f32>, !torch.vtensor<[],f32>, !torch.vtensor<[],si8>) -> !torch.vtensor<[1,32,257,257],si8> 
    %9 = torch.operator "onnx.DequantizeLinear"(%8, %3, %0) : (!torch.vtensor<[1,32,257,257],si8>, !torch.vtensor<[],f32>, !torch.vtensor<[],si8>) -> !torch.vtensor<[1,32,257,257],f32> 
    %10 = torch.operator "onnx.DequantizeLinear"(%arg2, %2, %0) : (!torch.vtensor<[32,1,3,3],si8>, !torch.vtensor<[],f32>, !torch.vtensor<[],si8>) -> !torch.vtensor<[32,1,3,3],f32> 
    %11 = torch.operator "onnx.Conv"(%9, %10) {torch.onnx.group = 32 : si64, torch.onnx.kernel_shape = [3 : si64, 3 : si64], torch.onnx.pads = [1 : si64, 1 : si64, 1 : si64, 1 : si64], torch.onnx.strides = [1 : si64, 1 : si64]} : (!torch.vtensor<[1,32,257,257],f32>, !torch.vtensor<[32,1,3,3],f32>) -> !torch.vtensor<[1,32,257,257],f32> 
    %12 = torch.operator "onnx.QuantizeLinear"(%11, %3, %0) : (!torch.vtensor<[1,32,257,257],f32>, !torch.vtensor<[],f32>, !torch.vtensor<[],si8>) -> !torch.vtensor<[1,32,257,257],si8> 
    %13 = torch.operator "onnx.DequantizeLinear"(%12, %3, %0) : (!torch.vtensor<[1,32,257,257],si8>, !torch.vtensor<[],f32>, !torch.vtensor<[],si8>) -> !torch.vtensor<[1,32,257,257],f32> 
    return %13 : !torch.vtensor<[1,32,257,257],f32>
  }
}
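For reference, the QuantizeLinear/DequantizeLinear pairs in this module should follow the ONNX operator semantics: quantize is `saturate(round(x / scale) + zero_point)` with round-half-to-even, and dequantize is `(q - zero_point) * scale`. A minimal NumPy sketch of that reference behavior (the sample value 2.26 is arbitrary, chosen only to exercise the module's scale of 0.0625 and zero point of 0):

```python
import numpy as np

def quantize_linear(x, scale, zero_point):
    # ONNX QuantizeLinear (int8): y = saturate(round(x / scale) + zero_point).
    # The spec requires round-half-to-even, which np.round implements.
    y = np.round(np.asarray(x, dtype=np.float32) / np.float32(scale)) + zero_point
    return np.clip(y, -128, 127).astype(np.int8)

def dequantize_linear(q, scale, zero_point):
    # ONNX DequantizeLinear: x = (q - zero_point) * scale
    return (q.astype(np.int32) - np.int32(zero_point)).astype(np.float32) * np.float32(scale)

# Round-trip with the module's scale (%3 = 0.0625) and zero point (%0 = 0):
q = quantize_linear(np.float32(2.26), 0.0625, 0)   # round(36.16) = 36
x = dequantize_linear(q, 0.0625, 0)                # 36 * 0.0625 = 2.25
```

Any backend implementing these ops should reproduce this round-trip bit-exactly, since every step is exact in float32 at this scale.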

Seeing a failure at runtime with the following mismatch:

[FAILED] result[0]: element at index 272 (2.3125) does not match the expected (2.25); expected that the view is equal to contents of a view of 1x32x257x257xf32

Attachments: golden_output.0.bin.txt, input.0.bin.txt, input.1.bin.txt, input.2.bin.txt
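Note that the mismatched values differ by exactly one quantization step: with the module's scale of 0.0625, the expected 2.25 is 36 × scale and the observed 2.3125 is 37 × scale. One possible explanation (an assumption, not confirmed from the logs) is a rounding-mode discrepancy at a tie point; the tie value 36.5 × scale below is hypothetical, for illustration only:

```python
import numpy as np

scale = np.float32(0.0625)                       # the %3 scale constant above
expected, actual = np.float32(2.25), np.float32(2.3125)

# Both values are exact multiples of the scale, one int8 level apart.
steps_expected = float(expected / scale)          # 36.0
steps_actual = float(actual / scale)              # 37.0

# A hypothetical pre-quantization value exactly at a tie, 36.5 * scale,
# lands on 36 under round-half-to-even (ONNX semantics) but on 37 under
# round-half-away-from-zero.
tie = np.float32(36.5) * scale
half_to_even = float(np.round(tie / scale))       # 36.0
half_away = float(np.floor(tie / scale + 0.5))    # 37.0
```

If this is the cause, values that fall exactly on a tie (representable exactly at this power-of-two scale) would be the only elements that mismatch, which matches a single-element failure in a 1x32x257x257 output.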

Steps to reproduce your issue


iree-compile model.torch_onnx.mlir --iree-hal-target-backends=llvm-cpu -o out.vmfb
iree-run-module --module=out.vmfb --device="local-task" --input="1x3x513x513xf32=@input.0.bin" --input="32x3x3x3xsi8=@input.1.bin" --input="32x1x3x3xsi8=@input.2.bin" --expected_output="1x32x257x257xf32=@golden_output.0.bin"

IREE Version:

IREE compiler version 20240819.990 @ aeda14995f16ed1302db616adf0c03acf80f27ee LLVM version 20.0.0git

What component(s) does this issue relate to?

Runtime


pdhirajkumarprasad commented 1 month ago

We have a similar issue for FCN_vaiq_int8, so once this is fixed that model needs to be retried. I am not able to upload the file due to its size.

ScottTodd commented 1 month ago

When I last checked, DequantizeLinear never worked. The unit tests for that op showed signs of a miscompile: https://github.com/iree-org/iree/issues/16666