iree-org / iree

A retargetable MLIR-based machine learning compiler and runtime toolkit.
http://iree.dev/
Apache License 2.0

[CI][XFAIL] test_shape_end_negative_1 - onnx.Shape compilation failure #18181

Open PhaneeshB opened 1 month ago

PhaneeshB commented 1 month ago

What happened?

During the torch-mlir bump in IREE (#18169), compilation of the onnx.Shape operator fails on CPU, AMD GPU (rocm + vulkan), and Nvidia GPU (cuda + vulkan).

CPU Error Log:

_ IREE compile and run: test_shape_end_negative_1::model.mlir::model.mlir::cpu_llvm_sync _
[gw1] linux -- Python 3.11.9 /home/runner/work/iree/iree/venv/bin/python
Error invoking iree-compile
Error code: 1
Stderr diagnostics:
model.mlir:4:10: error: failed to materialize conversion for result #0 of operation 'torch.operator' that remained live after conversion
    %0 = torch.operator "onnx.Shape"(%arg0) {torch.onnx.end = -1 : si64} : (!torch.vtensor<[3,4,5],f32>) -> !torch.vtensor<[2],si64> 
         ^
model.mlir:4:10: note: see current operation: %2 = "torch.operator"(%arg0) <{name = "onnx.Shape"}> {torch.onnx.end = -1 : si64} : (!torch.vtensor<[3,4,5],f32>) -> !torch.vtensor<[2],si64>
model.mlir:5:5: note: see existing live user here: func.return %1 : !torch.vtensor<[2],si64>
    return %0 : !torch.vtensor<[2],si64>
    ^

Invoked with:
  cd /home/runner/work/iree/iree/SHARK-TestSuite/iree_tests/onnx/node/generated/test_shape_end_negative_1 && iree-compile model.mlir --iree-hal-target-backends=llvm-cpu --iree-input-demote-f64-to-f32=false -o model_cpu_llvm_sync_.vmfb
  module {
  func.func @test_shape_end_negative_1(%arg0: !torch.vtensor<[3,4,5],f32>) -> !torch.vtensor<[2],si64> attributes {torch.onnx_meta.ir_version = 10 : si64, torch.onnx_meta.opset_version = 21 : si64, torch.onnx_meta.producer_name = "backend-test", torch.onnx_meta.producer_version = ""} {
    %none = torch.constant.none
    %0 = torch.operator "onnx.Shape"(%arg0) {torch.onnx.end = -1 : si64} : (!torch.vtensor<[3,4,5],f32>) -> !torch.vtensor<[2],si64>
    return %0 : !torch.vtensor<[2],si64>
    }
}

Compile command:

 iree-compile model.mlir --iree-hal-target-backends=llvm-cpu --iree-input-demote-f64-to-f32=false -o model_cpu_llvm_sync_.vmfb


What component(s) does this issue relate to?

No response

Version information

No response

Additional context

No response

IanWood1 commented 1 month ago

~Looks like torch-mlir doesn't support onnx's start or end arguments for slicing the shape.~

Edit: I was looking at an outdated version

ScottTodd commented 1 month ago

This may be related to https://github.com/iree-org/iree/issues/16814

IanWood1 commented 1 month ago

> This may be related to #16814

Yeah, I think they are the same

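The ONNX documentation says a negative end counts dimensions from the back, so end=-1 truncates the last dim (for the [3,4,5] input here the expected result is [3,4], which matches the [2] result type). The logic in torch-mlir, however, treats end=-1 as the default, i.e. as if end had been omitted.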

https://github.com/llvm/torch-mlir/blob/026dfade6406035bca7481071e2263e849f83b09/lib/Conversion/TorchOnnxToTorch/DefaultDomainQtoZ.cpp#L1669-L1672
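
For reference, here is a minimal Python sketch of how I read the ONNX Shape start/end semantics (negative values count from the back, then both are clamped to [0, rank]). The function name `onnx_shape` and its signature are made up for illustration; this is not the torch-mlir or ONNX reference implementation.

```python
def onnx_shape(dims, start=0, end=None):
    """Illustrative model of onnx.Shape with optional start/end (hypothetical helper)."""
    rank = len(dims)
    # An omitted end means "up to and including the last axis".
    if end is None:
        end = rank
    # Negative values count dimensions from the back.
    if start < 0:
        start += rank
    if end < 0:
        end += rank
    # Clamp both bounds into [0, rank].
    start = min(max(start, 0), rank)
    end = min(max(end, 0), rank)
    return list(dims[start:end])


# The failing test case: end = -1 on a [3,4,5] input drops the last dim.
print(onnx_shape([3, 4, 5], end=-1))
# With end omitted, the full shape is returned.
print(onnx_shape([3, 4, 5]))
```

Under this reading, end=-1 and an omitted end give different results, so treating -1 as a "not set" sentinel loses the distinction.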