I suspect this bug just won't happen with the new linalg-on-tensors path. Since the issue isn't blocking any real user, it's not worth investigating/fixing code that is going to be deleted soon, but I'm leaving it to Mahesh, who has recently been looking at the shape handling for linalg on tensors, to make that judgment call. @MaheshRavishankar
I think it won't. But there is all the shape stuff, which is largely ignored by the new path. I'll try running it through the new path and see. Is this blocking (just for prioritization)?
The TF folks are lowering tf.IsFinite a different way (tf.Equal(tf.Subtract(x, x), 0)) because of our test failure, but I don't think they should have had to work around it, and this is a bug in IREE. tf.NotEqual(tf.Abs(x), inf) seems like a much more straightforward lowering. It's possible they had other reasons for preferring the other lowering, though. One option would be disabling the IREE test and saying we don't support IsFinite right now, but this doesn't really seem to have much to do with infinities, other than that I suspect they prevent us from hitting some constant-folding path.
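For reference, here is a quick numpy sketch of the two candidate lowerings (numpy is only standing in for the corresponding TF ops here; this illustrates the semantics, not how the bridge or the test is actually written):

import numpy as np

x = np.array([1.5, -2.0, np.inf, -np.inf, np.nan], dtype=np.float32)

reference = np.isfinite(x)                    # [ True  True False False False]

# tf.Equal(tf.Subtract(x, x), 0): x - x is 0 for finite x and NaN for inf/NaN
# inputs, and NaN == 0 is False, so this matches the reference everywhere.
workaround = np.equal(np.subtract(x, x), 0)   # [ True  True False False False]

# tf.NotEqual(tf.Abs(x), inf): rejects +/-inf, but NaN != inf is True under
# IEEE 754, so NaN would need separate handling with this form.
direct = np.not_equal(np.abs(x), np.inf)      # [ True  True False False  True]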
Ok, even if I fix it on the codegen side, VMLA will still have the issue.
it's not clear (to me) what the issue is here - tf.NotEqual and tf.Abs should both be supported - are they not? if they're not, it should just be a matter of adding ops?
I think something is off somewhere in a shape-related pass. It's generating invalid IR.
We could disable our IsFinite test for dynamic dims for now so they don't have to work around it. I hope we can do a big pass through all our dynamic-dimension test cases when the new path lands; there's a lot of stuff that only fails with those.
what's the actual failure? need a print-ir-after-all dump - I doubt there's anything shape-related here; more likely an unconverted op is keeping an unfolded dim around and that's causing the error.
This is the dump for the codegen backends (legacy): https://gist.github.com/f3f8bb3f6a904a0d1b35c6de101b372c
This is the dump for VMLA: https://gist.github.com/2ccbef8382496dff2795f8eef2fb61af
The linalg-on-tensors path gets farther along, but I think it still hits some shape-related error: https://gist.github.com/da04d14a677117b1beb5e14796a65131
the issue is that an mhlo.convert op is showing up when it shouldn't be: https://gist.github.com/MaheshRavishankar/2ccbef8382496dff2795f8eef2fb61af#file-repro-vmla-after_all-mlir-L7363
those outer functions should not have any mhlo ops - or if they do, they need to be tied to shapes there
it's fine here:
flow.executable @is_finite__nan_and_inf_ex_dispatch_3 attributes {sym_visibility = "private"} {
  flow.dispatch.entry @is_finite__nan_and_inf_ex_dispatch_3 attributes {signature = (index, index, tensor<?x?xi1>, index, index, tensor<?x?xi1>, index, index) -> tensor<?x?xi1>}
  module {
    func @is_finite__nan_and_inf_ex_dispatch_3(%arg0: index, %arg1: index, %arg2: tensor<?x?xi1>, %arg3: index, %arg4: index, %arg5: tensor<?x?xi1>, %arg6: index, %arg7: index) -> tensor<?x?xi1> {
      %0 = shapex.make_ranked_shape %arg0, %arg1 : (index, index) -> !shapex.ranked_shape<[?,?]>
      %1 = shapex.make_ranked_shape %arg3, %arg4 : (index, index) -> !shapex.ranked_shape<[?,?]>
      %2 = shapex.make_ranked_shape %arg6, %arg7 : (index, index) -> !shapex.ranked_shape<[?,?]>
      %3 = shapex.tie_shape %arg5, %2 : tensor<?x?xi1>, !shapex.ranked_shape<[?,?]>
      %4 = shapex.tie_shape %arg2, %1 : tensor<?x?xi1>, !shapex.ranked_shape<[?,?]>
      %5 = "shapex.ranked_broadcast_in_dim"(%3, %0) {broadcast_dimensions = dense<[0, 1]> : tensor<2xi64>} : (tensor<?x?xi1>, !shapex.ranked_shape<[?,?]>) -> tensor<?x?xi1>
      %6 = "shapex.ranked_broadcast_in_dim"(%4, %0) {broadcast_dimensions = dense<[0, 1]> : tensor<2xi64>} : (tensor<?x?xi1>, !shapex.ranked_shape<[?,?]>) -> tensor<?x?xi1>
      %7 = mhlo.and %5, %6 : tensor<?x?xi1>
      return %7 : tensor<?x?xi1>
    }
  }
}
and then something in materialize interfaces is indirectly inserting that conversion:
hal.executable @is_finite__nan_and_inf_ex_dispatch_3 attributes {sym_visibility = "private"} {
  hal.interface @legacy_io attributes {push_constants = 6 : i32} {
    hal.interface.binding @arg2, set=0, binding=0, type="StorageBuffer", access="Read"
    hal.interface.binding @arg5, set=0, binding=1, type="StorageBuffer", access="Read"
    hal.interface.binding @ret0, set=0, binding=2, type="StorageBuffer", access="Write|Discard"
  }
  hal.executable.target @vmla, filter="vmla" {
    hal.executable.entry_point @is_finite__nan_and_inf_ex_dispatch_3 attributes {interface = @legacy_io, ordinal = 0 : i32, signature = (index, index, tensor<?x?xi1>, index, index, tensor<?x?xi1>, index, index) -> tensor<?x?xi1>}
    module {
      func @is_finite__nan_and_inf_ex_dispatch_3() {
        %c0 = constant 0 : index
        %0 = hal.interface.load.constant offset = 0 : index
        %1 = hal.interface.load.constant offset = 1 : index
        %2 = hal.interface.load.constant offset = 2 : index
        %3 = hal.interface.load.constant offset = 3 : index
        %4 = hal.interface.load.tensor @legacy_io::@arg2, offset = %c0 : tensor<?x?xi8>
        %5 = "mhlo.convert"(%4) : (tensor<?x?xi8>) -> tensor<?x?xi1>
        %6 = hal.interface.load.constant offset = 4 : index
        %7 = hal.interface.load.constant offset = 5 : index
        %8 = hal.interface.load.tensor @legacy_io::@arg5, offset = %c0 : tensor<?x?xi8>
        %9 = "mhlo.convert"(%8) : (tensor<?x?xi8>) -> tensor<?x?xi1>
        %10 = call @is_finite__nan_and_inf_ex_dispatch_3_impl(%0, %1, %5, %2, %3, %9, %6, %7) : (index, index, tensor<?x?xi1>, index, index, tensor<?x?xi1>, index, index) -> tensor<?x?xi1>
        %11 = "mhlo.convert"(%10) : (tensor<?x?xi1>) -> tensor<?x?xi8>
        hal.interface.store.tensor %11, @legacy_io::@ret0, offset = %c0 : tensor<?x?xi8>
        return
      }
      func private @is_finite__nan_and_inf_ex_dispatch_3_impl(%arg0: index, %arg1: index, %arg2: tensor<?x?xi1>, %arg3: index, %arg4: index, %arg5: tensor<?x?xi1>, %arg6: index, %arg7: index) -> tensor<?x?xi1> {
        %0 = shapex.make_ranked_shape %arg0, %arg1 : (index, index) -> !shapex.ranked_shape<[?,?]>
        %1 = shapex.make_ranked_shape %arg3, %arg4 : (index, index) -> !shapex.ranked_shape<[?,?]>
        %2 = shapex.make_ranked_shape %arg6, %arg7 : (index, index) -> !shapex.ranked_shape<[?,?]>
        %3 = shapex.tie_shape %arg5, %2 : tensor<?x?xi1>, !shapex.ranked_shape<[?,?]>
        %4 = shapex.tie_shape %arg2, %1 : tensor<?x?xi1>, !shapex.ranked_shape<[?,?]>
        %5 = "shapex.ranked_broadcast_in_dim"(%3, %0) {broadcast_dimensions = dense<[0, 1]> : tensor<2xi64>} : (tensor<?x?xi1>, !shapex.ranked_shape<[?,?]>) -> tensor<?x?xi1>
        %6 = "shapex.ranked_broadcast_in_dim"(%4, %0) {broadcast_dimensions = dense<[0, 1]> : tensor<2xi64>} : (tensor<?x?xi1>, !shapex.ranked_shape<[?,?]>) -> tensor<?x?xi1>
        %7 = mhlo.and %5, %6 : tensor<?x?xi1>
        return %7 : tensor<?x?xi1>
      }
      hal.interface @legacy_io attributes {push_constants = 6 : i32, sym_visibility = "private"} {
        hal.interface.binding @arg2, set=0, binding=0, type="StorageBuffer", access="Read"
        hal.interface.binding @arg5, set=0, binding=1, type="StorageBuffer", access="Read"
        hal.interface.binding @ret0, set=0, binding=2, type="StorageBuffer", access="Write|Discard"
      }
    }
  }
}
it's this code adding them: https://github.com/google/iree/blob/main/iree/compiler/Dialect/HAL/Utils/TypeUtils.cpp#L53-L62
which only comes into play on the legacy (non-linalg) path: https://github.com/google/iree/blob/main/iree/compiler/Dialect/HAL/Transforms/MaterializeInterfaces.cpp#L343-L358
how this ever works I don't know - it shouldn't - but the root cause is that there's an i1 input and a shady conversion gets inserted without preserving the requirements of the IR (that shapes are known there)
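For context on what that convert is doing: the dump above loads the boolean tensor from the storage buffer as tensor<?x?xi8>, and the inserted mhlo.convert turns it back into tensor<?x?xi1> for the impl function. A rough numpy analogy of that i1 <-> i8 round trip, purely illustrative - the problem isn't the conversion itself but that its result has no shape tied to it at that point:

import numpy as np

# Boolean elements are stored one byte each in the buffer (the i8 side)...
logical = np.array([[True, False], [False, True]])
stored = logical.astype(np.int8)     # roughly what hal.interface.load.tensor yields (i8)
restored = stored.astype(np.bool_)   # roughly what the inserted "mhlo.convert" recovers (i1)
assert np.array_equal(logical, restored)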
if this was a regression then maybe something changed with pass ordering, shape inference, or canonicalizations/folders around casting
this entire concept (inserting that wrapper function) does not exist in the linalg-on-tensors world, so I'd say enable the test for linalg-on-tensors and/or keep it disabled until that's used
(you could also make the test static - then there should be no issue - a test of whether a number is finite shouldn't also be our load-bearing test for generic dynamic shapes on converted types :)
We have tests for static and dynamic shapes. None of these tests should be nearly as load-bearing as they are, but fixing that requires more tests in IREE core to bear said load...
yep - just sounded like this was blocking upstream tf stuff because of this test - if you take the hlo output, check that into iree, and make the test not use dynamic stuff, you can unblock the tf changes
shapex is dead!
Describe the bug
IREE fails on our IsFinite integration test with dynamic shapes: a shape-related pass appears to produce invalid IR (see the reproducer below).
To Reproduce
(note that it's not just VMLA. This also happens on the other backends).
Here's the local reproducer: https://gist.github.com/GMNGeoffrey/822da580266b8f459d78add5a04dc6a7, although I'm not sure if that's actually the root cause or if dim operations are supposed to be all gone by that point.
Additional context
This comes from a failure in our isfinite integration test of an attempt at a reasonable lowering of tf.IsFinite to tf.NotEqual(tf.Abs(x), inf). It's possible that the bug is further upstream in the pipeline. This seems related to shape math. The input IR for that test is: