iree-org / iree

A retargetable MLIR-based machine learning compiler and runtime toolkit.
http://iree.dev/
Apache License 2.0

'util.initializer' op failed to inline into combined initializer #18386

Open jinchen62 opened 1 month ago

jinchen62 commented 1 month ago

What happened?

This is a follow-up error hit after getting past https://github.com/iree-org/iree/issues/18232

repro.mlir:107:12: error: 'util.initializer' op failed to inline into combined initializer
    %414 = affine.apply #map32()[%408, %409, %247]
           ^
repro.mlir:7:3: note: called from
  func.func @torch_jit() -> index {
  ^
repro.mlir:107:12: note: see current operation: 
"util.initializer"() <{function_type = () -> ()}> ({
  %0 = "arith.constant"() <{value = 0 : i8}> : () -> i8
  %1 = "util.global.load"() <{global = @__hoisted_i32_21}> : () -> i32
  %2 = "stream.tensor.sizeof"() <{affinity = #hal.device.affinity<@__device_0>, encoding = tensor<1xi32>}> : () -> index
  %3 = "stream.tensor.splat"(%0, %2) <{affinity = #hal.device.affinity<@__device_0>, result_encoding = tensor<1xi32>}> : (i8, index) -> !stream.resource<*>
  %4 = "stream.tensor.sizeof"() <{affinity = #hal.device.affinity<@__device_0>, encoding = tensor<i32>}> : () -> index
  %5 = "stream.async.transfer"(%3, %4, %4) <{result_affinity = #hal.device.affinity<@__device_0>, source_affinity = #hal.device.affinity<@__device_0>}> : (!stream.resource<*>, index, index) -> !stream.resource<staging>
  %6 = "stream.tensor.load"(%5, %4) <{operandSegmentSizes = array<i32: 1, 0, 1, 0>, source_encoding = tensor<i32>}> : (!stream.resource<staging>, index) -> i32
  %7 = "arith.index_cast"(%6) : (i32) -> index
  %8 = "arith.index_cast"(%1) : (i32) -> index
  %9 = "affine.apply"(%7, %8) <{map = affine_map<()[s0, s1] -> (s0 + s1 + 1)>}> : (index, index) -> index
  %10 = "arith.index_cast"(%9) : (index) -> i64
  "util.global.store"(%10) <{global = @__hoisted_i64}> : (i64) -> ()
  "util.return"() : () -> ()
}) {stream.affinity.default = #hal.device.affinity<@__device_0>} : () -> ()
    %414 = affine.apply #map32()[%408, %409, %247]
           ^

Steps to reproduce your issue

  1. Repro: https://gist.github.com/jinchen62/79d3f497ad8ef45c30b3f9803390685d
  2. iree-compile --iree-input-demote-i64-to-i32 --iree-hal-target-backends=llvm-cpu repro.mlir -o test.vmfb

What component(s) does this issue relate to?

No response

Version information

Built locally from top of main (ToM)

Additional context

No response

benvanik commented 1 month ago

This isn't the bug here, but I'd look further back in the IR to see where this is coming from. It splats a 0 into a buffer, loads that 0 back, and stores it in a global, which is then likely loaded somewhere in a dispatch as if it could be anything but 0. That's really dumb, and it probably indicates a missing folder earlier on.
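To make the "missing folder" idea concrete, here is a minimal Python sketch of the kind of rewrite being described: a load that reads straight back from a splat can be replaced by a constant. This is a toy model only, not IREE's API. The `Splat` and `Load` classes and the `fold_load_of_splat` and `hoisted_value` helpers are hypothetical names made up for illustration, and the sketch ignores the intermediate `stream.async.transfer` and assumes an aligned little-endian load.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Splat:
    """Toy stand-in for a splat: fill a buffer with a repeated pattern."""
    pattern: int        # unsigned pattern value
    pattern_bytes: int  # width of the pattern in bytes (1 for i8)


@dataclass(frozen=True)
class Load:
    """Toy stand-in for loading one integer element from a buffer."""
    source: object      # producer of the buffer (a Splat in the foldable case)
    element_bytes: int  # width of the loaded element in bytes (4 for i32)


def fold_load_of_splat(load: Load) -> Optional[int]:
    """Return the constant a load-of-splat produces, or None if not foldable."""
    src = load.source
    if not isinstance(src, Splat):
        return None  # unknown producer: nothing to fold
    if load.element_bytes % src.pattern_bytes != 0:
        return None  # element is not a whole number of pattern repeats
    raw = src.pattern.to_bytes(src.pattern_bytes, "little")
    repeated = raw * (load.element_bytes // src.pattern_bytes)
    return int.from_bytes(repeated, "little")


def hoisted_value(s0: int, s1: int) -> int:
    """The affine map from the dump above: s0 + s1 + 1."""
    return s0 + s1 + 1


# Mirrors the dump: splat an i8 zero, then load it back as an i32.
folded = fold_load_of_splat(Load(Splat(0, 1), 4))
```

With the load folded to 0, the value stored to the global depends only on the other loaded global plus one, so the splat, transfer, and load in the initializer would no longer be needed.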

benvanik commented 1 month ago

(consteval may handle this case, but it won't catch other instances in the program if this is a new construct we haven't optimized yet, and running full consteval just to do this slows down compilation)