iree-org / iree

A retargetable MLIR-based machine learning compiler and runtime toolkit.
http://iree.dev/
Apache License 2.0

[VULKAN] Fail to compile `singlepose-lightning-tflite-int8` TFlite of Movenet to `.vmfb` file. #17902

Open RechieKho opened 1 month ago

RechieKho commented 1 month ago

What happened?

I'm trying to compile the singlepose-lightning-tflite-int8 TensorFlow Lite model. These are the commands I ran:

iree-import-tflite model.tflite -o model.mlir # Success.
iree-compile --iree-input-type=tosa --iree-hal-target-backends=vulkan-spirv model.mlir -o model.vmfb # Failure.

The import step (running iree-import-tflite) produces no output. The errors appear at the compile step. The full output of that command is in error.log; unfortunately it is too large to include inline.

Steps to reproduce your issue

As explained above.

What component(s) does this issue relate to?

No response

Version information

Here is the print of iree-compile --version:

IREE (https://iree.dev):
  IREE compiler version 20240621.931 @ ac418d1f45d562bf9e9675bf69606c7d718e2432
  LLVM version 19.0.0git
  Optimized build

Additional context

I have compiled esrgan successfully with this version. I'm currently using Arch Linux (btw), and I also tried compiling this model on another machine (a MacBook); it fails there too.

kuhar commented 1 month ago

<unknown>:0: error: loc(callsite("center_net_mobile_net_v2fpn_feature_extractor/model_1/model/expanded_conv_project_BN/FusedBatchNormV3;center_net_mobile_net_v2fpn_feature_extractor/model_1/model/expanded_conv_project/Conv2D1" at "main")): 'spirv.VectorShuffle' op result #0 must be vector of bool or 8/16/32/64-bit integer or 16/32/64-bit float values of length 2/3/4/8/16, but got 'i32'
<unknown>:0: note: loc("main"): called from
        %699 = "spirv.VectorShuffle"(%684, %684) <{components = [0 : i32]}> : (vector<4xi32>, vector<4xi32>) -> i32
        %700 = "spirv.BitwiseAnd"(%699, %3) : (i32, i32) -> i32
        %701 = "spirv.VectorShuffle"(%684, %684) <{components = [1 : i32]}> : (vector<4xi32>, vector<4xi32>) -> i32
        %702 = "spirv.BitwiseAnd"(%701, %3) : (i32, i32) -> i32
        %703 = "spirv.ShiftLeftLogical"(%702, %2) : (i32, i32) -> i32
        %704 = "spirv.BitwiseOr"(%700, %703) : (i32, i32) -> i32
        %705 = "spirv.VectorShuffle"(%684, %684) <{components = [2 : i32]}> : (vector<4xi32>, vector<4xi32>) -> i32
        %706 = "spirv.BitwiseAnd"(%705, %3) : (i32, i32) -> i32
        %707 = "spirv.ShiftLeftLogical"(%706, %1) : (i32, i32) -> i32
        %708 = "spirv.BitwiseOr"(%704, %707) : (i32, i32) -> i32
        %709 = "spirv.VectorShuffle"(%684, %684) <{components = [3 : i32]}> : (vector<4xi32>, vector<4xi32>) -> i32
        %710 = "spirv.BitwiseAnd"(%709, %3) : (i32, i32) -> i32
        %711 = "spirv.ShiftLeftLogical"(%710, %0) : (i32, i32) -> i32
        %712 = "spirv.BitwiseOr"(%708, %711) : (i32, i32) -> i32
        %713 = "spirv.VectorShuffle"(%683, %683) <{components = [0 : i32]}> : (vector<4xi32>, vector<4xi32>) -> i32
        %714 = "spirv.BitwiseAnd"(%713, %3) : (i32, i32) -> i32
        %715 = "spirv.VectorShuffle"(%683, %683) <{components = [1 : i32]}> : (vector<4xi32>, vector<4xi32>) -> i32
        %716 = "spirv.BitwiseAnd"(%715, %3) : (i32, i32) -> i32
        %717 = "spirv.ShiftLeftLogical"(%716, %2) : (i32, i32) -> i32
        %718 = "spirv.BitwiseOr"(%714, %717) : (i32, i32) -> i32
        %719 = "spirv.VectorShuffle"(%683, %683) <{components = [2 : i32]}> : (vector<4xi32>, vector<4xi32>) -> i32
        %720 = "spirv.BitwiseAnd"(%719, %3) : (i32, i32) -> i32
        %721 = "spirv.ShiftLeftLogical"(%720, %1) : (i32, i32) -> i32
        %722 = "spirv.BitwiseOr"(%718, %721) : (i32, i32) -> i32
        %723 = "spirv.VectorShuffle"(%683, %683) <{components = [3 : i32]}> : (vector<4xi32>, vector<4xi32>) -> i32

Not sure what emits those. @RechieKho could you upload the log output with --mlir-print-ir-after-all?
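
As a reading aid for the IR above: each group of ops appears to take the four lanes of a vector<4xi32>, mask each lane with %3, shift lanes 1 to 3 left by %2, %1, and %0, and OR the pieces into a single i32. A minimal Python sketch of that arithmetic follows. The log does not show the constant values, so the mask 0xFF and the shifts 8, 16, and 24 are assumptions (they would fit packing four 8-bit values into one word), and the function name is hypothetical.

```python
# Sketch of the mask / shift / OR pattern seen in the log.
# ASSUMPTION: %3 = 0xFF, %2 = 8, %1 = 16, %0 = 24. The log does not show them.
MASK = 0xFF
SHIFTS = (0, 8, 16, 24)  # lane 0 is not shifted, matching the IR


def pack_lanes(lanes):
    """Pack four integer lanes into one 32-bit word, lowest lane first."""
    if len(lanes) != 4:
        raise ValueError("expected exactly 4 lanes")
    word = 0
    for lane, shift in zip(lanes, SHIFTS):
        # Keep only the low bits of the lane, then move them into position.
        word |= (lane & MASK) << shift
    return word & 0xFFFFFFFF
```

Under these assumptions, each lane extraction yields a single element. That is where the scalar i32 result of spirv.VectorShuffle shows up in the log.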

RechieKho commented 1 month ago

The dump is very large (about 1.4 GB), so as you requested I searched it for the shuffle keyword. Here is the result: shuffle_filter.log

I also managed to upload the full dump to Google Drive; here is the dump.
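
For anyone who needs to do the same kind of keyword filtering on a dump too large to open in an editor, a minimal Python sketch is below. It streams the file line by line rather than loading it whole. The thread does not say how shuffle_filter.log was actually produced, so this is only one possible approach, and the function name and file paths are placeholders.

```python
def filter_lines(src_path, dst_path, keyword="shuffle"):
    """Copy every line containing `keyword` from src_path to dst_path.

    Returns the number of matching lines. The file is read one line at a
    time, so memory use stays small even for a multi-gigabyte dump.
    """
    matches = 0
    with open(src_path, "r", encoding="utf-8", errors="replace") as src, \
         open(dst_path, "w", encoding="utf-8") as dst:
        for line in src:
            if keyword in line:
                dst.write(line)
                matches += 1
    return matches
```

A call such as filter_lines("dump.log", "shuffle_filter.log") would write only the matching lines; "dump.log" here is a hypothetical name for the captured compiler output.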

kuhar commented 1 month ago

I think this is the minimal repro:

func.func @shuffle(%v0 : vector<4xi32>, %v1: vector<4xi32>) -> vector<1xi32> {
  %shuffle = vector.shuffle %v0, %v1 [0] : vector<4xi32>, vector<4xi32>
  return %shuffle : vector<1xi32>
}

mlir-opt shuffle.mlir --convert-vector-to-spirv --debug-only=dialect-conversion
shuffle.mlir:2:14: error: 'spirv.VectorShuffle' op result #0 must be vector of bool or 8/16/32/64-bit integer or 16/32/64-bit float values of length 2/3/4/8/16, but got 'i32'
  %shuffle = vector.shuffle %v0, %v1 [0] : vector<4xi32>, vector<4xi32>
             ^
shuffle.mlir:2:14: note: see current operation: %0 = "spirv.VectorShuffle"(%arg0, %arg1) <{components = [0 : i32]}> : (vector<4xi32>, vector<4xi32>) -> i32
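
To make the failure easier to follow, here is a small Python model of what the repro exercises. It is a sketch only: vector.shuffle selects elements from the concatenation of its two operands, the legal result lengths are taken from the error message above, and the scalar conversion of a one-element vector is inferred from the output shown (vector<1xi32> became i32). All function names are hypothetical, and this does not describe how the linked fix works.

```python
# Result lengths accepted by spirv.VectorShuffle, per the error message.
LEGAL_LENGTHS = {2, 3, 4, 8, 16}


def shuffle(v0, v1, mask):
    """Model of vector.shuffle: each mask entry indexes into v0 followed by v1."""
    combined = list(v0) + list(v1)
    return [combined[i] for i in mask]


def spirv_result_type(num_elements, elem="i32"):
    """ASSUMPTION: mirrors the observed conversion, where a one-element
    vector type was turned into a plain scalar."""
    if num_elements == 1:
        return elem
    return f"vector<{num_elements}x{elem}>"


def is_legal_shuffle_result(num_elements):
    """True if a shuffle result of this length satisfies the op constraint."""
    return num_elements in LEGAL_LENGTHS
```

With a one-entry mask such as [0], the model gives a one-element result whose converted type is the scalar i32, and is_legal_shuffle_result(1) is False, which matches the verifier error in the repro.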

kuhar commented 1 month ago

LLVM PR with the fix: https://github.com/llvm/llvm-project/pull/98809

ScottTodd commented 1 month ago

Thanks for the quick fix! To keep this working, we can add https://www.kaggle.com/models/google/movenet/tfLite to https://github.com/nod-ai/SHARK-TestSuite/tree/main/iree_tests. We don't have an example coming from .tflite there yet, so this could be the first. I'll add that to my queue.

RechieKho commented 1 month ago

I tried again with the newest compiler. The vector shuffle error is gone, but an error involving vector.broadcast (#17976) appeared instead. I guess the vector.broadcast bug affects all the different targets. I'll leave this issue open until the test is completed.