Open ziereis opened 1 week ago
@hanhanW seems like something with the vectorizer potentially?
Cc @pashu123 as well. Seems like a tile size issue
@ziereis Could you replace the line at https://github.com/iree-org/iree/blob/f42b90d23c332bee6dedd1c8f44e07b9b1a52f74/compiler/src/iree/compiler/Codegen/LLVMCPU/Passes.cpp#L408 with
funcPassManager.addPass(createLLVMCPUTileRootAndFuseInputOperands(i));
and try again?
@pashu123 I tested it with this example and a couple of other ones that failed, and they all compile with this fix.
@ziereis For context, this was introduced in https://github.com/iree-org/iree/pull/18114, but we only enabled it for the convExpert pipeline.
I can not reproduce the issue because the target CPU is not specified. Can you provide the log with --mlir-print-ir-after-all --mlir-disable-threading?
btw, I think the issue is not related to LLVMCPUTileRootAndFuseInputOperands? They are two reductions and they are formed into different dispatches. The issue is that we get large tile sizes in lowering_config.
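For illustration only, here is a rough sketch of how tile sizes translate into the byte size of a vector. It is plain arithmetic, not IREE's actual check. The helper name, the shapes, and the 32-bit element type are all assumptions; the real tile sizes and element type from the reproducer do not appear in this thread.

```python
from math import prod


def vector_size_bytes(shape, element_bits):
    """Bytes needed to hold one vector of the given shape (hypothetical helper)."""
    # Round up so that sub-byte element types still occupy whole bytes.
    return (prod(shape) * element_bits + 7) // 8


# Hypothetical shapes with 32-bit elements, NOT taken from the reproducer.
large = vector_size_bytes([8, 256], 32)
small = vector_size_bytes([8, 32], 32)
print(large, small)
```

Under these assumptions, a larger inner tile size grows the vector proportionally, which is one way a large tile size in lowering_config could surface as a large-vector error.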
I was also surprised, but when I looked at the dispatches, the last one was a fused unpack + reduction.
I see. I think they are batch_matmul in generic op form, so data-tiling kicks in. I don't have cpu_features because the CPU target is not specified, so those encodings are dropped. That is why I'm not able to reproduce it. It would be easier if @ziereis could provide the IR dumps.
Sorry for not providing the flags. Here is the full command:
./build/tools/iree-compile --iree-hal-target-device=llvm-cpu --iree-llvmcpu-target-cpu=znver4 reproducer.mlir -o out.vmfb
The IR dump is also attached.
What happened?
Compilation to llvm-cpu fails with error: One or more operations with large vector sizes (8192 bytes) were found
Input IR:
This fails to compile. Changing the second dimension of the tensors (256 in this case) gets it to compile; for example, 32 works.
example error:
Steps to reproduce your issue
iree-compile --iree-hal-target-device=llvm-cpu input.mlir
What component(s) does this issue relate to?
Compiler
Version information
9c85e30
Additional context
No response