Open. JerryShih opened this issue 2 days ago.
Couple of things here.
1) I would be very careful about using the --iree-llvmcpu-number-of-threads=1 flag.
See https://github.com/iree-org/iree/blob/0c2c627747586ed39ce7b1f6bfc9d8b83c4a4e69/compiler/src/iree/compiler/Codegen/LLVMCPU/KernelDispatch.cpp#L44
Without these flags, if I compile using
iree-compile --iree-hal-target-backends=llvm-cpu --iree-llvmcpu-enable-ukernels=all
it compiles fine, and looking at the generated code, it does what you expect. But I also get this warning, which is relevant:
This can be done in two ways:
1. With command-line flags:
--iree-llvmcpu-target-cpu=...
--iree-llvmcpu-target-cpu-features=...
2. Within the IR:
#hal.executable.target< ... , cpu="...", cpu_features="...">
In the rest of this message, these fields are referred to as just `cpu` and `cpu_features`.
Examples:
cpu=generic
Target a generic CPU of the target architecture. The generated code will have poor performance, but will run on any CPU.
cpu=host
Target the host CPU. The generated code will have optimal performance on the host CPU but will crash on other CPUs not supporting the same CPU features.
cpu="name"
Target a specific CPU. This is mostly used on x86. The accepted values are the same as in Clang command lines.
List of accepted x86 CPUs: nocona, core2, penryn, bonnell, atom, silvermont, slm, goldmont, goldmont-plus, tremont, nehalem, corei7, westmere, sandybridge, corei7-avx, ivybridge, core-avx-i, haswell, core-avx2, broadwell, skylake, skylake-avx512, skx, cascadelake, cooperlake, cannonlake, icelake-client, rocketlake, icelake-server, tigerlake, sapphirerapids, alderlake, raptorlake, meteorlake, arrowlake, arrowlake-s, lunarlake, gracemont, pantherlake, sierraforest, grandridge, graniterapids, graniterapids-d, emeraldrapids, clearwaterforest, knl, knm, k8, athlon64, athlon-fx, opteron, k8-sse3, athlon64-sse3, opteron-sse3, amdfam10, barcelona, btver1, btver2, bdver1, bdver2, bdver3, bdver4, znver1, znver2, znver3, znver4, znver5, x86-64, x86-64-v2, x86-64-v3, x86-64-v4
cpu_features="+feature1,..."
Target a CPU supporting the comma-separated of (+-prefixed) features. The accepted values are the same as in Clang command lines.
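To make option 1 from the warning concrete, here is a minimal Python sketch that assembles an iree-compile command line with an explicit target CPU. The helper name `iree_compile_args` is hypothetical and not part of IREE. The input and output file names are assumptions. The flags are the ones quoted in this thread, plus `-o` for the output file.

```python
import shlex


def iree_compile_args(input_mlir, output_vmfb, cpu=None, cpu_features=None):
    """Build an iree-compile argument list (hypothetical helper, not an IREE API)."""
    args = [
        "iree-compile",
        "--iree-hal-target-backends=llvm-cpu",
        "--iree-llvmcpu-enable-ukernels=all",
    ]
    # Only add the target flags that were requested; leaving both unset
    # corresponds to the invocation that produced the warning above.
    if cpu is not None:
        args.append(f"--iree-llvmcpu-target-cpu={cpu}")
    if cpu_features is not None:
        args.append(f"--iree-llvmcpu-target-cpu-features={cpu_features}")
    args += [input_mlir, "-o", output_vmfb]
    return args


if __name__ == "__main__":
    # File names are assumptions made for illustration only.
    cmd = iree_compile_args("simple_tinyllama.mlir", "simple_tinyllama.vmfb", cpu="host")
    print(shlex.join(cmd))
```

As the warning says, cpu=host is only appropriate when the compiled module will run on the machine that compiled it. For anything else, pass a named CPU or an explicit feature list.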
What happened?
With --iree-llvmcpu-enable-ukernels=all or --iree-llvmcpu-enable-ukernels=unpack, IREE reports the following message for the tinyllama model in the LLVMCPUCheckIRBeforeLLVMConversionPass pass:
The model uses dynamic tensors as the bmm's input.
Steps to reproduce your issue
Here is the simplified tinyllama model with dynamic shape: simple_tinyllama.mlir
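The report does not give the full command line, so the following is only a sketch of how the two failing configurations might be driven from Python. The `--iree-hal-target-backends=llvm-cpu` flag is assumed from the reply in this thread. The helper names and the output file names are hypothetical.

```python
import shutil
import subprocess

# The two ukernel settings reported to trigger the message.
UKERNEL_SETTINGS = ["all", "unpack"]


def repro_command(ukernels, input_mlir="simple_tinyllama.mlir"):
    """Return the assumed iree-compile invocation for one ukernel setting."""
    return [
        "iree-compile",
        "--iree-hal-target-backends=llvm-cpu",  # assumed; not stated in the report
        f"--iree-llvmcpu-enable-ukernels={ukernels}",
        input_mlir,
        "-o",
        f"simple_tinyllama_{ukernels}.vmfb",  # hypothetical output name
    ]


def run_repro():
    """Run each command if iree-compile is on PATH.

    Returns {setting: (returncode, stderr)}, or an empty dict when the
    compiler is not installed.
    """
    if shutil.which("iree-compile") is None:
        return {}
    results = {}
    for setting in UKERNEL_SETTINGS:
        proc = subprocess.run(repro_command(setting), capture_output=True, text=True)
        results[setting] = (proc.returncode, proc.stderr)
    return results
```

Capturing stderr per setting makes it easy to compare the diagnostics from the `all` and `unpack` runs side by side.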
What component(s) does this issue relate to?
Compiler
Version information
IREE: https://github.com/iree-org/iree/commit/3b751a4d2797d29422e08327b1a53933448a26fd
Additional context
No response