Open newling opened 1 week ago
Hi @newling,
yes, this instruction multiplies two integer vectors of type <2 x s32>, and it isn't supported natively.
It can be implemented by extracting the scalar elements and executing the MUL instruction twice.
That legalization is currently missing in our AIELegalizerInfo.cpp.
Thanks Konstantin.
The above contains files to reproduce.
To observe the error, run
./peano/install/bin/llc input_after.opt.ll -O2 --march=aie2 --function-sections --filetype=obj
where input_after.opt.ll is a file in the attached zip file (about 300 lines). The error is:
LLVM ERROR: unable to legalize instruction: %512:_(<2 x s32>) = G_MUL %378:_, %390:_ (in function: core_0_2)
PLEASE submit a bug report to https://github.com/llvm/llvm-project/issues/ and include the crash backtrace.
Stack dump:
0. Program arguments: ../../../peano_github/install/bin/llc results_dir_tmp/module_matmul_8x32_16xi32__dispatch_0_amdaie_xclbin_fb/input.opt.ll -O2 --march=aie2 --function-sections --filetype=obj
The file input_before.opt.ll is a very similar file, added as a reference, for an example which compiles.
Background: these files are from an int32 transposed matmul. The difference between 'before' and 'after' is a minor change in how loops are unrolled.
@konstantinschwarz I'm reopening, if that's ok, to track support for this. Please close if you have other ways of tracking such things.
Note that input_after.opt.ll in the reproducer zip file is created from input_after.ll as:
install/bin/opt -O2 --inline-threshold=10 -S input_after.ll --disable-builtin=memset -o input_after.opt.ll
If the optimization level is lowered to 1, i.e. if input_after.opt.ll is created as
install/bin/opt -O1 --inline-threshold=10 -S input_after.ll --disable-builtin=memset -o input_after.opt.ll
Then the subsequent call to
install/bin/llc input_after.opt.ll -O2 --march=aie2 --function-sections --filetype=obj
generates the .o file without a problem. So this problem is specific to running opt with -On for n > 1.
Is there some kind of auto-vectorization happening in opt with -O2 which should not be, especially for i32 types?
@konstantinschwarz @MaheshRavishankar @jsetoain
I'd need to look at the pass pipeline that is run with opt, but we are explicitly disabling the autovectorizer from the clang driver here: https://github.com/Xilinx/llvm-aie/blob/aie-public/clang/lib/Driver/ToolChains/AIE.cpp#L96
I think @jsetoain was running into a similar issue with opt before.
--Notice that it's llc doing the autovectorization in this case. IREE also runs opt, but (correct me if I am wrong, @newling) that one did not vectorize the IR.-- Never mind, I misread.
The only issue with opt that I recall was that it wasn't propagating alignment assumptions, and that was user error.
It's opt that is doing the vectorization we don't want. See the screenshot below for why I say this:
i.e. opt with -O2 introduces the vector mul op (where I did export OPT=path/to/my/install/bin/opt).
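A text-based way to make the same check is to search the opt output for a mul whose type is a fixed-width vector. The sketch below is only illustrative: it writes a tiny hand-written IR snippet to a temporary file as a stand-in for input_after.opt.ll (the snippet and the variable names are assumptions, not taken from the reproducer) and counts the vector mul instructions with grep.

```shell
# Hypothetical stand-in for the real opt output (input_after.opt.ll in the reproducer).
ir_file="$(mktemp)"
cat > "$ir_file" <<'EOF'
define <2 x i32> @vec(<2 x i32> %a, <2 x i32> %b) {
  %m = mul <2 x i32> %a, %b
  ret <2 x i32> %m
}
define i32 @scalar(i32 %a, i32 %b) {
  %m = mul nsw i32 %a, %b
  ret i32 %m
}
EOF

# Count mul instructions whose result type is a fixed-width integer vector,
# allowing for the optional nuw/nsw flags that may follow the opcode.
vector_muls="$(grep -cE '= mul( nuw| nsw)* <[0-9]+ x i[0-9]+>' "$ir_file")"
echo "vector mul instructions: $vector_muls"
rm -f "$ir_file"
```

Pointing the same grep at the real input_after.opt.ll produced with -O1 and with -O2 should show whether the vector mul only appears at the higher optimization level.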
If you are calling opt manually, you have to pass --vectorize-loops=false --vectorize-slp=false to disable the vectorizers. We can extend our legalization in the backend to scalarize illegal vector types/operations, but ultimately we probably want to have a better cost model for those auto-vectorizers.
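As a sketch, the opt invocation from earlier in this thread with the two suggested flags added might look like the following. The flags are the ones named above; the OPT default path and the file names are copied from the earlier commands, and the thread does not confirm that this resolves the reproducer. The command is only executed if the binary and the input file are present; otherwise it is just printed.

```shell
# OPT defaults to the relative install path used earlier in this thread (an assumption
# about your layout); override it with: export OPT=path/to/my/install/bin/opt
OPT="${OPT:-install/bin/opt}"

# Same arguments as the original opt call, plus the two flags that disable
# the loop vectorizer and the SLP vectorizer.
set -- "$OPT" -O2 --inline-threshold=10 \
  --vectorize-loops=false --vectorize-slp=false \
  -S input_after.ll --disable-builtin=memset -o input_after.opt.ll

cmd_line="$*"
echo "$cmd_line"

# Only run when both the opt binary and the input file exist.
if [ -x "$OPT" ] && [ -f input_after.ll ]; then
  "$@"
fi
```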
The error observed:
Is this instruction multiplying two vectors, each containing two 32-bit signed integers?
Just checking, as my understanding is that this is not a supported operation on AIE.
Thanks
UPDATE: reproducer attached in comment below: https://github.com/Xilinx/llvm-aie/issues/102#issuecomment-2195908850