Open michaelselehov opened 5 months ago
@goldsteinn would you please take a look?
@llvm/issue-subscribers-backend-amdgpu
Author: None (michaelselehov)
ping @goldsteinn
Just a drive-by comment: 54ec8bcaf85e3a3341c97640331d58e24ac0d2cd seems pretty harmless to me and it would probably be more productive for an AMDGPU expert to look into why it exposed worse codegen.
Summary After commit 54ec8bcaf85e3a3341c97640331d58e24ac0d2cd, one of the AMDGPU benchmarks regressed because the generated assembly became longer.
Reduced Input IR See input.ll in the attached archive
Steps to Reproduce
Run opt with these parameters: opt -mtriple=amdgcn-amd-amdhsa -mcpu=gfx908 -O3 -vectorize-loops -vectorize-slp -amdgpu-early-inline-all=true -amdgpu-function-calls=false -S -o output.ll input.ll
Then run llc with these parameters: llc -mtriple=amdgcn-amd-amdhsa -mcpu=gfx908 -O3 -vectorize-loops -vectorize-slp -amdgpu-early-inline-all=true -amdgpu-function-calls=false -o output.s output.ll
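To put a number on "the assembly became longer", the output.s files produced by the steps above with a build before and a build after the commit can be compared by instruction count. The following is only a minimal sketch and not part of the original report: the function name `count_instructions` is hypothetical, and the rule for what counts as an instruction is an assumption (skip blank lines, `;` comments, directives starting with `.`, labels ending with `:`, and everything between `.amdgpu_metadata` and `.end_amdgpu_metadata`).

```python
def count_instructions(asm_text: str) -> int:
    """Roughly count instruction lines in AMDGPU assembly text.

    Heuristic only (an assumption, not an official tool): a line counts
    as an instruction unless it is blank, a ';' comment, a directive
    (starts with '.'), a label (ends with ':'), or metadata YAML.
    """
    count = 0
    in_metadata = False
    for raw in asm_text.splitlines():
        # Drop any trailing ';' comment, then surrounding whitespace.
        line = raw.split(";", 1)[0].strip()
        if in_metadata:
            # Skip the YAML block emitted between the metadata directives.
            if line.startswith(".end_amdgpu_metadata"):
                in_metadata = False
            continue
        if line.startswith(".amdgpu_metadata"):
            in_metadata = True
            continue
        if not line:
            continue  # blank line or comment-only line
        if line.startswith("."):
            continue  # assembler directive
        if line.endswith(":"):
            continue  # label
        count += 1
    return count


def count_instructions_in_file(path: str) -> int:
    """Read an assembly file and apply the same heuristic count."""
    with open(path, "r", encoding="utf-8") as f:
        return count_instructions(f.read())
```

Calling `count_instructions_in_file` on the output.s from each build and comparing the two numbers gives a rough size delta. It says nothing about which instructions were added, so a textual diff of the two files is still needed for the actual analysis.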
Additional Information The input IR includes a complicated set of binary operations. The commit tries to fold the instructions but instead produces extra code, leading to the regression.
Please find the attachments for detailed comparison and further analysis.
input and output.zip