AMDGPU Benchmark Regression: Increased Assembler Length After Commit 54ec8bcaf85e

michaelselehov commented 5 months ago

Summary

After commit 54ec8bcaf85e3a3341c97640331d58e24ac0d2cd, one of the AMDGPU benchmarks regressed because the generated assembly became longer.

Reduced Input IR

See input.ll in the attached archive.

Steps to Reproduce

* Run opt with these parameters: `opt -mtriple=amdgcn-amd-amdhsa -mcpu=gfx908 -O3 -vectorize-loops -vectorize-slp -amdgpu-early-inline-all=true -amdgpu-function-calls=false -S -o output.ll input.ll`
* Then run llc with these parameters: `llc -mtriple=amdgcn-amd-amdhsa -mcpu=gfx908 -O3 -vectorize-loops -vectorize-slp -amdgpu-early-inline-all=true -amdgpu-function-calls=false -o output.s output.ll`
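For anyone comparing the two builds, the steps above could be wrapped roughly as follows. This is only a sketch: `build_asm` and `count_insns` are hypothetical helper names, the build directories in the usage comment are placeholders, and "assembly length" is approximated here as the number of lines in the output that are not blank, comments, directives, or labels.

```shell
#!/bin/sh
# Hypothetical helper: run the opt and llc steps from this report.
#   $1 = directory containing the opt and llc binaries (placeholder)
#   $2 = prefix for the generated .ll and .s files
build_asm() {
    flags="-mtriple=amdgcn-amd-amdhsa -mcpu=gfx908 -O3 -vectorize-loops -vectorize-slp -amdgpu-early-inline-all=true -amdgpu-function-calls=false"
    # $flags is intentionally unquoted so it splits into separate arguments.
    "$1/opt" $flags -S -o "$2.ll" input.ll &&
    "$1/llc" $flags -o "$2.s" "$2.ll"
}

# Hypothetical helper: count instruction-like lines in an assembly file.
# Skips blank lines, comment lines (';'), directives and local labels
# (leading '.'), and plain labels (a single token ending in ':').
count_insns() {
    grep -Ev '^[[:space:]]*($|;|\.|[^[:space:]]+:[[:space:]]*(;.*)?$)' "$1" | wc -l | tr -d ' '
}

# Example usage (paths are placeholders for builds before/after the commit):
#   build_asm /path/to/before/bin before
#   build_asm /path/to/after/bin after
#   echo "before: $(count_insns before.s)  after: $(count_insns after.s)"
```

The line count is a rough proxy only; it says nothing about which instructions were added, so a diff of the two .s files is still needed to see where the extra code comes from.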

Additional Information

The input IR contains a complicated set of binary operations. The commit tries to fold these instructions but instead produces extra code, which leads to the regression.

See the attached archive for a detailed comparison and further analysis.

[input and output.zip](https://github.com/llvm/llvm-project/files/15388366/input.and.output.zip)

michaelselehov commented 5 months ago

@goldsteinn would you please take a look?

llvmbot commented 5 months ago

@llvm/issue-subscribers-backend-amdgpu

ronlieb commented 5 months ago

> @goldsteinn would you please take a look?

ping @goldsteinn

jayfoad commented 5 months ago

Just a drive-by comment: 54ec8bcaf85e3a3341c97640331d58e24ac0d2cd seems pretty harmless to me and it would probably be more productive for an AMDGPU expert to look into why it exposed worse codegen.

llvm/llvm-project issue #92891