llvm / llvm-project


[lld][RISC-V] `mv rd rd` used in global symbol address calculation could be eliminated. #100290

Open wan-yuqi opened 3 months ago

wan-yuqi commented 3 months ago

When compiling a program that uses a global symbol and targets riscv*, the compiler generates a sequence of instructions to calculate the symbol's address:

foo:
  lui  a0, %hi(Var)
  addi a0, a0, %lo(Var)

When the address of Var is aligned to 0x1000 (for example, 0x20000), %lo(Var) resolves to 0 and the linked instructions become:

foo:
  lui  a0, 0x20
  mv   a0, a0   # pseudo-instruction for `addi a0, a0, 0`
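
To make the arithmetic behind the two listings concrete, here is a minimal Python sketch of how an absolute address splits into the %hi and %lo parts consumed by lui and addi. The function names `split_hi_lo` and `rebuild` are hypothetical helpers for illustration and are not part of lld or any toolchain.

```python
def split_hi_lo(addr):
    """Split a 32-bit absolute address into (%hi, %lo) for a lui + addi pair."""
    # %lo is the low 12 bits, read as a signed value in [-2048, 2047],
    # because addi sign-extends its 12-bit immediate.
    lo = addr & 0xFFF
    if lo >= 0x800:
        lo -= 0x1000
    # %hi is whatever remains, so that (hi << 12) + lo == addr.
    hi = ((addr - lo) >> 12) & 0xFFFFF
    return hi, lo


def rebuild(hi, lo):
    """Recompute the address the way lui followed by addi would."""
    return ((hi << 12) + lo) & 0xFFFFFFFF


hi, lo = split_hi_lo(0x20000)
print(hex(hi), lo)  # 0x20 0
```

For any address that is a multiple of 0x1000 the second component is 0, which is the case where the addi does no work.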

I think `mv a0, a0` could be eliminated by linker relaxation (when the destination and source registers are the same and the instruction carrying R_RISCV_LO12_I is an ADDI) to remove the unnecessary instruction. But it's not mentioned in riscv-elf-psabi-doc. Would this have any negative impact?
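
The condition proposed above can be sketched as a small check on the 32-bit instruction word. This is only an illustration of the idea, not how lld implements relaxation; `is_redundant_addi` and its `resolved_lo12` parameter (the %lo value the linker has computed for the symbol) are hypothetical names. The field positions follow the standard RISC-V I-type encoding.

```python
OPCODE_OP_IMM = 0x13  # major opcode shared by ADDI, SLTI, XORI, ...
FUNCT3_ADDI = 0x0     # funct3 value that selects ADDI within OP-IMM


def is_redundant_addi(insn, resolved_lo12):
    """Return True if insn is `addi rd, rd, imm` and the resolved %lo is 0."""
    opcode = insn & 0x7F
    rd = (insn >> 7) & 0x1F
    funct3 = (insn >> 12) & 0x7
    rs1 = (insn >> 15) & 0x1F
    return (
        opcode == OPCODE_OP_IMM
        and funct3 == FUNCT3_ADDI
        and rd == rs1            # same source and destination register
        and resolved_lo12 == 0   # the addi would add nothing
    )


MV_A0_A0 = 0x00050513    # addi a0, a0, 0
ADDI_A1_A0 = 0x00050593  # addi a1, a0, 0 (a real move, must be kept)
LW_A0_A0 = 0x00052503    # lw a0, 0(a0) (R_RISCV_LO12_I can also sit on loads)

print(is_redundant_addi(MV_A0_A0, 0))    # True
print(is_redundant_addi(ADDI_A1_A0, 0))  # False
```

The register and opcode checks matter because R_RISCV_LO12_I is not limited to ADDI, and an ADDI with different registers still performs a move even when the immediate is 0.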

llvmbot commented 3 months ago

@llvm/issue-subscribers-backend-risc-v

Author: nomore (wan-yuqi)

tianboh commented 3 months ago

Not 100% sure, just some thoughts.

`addi a0, a0, 0` is also used as a nop for alignment. The fact that the variable Var is aligned doesn't mean that the lui instruction itself is properly aligned, so some nearby instructions may need the `addi a0, a0, 0` to stay aligned.

wan-yuqi commented 3 months ago

> Not 100% sure, just some thoughts.
>
> `addi a0, a0, 0` is also used as a nop for alignment. The fact that the variable Var is aligned doesn't mean that the lui instruction itself is properly aligned, so some nearby instructions may need the `addi a0, a0, 0` to stay aligned.

The nop instructions used for alignment carry a relocation of type R_RISCV_ALIGN, while the no-op here carries R_RISCV_LO12_I, so the linker can tell the two apart. So I think there should be no impact.
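
As a rough sketch of that distinction, the linker could classify an instruction by the relocation types attached to its offset. The numeric values are the ones assigned in the RISC-V psABI; the `classify` function and its labels are hypothetical, and requiring a paired R_RISCV_RELAX is an assumption carried over from how the existing psABI relaxations are gated.

```python
# Relocation type numbers from the RISC-V ELF psABI.
R_RISCV_LO12_I = 27
R_RISCV_ALIGN = 43
R_RISCV_RELAX = 51


def classify(reloc_types_at_offset):
    """Classify an instruction by the relocation types at its offset."""
    types = set(reloc_types_at_offset)
    if R_RISCV_ALIGN in types:
        # Padding emitted for alignment; handled by the alignment logic.
        return "alignment-padding"
    if R_RISCV_LO12_I in types and R_RISCV_RELAX in types:
        # Candidate for the proposed removal (assumes RELAX gating).
        return "relaxable-lo12"
    return "leave-alone"


print(classify([R_RISCV_ALIGN]))                  # alignment-padding
print(classify([R_RISCV_LO12_I, R_RISCV_RELAX]))  # relaxable-lo12
print(classify([R_RISCV_LO12_I]))                 # leave-alone
```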