llvm / llvm-project

The LLVM Project is a collection of modular and reusable compiler and toolchain technologies.
http://llvm.org

Unaligned read followed by bswap generates suboptimal code #48314

Open eduardosm opened 3 years ago

eduardosm commented 3 years ago
Bugzilla Link 48970
Version 11.0
OS Linux
CC @asb,@topperc,@eduardosm

Extended Description

On RISC-V, an unaligned read followed by a bswap produces suboptimal code.

Given the following IR:

    declare i16 @llvm.bswap.i16(i16)

    define i16 @read(i16* %p) {
    start:
      %v = load i16, i16* %p, align 1
      ret i16 %v
    }

    define i16 @read_swap(i16* %p) {
    start:
      %v = load i16, i16* %p, align 1
      %v2 = tail call i16 @llvm.bswap.i16(i16 %v)
      ret i16 %v2
    }

compiled with llc -mtriple=riscv64-unknown-linux-gnu -O3

it produces the following assembly:

    read:
      lb a1, 1(a0)
      lbu a0, 0(a0)
      slli a1, a1, 8
      or a0, a0, a1
      ret

    read_swap:
      lb a1, 1(a0)
      lbu a0, 0(a0)
      slli a1, a1, 8
      or a0, a0, a1
      slli a1, a0, 40
      addi a2, zero, 255
      slli a2, a2, 48
      and a1, a1, a2
      slli a0, a0, 56
      or a0, a0, a1
      srli a0, a0, 48
      ret

The code for read is generated as expected. However, the code for read_swap can be simplified to:

    read_swap:
      lb a1, 0(a0)
      lbu a0, 1(a0)
      slli a1, a1, 8
      or a0, a0, a1
      ret
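The simplification rests on a byte-level identity: byte-swapping a 16-bit value that was assembled in little-endian order gives the same low 16 bits as assembling the two bytes in big-endian order directly. A minimal Python sketch that checks this for every byte pair follows. The helper names are hypothetical and are not taken from LLVM; `b0` stands for the byte at `0(a0)` and `b1` for the byte at `1(a0)`.

```python
def load_le16(b0, b1):
    """Little-endian assembly of two bytes, as `read` does (low 16 bits only)."""
    return (b1 << 8) | b0


def bswap16(v):
    """Model of llvm.bswap.i16: exchange the two bytes of a 16-bit value."""
    return ((v & 0xFF) << 8) | ((v >> 8) & 0xFF)


def load_be16(b0, b1):
    """Big-endian assembly: the low 16 bits the proposed `read_swap` computes."""
    return (b0 << 8) | b1


def identity_holds():
    """Exhaustively compare both routes over all 65536 byte pairs."""
    return all(
        bswap16(load_le16(b0, b1)) == load_be16(b0, b1)
        for b0 in range(256)
        for b1 in range(256)
    )
```

Since the identity holds for all inputs, the swap can in principle be folded into the order in which the two byte loads are combined, which is what the shorter sequence does.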

topperc commented 3 years ago

Codegen should be improved after commit 70289ea6f591bd39c631f1eee3e6f2622fbc1d46, but it's still not perfect.

New codegen

    read_swap:
      lbu a1, 1(a0)
      lbu a0, 0(a0)
      slli a1, a1, 8
      or a0, a1, a0
      srli a1, a0, 8
      slli a0, a0, 8
      or a0, a0, a1
      ret
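To see why this sequence is correct yet still longer than needed, the instructions can be modelled on 64-bit registers. The sketch below is a hand-written model under that assumption, not output from any LLVM tool, and the function names are hypothetical. It suggests that the low 16 bits of the result already equal a direct big-endian assembly of the two bytes, so the final shift/shift/or group only undoes the order chosen by the first `or`.

```python
MASK64 = (1 << 64) - 1  # RV64 registers are 64 bits wide


def new_codegen(b0, b1):
    """Model the seven instructions before `ret`.

    b0 is the byte at 0(a0), b1 the byte at 1(a0).
    """
    a1 = b1                   # lbu  a1, 1(a0)
    a0 = b0                   # lbu  a0, 0(a0)
    a1 = (a1 << 8) & MASK64   # slli a1, a1, 8
    a0 = a1 | a0              # or   a0, a1, a0
    a1 = a0 >> 8              # srli a1, a0, 8
    a0 = (a0 << 8) & MASK64   # slli a0, a0, 8
    a0 = a0 | a1              # or   a0, a0, a1
    return a0


def direct_be(b0, b1):
    """Two loads, one shift, one or: byte 0 becomes the high byte."""
    return (b0 << 8) | b1


def low16_agree():
    """Compare the low 16 bits of both routes for every byte pair."""
    return all(
        new_codegen(b0, b1) & 0xFFFF == direct_be(b0, b1)
        for b0 in range(256)
        for b1 in range(256)
    )
```

In this model, bits 16 to 23 of the longer sequence's result hold a leftover copy of `b1`. That is harmless here because the function returns a plain `i16` with no `zeroext` or `signext` attribute, so only the low 16 bits are meaningful.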