llvm / llvm-project

The LLVM Project is a collection of modular and reusable compiler and toolchain technologies.
http://llvm.org

byte loads aren't fused into unaligned 16-bit load on ARM/AArch64 #31713

Open llvmbot opened 7 years ago

llvmbot commented 7 years ago
Bugzilla Link 32366
Version trunk
OS Linux
Reporter LLVM Bugzilla Contributor
CC @efriedma-quic

Extended Description

(also affects AArch64, but I can only tag one component)

clang --target=arm-linux-gnueabihf -march=armv7-a -O3 -S -xc -o- - << EOF
#include <stdint.h>

uint16_t ld16(uint8_t const* p) {
    uint16_t r;
    __builtin_memcpy(&r, p, sizeof(r));
    return r;
}

uint16_t ld16_bytes(uint8_t const* p) {
    uint16_t r = p[0] | (p[1] << 8);
    return r;
}
EOF

gives:

ld16:
    ldrh r0, [r0]
    bx lr

ld16_bytes:
    ldrb r1, [r0]
    ldrb r0, [r0, #1]
    orr r0, r1, r0, lsl #8
    bx lr

For little-endian targets I would expect both of these functions to compile to the same code (as they do on x86). 32-bit and 64-bit loads already seem to be fused; only the 16-bit case diverges.
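To make the expected equivalence concrete, here is a minimal sketch in portable C. It assumes nothing about the compiler: it only checks at run time that the two 16-bit forms return the same value on a little-endian host. The helpers host_is_little_endian and check_equivalence, and the 32-bit variant ld32_bytes, are hypothetical names added for illustration and are not part of the original report.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Load through memcpy: the result depends on the host byte order. */
static uint16_t ld16(const uint8_t *p) {
    uint16_t r;
    memcpy(&r, p, sizeof r);
    return r;
}

/* Assemble from individual bytes: always a little-endian interpretation. */
static uint16_t ld16_bytes(const uint8_t *p) {
    return (uint16_t)(p[0] | (p[1] << 8));
}

/* 32-bit analogue of ld16_bytes (hypothetical, for comparison only). */
static uint32_t ld32_bytes(const uint8_t *p) {
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Hypothetical helper: returns 1 when the lowest-addressed byte of a
   multi-byte integer holds its least significant bits. */
static int host_is_little_endian(void) {
    const uint16_t probe = 1;
    uint8_t first;
    memcpy(&first, &probe, 1);
    return first == 1;
}

/* Hypothetical helper: on a little-endian host both 16-bit loads must agree. */
static void check_equivalence(const uint8_t *p) {
    if (host_is_little_endian())
        assert(ld16(p) == ld16_bytes(p));
}
```

Because the two 16-bit functions are observably identical on a little-endian target, a compiler that permits unaligned halfword loads is free to emit a single ldrh for both.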

Note that -march=armv7-a (or similar) is necessary to make the unaligned load option available.

efriedma-quic commented 7 years ago

We have some limited support for this sort of thing (see https://reviews.llvm.org/D27861), but it doesn't catch this particular case because of the zero-extend. It could probably be extended, though.
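For readers wondering where a zero-extend comes from in ld16_bytes, the sketch below writes out the implicit conversions that C applies to the expression p[0] | (p[1] << 8). The comment above does not say which extension defeats the existing combine, so treat this as an illustration of the widening steps in the source, not as a description of the optimizer's internals. The names ld16_bytes_explicit and check_same are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Same computation as ld16_bytes, with the implicit conversions spelled out. */
static uint16_t ld16_bytes_explicit(const uint8_t *p) {
    int lo = (int)p[0];        /* integer promotion: 8-bit value widened to int */
    int hi = (int)p[1] << 8;   /* second byte widened, then shifted */
    int wide = lo | hi;        /* the OR happens at int width */
    return (uint16_t)wide;     /* narrowed back to 16 bits on return */
}

/* Hypothetical helper: the explicit form must match the compact expression. */
static void check_same(const uint8_t *p) {
    assert(ld16_bytes_explicit(p) == (uint16_t)(p[0] | (p[1] << 8)));
}
```

The 16-bit value is therefore computed at a wider width and narrowed afterwards, which is one plausible reason the pattern looks different to the optimizer than the 32-bit case.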