We have some limited support for this sort of thing (see https://reviews.llvm.org/D27861), but it doesn't catch this particular case because of the zero-extend. It could probably be extended, though.
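As a rough illustration of where that zero-extend comes from: after C integer promotion, the byte-wise 16-bit load from the report below is effectively computed in 32-bit int, so both byte loads are zero-extended and the final result truncated. This expansion (and the helper name) is my own sketch, not compiler output:

#include <stdint.h>

/* Sketch only: ld16_bytes after C integer promotion. Both bytes are
   zero-extended to 32-bit int before the shift/or, and the 32-bit
   result is truncated back to 16 bits on assignment. */
uint16_t ld16_bytes_promoted(uint8_t const* p) {
  int lo = (int)p[0];         /* zero-extending byte load */
  int hi = (int)p[1] << 8;    /* zero-extending byte load, shifted */
  return (uint16_t)(lo | hi); /* 32-bit or, truncated to 16 bits */
}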
Extended Description
(also affects AArch64, but I can only tag one component)
clang --target=arm-linux-gnueabihf -march=armv7-a -O3 -S -xc -o- - << EOF
#include <stdint.h>
uint16_t ld16(uint8_t const* p) { uint16_t r; __builtin_memcpy(&r, p, sizeof(r)); return r; }
uint16_t ld16_bytes(uint8_t const* p) { uint16_t r = p[0] | (p[1] << 8); return r; }
EOF
gives:
ld16:
        ldrh    r0, [r0]
        bx      lr

ld16_bytes:
        ldrb    r1, [r0]
        ldrb    r0, [r0, #1]
        orr     r0, r1, r0, lsl #8
        bx      lr
For little endian targets I would expect both these functions to turn out the same (like they do on x86). 32-bit and 64-bit loads seem to be rewritten already; it's just 16-bit where they diverge.
Note that -march=armv7-a (or similar) is necessary to make the unaligned load option available.
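For comparison, a 32-bit byte-wise load written the same way does seem to get combined into a single word load with the invocation above; this variant is my own addition for illustration and was not part of the original reproducer:

#include <stdint.h>

/* Illustration: the analogous 32-bit byte-wise load. With the same
   clang invocation this is expected to collapse to a single ldr,
   unlike the 16-bit case above. */
uint32_t ld32_bytes(uint8_t const* p) {
  uint32_t r = (uint32_t)p[0]
             | ((uint32_t)p[1] << 8)
             | ((uint32_t)p[2] << 16)
             | ((uint32_t)p[3] << 24);
  return r;
}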