Open llvmbot opened 3 years ago
Ah, I see. That makes sense. Thanks!
The default architecture for Arm is very old now, so you really have to specify something newer. This is the output for -march=armv7-a, which is in line with what I would expect: https://godbolt.org/z/qnWrzGKMs
read_unaligned_memcpy_bswap_32(unsigned char const*, int):
        ldr     r0, [r0, r1]
        rev     r0, r0
        bx      lr
read_unaligned_shift_add_32(unsigned char const*, int):
        ldr     r0, [r0, r1]
        rev     r0, r0
        bx      lr
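One plausible reason for the difference (an assumption on my part, not something stated in this thread) is that the default Arm target does not permit unaligned word loads, so the compiler has to fall back to byte loads. A minimal sketch for checking what the selected target reports, using the ACLE macro `__ARM_FEATURE_UNALIGNED`; the function name is hypothetical:

```c
/* Hypothetical helper: returns 1 if the compiler reports hardware support
   for unaligned accesses on the selected Arm target, 0 otherwise. On
   non-Arm targets the ACLE macro is not defined, so this returns 0. */
static int target_allows_unaligned_access(void)
{
#if defined(__ARM_FEATURE_UNALIGNED)
    return 1;
#else
    return 0;
#endif
}
```

Compiling this once with the default flags and once with -march=armv7-a should show whether the macro changes between the two configurations.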
Extended Description
Also applies to: armv7-a clang 11.0.1
Godbolt: https://godbolt.org/z/9f1GKf5qe
Consider this code:
#include <stdint.h>
uint32_t read_unaligned_memcpy_bswap_32(const uint8_t *buf, int offset)
{
    uint32_t val;
    __builtin_memcpy(&val, buf+offset, 4);
    return __builtin_bswap32(val);
}

uint32_t read_unaligned_shift_add_32(const uint8_t *buf, int offset)
{
    return (((uint32_t)buf[offset]) << 24)
         + (((uint32_t)buf[offset+1]) << 16)
         + (((uint32_t)buf[offset+2]) << 8)
         + (((uint32_t)buf[offset+3]) << 0);
}
On many architectures, e.g. ARMv8, these produce identical and efficient code. On ARMv7-A, the __builtin_bswap32 version produces what looks like worse code than the shift+add version (I admit I don't know the architecture well enough to be sure, but the result has 14 instructions as opposed to 8, not counting the final bx lr):
read_unaligned_memcpy_bswap_32(unsigned char const*, int):
        ldrb    r1, [r0, r1]!
        ldrb    r2, [r0, #1]
        ldrb    r3, [r0, #2]
        ldrb    r0, [r0, #3]
        orr     r1, r1, r2, lsl #8
        orr     r0, r3, r0, lsl #8
        mov     r2, #16711680
        orr     r0, r1, r0, lsl #16
        mov     r1, #65280
        and     r1, r1, r0, lsr #8
        and     r2, r2, r0, lsl #8
        orr     r1, r1, r0, lsr #24
        orr     r0, r2, r0, lsl #24
        orr     r0, r0, r1
        bx      lr
read_unaligned_shift_add_32(unsigned char const*, int):
        ldrb    r1, [r0, r1]!
        ldrb    r2, [r0, #1]
        ldrb    r3, [r0, #2]
        ldrb    r0, [r0, #3]
        lsl     r2, r2, #16
        orr     r1, r2, r1, lsl #24
        orr     r1, r1, r3, lsl #8
        orr     r0, r1, r0
        bx      lr
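For anyone wanting to confirm that the two readers really compute the same value, so that only the code generation differs, here is a minimal sketch. It assumes a little-endian host (on a big-endian host the memcpy + bswap version would return a little-endian read instead) and a GCC- or Clang-compatible compiler for the builtins. The helper `check_readers_agree_32` and its test buffer are hypothetical, not part of the original report.

```c
#include <assert.h>
#include <stdint.h>

/* Reader from the report: copy 4 bytes, then byte-swap. This is a
   big-endian read only when the host itself is little-endian. */
static uint32_t read_unaligned_memcpy_bswap_32(const uint8_t *buf, int offset)
{
    uint32_t val;
    __builtin_memcpy(&val, buf + offset, 4);
    return __builtin_bswap32(val);
}

/* Reader from the report: assemble the big-endian value byte by byte. */
static uint32_t read_unaligned_shift_add_32(const uint8_t *buf, int offset)
{
    return (((uint32_t)buf[offset]) << 24)
         + (((uint32_t)buf[offset + 1]) << 16)
         + (((uint32_t)buf[offset + 2]) << 8)
         + (((uint32_t)buf[offset + 3]) << 0);
}

/* Hypothetical helper: asserts that both readers agree at every valid
   offset of a small buffer and returns the number of offsets checked. */
static int check_readers_agree_32(void)
{
    static const uint8_t buf[8] = {0x01, 0x23, 0x45, 0x67,
                                   0x89, 0xab, 0xcd, 0xef};
    int checked = 0;
    for (int offset = 0; offset + 4 <= (int)sizeof buf; offset++) {
        assert(read_unaligned_memcpy_bswap_32(buf, offset) ==
               read_unaligned_shift_add_32(buf, offset));
        checked++;
    }
    return checked;
}
```

Odd offsets are included on purpose, since the point of both functions is to read from addresses that are not 4-byte aligned.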
The same applies to the 16-bit version (see the Godbolt link for the code), although the difference is much less dramatic. There, trunk also generates one more instruction than 11.0.1 for the 16-bit bswap version; I don't know how significant that is.
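The 16-bit code is only in the Godbolt link, so the following is a guess at what the analogous pair might look like rather than the exact code that was compiled. The function names mirror the 32-bit ones and are assumptions, as is the helper that compares them; a little-endian host and a GCC- or Clang-compatible compiler are assumed.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed 16-bit analogue of the memcpy + bswap reader. */
static uint16_t read_unaligned_memcpy_bswap_16(const uint8_t *buf, int offset)
{
    uint16_t val;
    __builtin_memcpy(&val, buf + offset, 2);
    return __builtin_bswap16(val);
}

/* Assumed 16-bit analogue of the shift + add reader. */
static uint16_t read_unaligned_shift_add_16(const uint8_t *buf, int offset)
{
    return (uint16_t)((((uint32_t)buf[offset]) << 8)
                    + (((uint32_t)buf[offset + 1]) << 0));
}

/* Hypothetical helper: asserts that both readers agree at every valid
   offset of a small buffer and returns the number of offsets checked. */
static int check_readers_agree_16(void)
{
    static const uint8_t buf[4] = {0x01, 0x23, 0x45, 0x67};
    int checked = 0;
    for (int offset = 0; offset + 2 <= (int)sizeof buf; offset++) {
        assert(read_unaligned_memcpy_bswap_16(buf, offset) ==
               read_unaligned_shift_add_16(buf, offset));
        checked++;
    }
    return checked;
}
```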