Open cpldcpu opened 8 years ago
very inefficient code with 8x single bit shift
That's a side-effect of some 16-bit fakery that happens.
It should be possible to gate multiplication; there's a supportsMultiplication
flag scattered about the code. Perhaps some variant of the -mattr=mul
flag would do it? Although I'd hope that specifying the Tiny would set all the right things.
How are you selecting which AVR variant you are using? Can you include the compiler command?
Sure. This is the compiler command I used:
TARGET=attiny85
DEVICE=__AVR_ATtiny85__
clang -O3 -c -S -I /usr/lib/avr/include -D $DEVICE --target=avr $1.c -o $1_llvm_t85.s -mmcu=$TARGET
That's a side-effect of some 16-bit fakery that happens.
The shifts are effectively used to move one byte from one register to the other. Seems somewhat odd.
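To illustrate the point about the shifts, here is a minimal sketch in plain C (not the code from this issue, and not compiler output). The function names are made up for illustration. It only shows that shifting a 16-bit value left by one bit eight times gives the same result as moving the low byte into the high byte and clearing the low byte, which is why a byte move between registers would be expected instead:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: eight single-bit left shifts of a 16-bit value. */
uint16_t shift_left_8_by_steps(uint16_t v) {
    for (uint8_t i = 0; i < 8; i++) {
        v = (uint16_t)(v << 1);   /* one single-bit shift per iteration */
    }
    return v;
}

/* Hypothetical helper: the same result expressed as a byte move.
   The low byte becomes the high byte and the low byte is cleared. */
uint16_t move_low_to_high(uint16_t v) {
    uint8_t low = (uint8_t)(v & 0xFF);
    return (uint16_t)((uint16_t)low << 8);
}

/* Checks that both forms agree for every 16-bit input. */
void check_shift_equivalence(void) {
    for (uint32_t v = 0; v <= 0xFFFF; v++) {
        assert(shift_left_8_by_steps((uint16_t)v) == move_low_to_high((uint16_t)v));
    }
}
```

Calling check_shift_equivalence() on a host machine should pass silently. Whether the AVR backend can be made to emit the byte-move form is a separate question that this sketch does not answer.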
Example code:
Interestingly, LLVM infers a multiplication from the triple addition. This may be beneficial on some architectures, but it certainly is not on AVR.
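For readers without the original example at hand, the pattern being described can be sketched roughly as follows. This is a hypothetical reconstruction, not the code from the report, and the function names are assumptions. The three functions compute the same value; the first is the kind of source that can be rewritten into the second, and the third is a shift-and-add form that needs no multiplier:

```c
#include <stdint.h>

/* Hypothetical source form: the same value added three times. */
uint8_t triple_add(uint8_t x) {
    return (uint8_t)(x + x + x);
}

/* What an optimizer may turn the above into: a multiply by 3. */
uint8_t triple_mul(uint8_t x) {
    return (uint8_t)(x * 3);
}

/* Multiplier-free alternative: one shift and one add. */
uint8_t triple_shift_add(uint8_t x) {
    return (uint8_t)((x << 1) + x);
}
```

All three agree modulo 256, so the rewrite is correct; the complaint is only about which form is cheap on a given AVR part.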
To make things worse, very inefficient code with 8x single-bit shifts and multiplications is generated even on the ATtiny, where hardware multiplication is not supported.
The same code is generated for ATmega, ATtiny, and the tiny core: