mul instruction is generated on ATtiny, even though it is not supported

cpldcpu commented 8 years ago

Example code:

#include <avr/io.h>
void test(uint8_t *ptr, uint16_t index) {
    ptr[index+index+index]=7;
}

Interestingly, LLVM infers a multiplication from the triple addition. This may be beneficial on some architectures, but it certainly is not on AVR.

To make things worse, very inefficient code with 8x single bit shift and multiplications is generated even on the ATtiny, where the mul instruction is not supported.

The same code is generated for ATmega, ATtiny, and the tiny core:

    .text
    .file   "test2.c"
    .globl  test
    .p2align    1
    .type   test,@function
test:                                   ; @test
; BB#0:                                 ; %entry
    ldi r18, 3
    muls    r23, r18
    mov r19, r0
    eor r1, r1
    mul r22, r18
    mov r20, r1
    eor r1, r1
    add r20, r19
    lsl r20
    rol r21
    lsl r20
    rol r21
    lsl r20
    rol r21
    lsl r20
    rol r21
    lsl r20
    rol r21
    lsl r20
    rol r21
    lsl r20
    rol r21
    lsl r20
    rol r21
    mov r26, r0
    eor r27, r27
    or  r26, r20
    or  r27, r21
    add r26, r24
    adc r27, r25
    ldi r24, 7
    st  X, r24
    ret
.Lfunc_end0:
    .size   test, .Lfunc_end0-test
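For reference, the index computation on its own, as a minimal sketch that runs on a host compiler. The three helper names are hypothetical and only illustrate that the triple addition, a multiply by 3, and a shift plus add all give the same 16-bit result. Whether LLVM would keep the shift-and-add form on AVR instead of folding it back into a multiply is an assumption I have not verified.

```c
#include <stdint.h>

/* The form written in the example: three additions. */
static uint16_t triple_by_addition(uint16_t index) {
    return (uint16_t)(index + index + index);
}

/* The form the optimizer turns it into: a multiply by 3. */
static uint16_t triple_by_multiply(uint16_t index) {
    return (uint16_t)(index * 3u);
}

/* A form that needs no multiplier: one shift and one add. */
static uint16_t triple_by_shift_add(uint16_t index) {
    return (uint16_t)((index << 1) + index);
}
```

All three wrap modulo 2^16, so they are interchangeable for indexing; only the cost on a core without a hardware multiplier differs.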

shepmaster commented 8 years ago

> very inefficient code with 8x single bit shift

That's a side-effect of some 16-bit fakery that happens.

It should be possible to gate multiplication; there's a supportsMultiplication flag scattered about. Perhaps some variant of the -mattr=mul flag would work? Although I'd hope that specifying the Tiny would set all the right things.

How are you selecting which AVR variant you are using? Can you include the compiler command?

cpldcpu commented 8 years ago

Sure. This is the compiler command I used:

TARGET=attiny85
DEVICE=__AVR_ATtiny85__

clang -O3 -c -S -I /usr/lib/avr/include -D $DEVICE --target=avr $1.c -o $1_llvm_t85.s -mmcu=$TARGET

cpldcpu commented 8 years ago

> That's a side-effect of some 16-bit fakery that happens.

The shifts are effectively used to move one byte from one register to the other. Seems somewhat odd.
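To illustrate my reading of the listing, here is a rough C model that runs on a host compiler. The function names are hypothetical, and the register comments reflect my interpretation of the assembly above, not anything taken from the backend source. It builds index * 3 from two 8x8 multiplies and uses eight single-bit shifts where a plain byte move would do.

```c
#include <stdint.h>

/* Eight single-bit left shifts of a 16-bit value, like the lsl/rol pairs.
   The net effect is moving the low byte into the high byte. */
static uint16_t shift_left_8_stepwise(uint16_t v) {
    for (int i = 0; i < 8; i++) {
        v = (uint16_t)(v << 1);
    }
    return v;
}

/* Hypothetical model of the listing: index * 3 from 8-bit partial products. */
static uint16_t times3_like_listing(uint16_t index) {
    uint8_t lo = (uint8_t)(index & 0xFF);                 /* r22 */
    uint8_t hi = (uint8_t)(index >> 8);                   /* r23 */
    uint16_t lo_prod = (uint16_t)(lo * 3u);               /* mul r22, r18 */
    uint8_t hi_prod_low = (uint8_t)(hi * 3u);             /* low byte of muls r23, r18 */
    uint8_t high_byte = (uint8_t)((lo_prod >> 8) + hi_prod_low); /* add r20, r19 */
    /* The 8 shifts place high_byte in the upper half; the or merges the low byte. */
    return (uint16_t)(shift_left_8_stepwise(high_byte) | (lo_prod & 0xFF));
}
```

In this model the shift loop could be replaced by a single register move of the high byte, which is what makes the generated sequence look odd.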

avr-llvm / llvm

mul instruction is generated on ATtiny, even though it is not supported #216