Loop Index Optimization to avoid cmp

llvm-mos / llvm-mos

Port of LLVM to the MOS 6502 and related processors

Other

406 stars 45 forks source link

A common 6502 loop optimization is to avoid the compare on the loop iterator when doing a loop that doesn't use the iterator for indirect addressing. Two ways that this optimization can be implemented is to either change the iterator to count down, or offset all of the absolute address accesses by the final loop counter.

Here's the standard copy loop as generated by llvm-mos today:

constexpr int kNumEntities = 20;
char screen_x;
extern char x_pos[kNumEntities];
void loop() {
    for (int i = 0; i < kNumEntities; ++i) {
        x_pos[i] += screen_x;
    }
}

loop():                               ; @loop()
        ldx     #0
        ldy     screen_x
.LBB0_1:                                ; =>This Inner Loop Header: Depth=1
        tya
        clc
        adc     x_pos,x
        sta     x_pos,x
        inx
        cpx     #20
        bne     .LBB0_1
        rts

But with this optimization, the loop could become the following. The accesses to the addresses end up being in the same order, which makes it a pretty safe optimization in general.

loop():                               ; @loop()
        ldx     #256 - 20
        ldy     screen_x
.LBB0_1:                                ; =>This Inner Loop Header: Depth=1
        tya
        clc
        adc     x_pos - (256 - 20),x
        sta     x_pos - (256 - 20),x
        inx
        bne     .LBB0_1
        rts

0000a01f <main>: a01f: a2 ec ldx #$ec a021: ac 00 02 ldy $200 ; 0x200 <screen_x> a024: 98 tya a025: 18 clc a026: 75 34 adc $34,x ; 0x34 <x_pos+0x14> a028: 95 34 sta $34,x ; 0x34 <x_pos+0x14> a02a: e8 inx a02b: d0 f7 bne $a024 <main+0x5> a02d: a2 00 ldx #$0 a02f: 8a txa a030: 60 rts

llvm-mos / llvm-mos

Loop Index Optimization to avoid cmp #343