llvm-mos / llvm-mos

Port of LLVM to the MOS 6502 and related processors
Other
406 stars 45 forks source link

Redundant copy and spilling #460

Open AGPX opened 2 months ago

AGPX commented 2 months ago

Hi, I'm playing with the following C++code:

class Actor
{
public:
    Actor() : sprIx(255) {}
    unsigned short x;
    unsigned char y, sprIx;
};

class Player : public Actor
{
public:
    Player() : Actor(), __state__(0) {}
    unsigned char __state__;
};

Compiled with -O3 and with the latest version of the compiler:

    .section    .text._ZN6PlayerC2Ev,"axG",@progbits,_ZN6PlayerC2Ev,comdat
    .p2align    1                               ; -- Begin function _ZN6PlayerC2Ev
    .type   _ZN6PlayerC2Ev,@function
_ZN6PlayerC2Ev:                         ; @_ZN6PlayerC2Ev
; %bb.0:
    ldx __rc2                        ; <---- Why copy rc2,rc3 in rc4,rc5? rc4 and rc5 are of no use!
    stx __rc4
    ldx __rc3
    stx __rc5
    ldx __rc4                                                         <---- Why use rc4 instead of rc2?
    stx mos8(.L_ZN6PlayerC2Ev_zp_stk)   ; 1-byte Folded Spill             <---- Why spill? rc4 is unchanged!
    ldx __rc5                                                         <---- Why use rc5 instead of rc3?
    stx mos8(.L_ZN6PlayerC2Ev_zp_stk+1) ; 1-byte Folded Spill             <---- Why spill? rc5 is unchanged!
    jsr _ZN5ActorC2Ev
    ldy #4
    lda #0
    ldx mos8(.L_ZN6PlayerC2Ev_zp_stk)   ; 1-byte Folded Reload         <---- Why reload? rc2 is unchanged!
    stx __rc2
    ldx mos8(.L_ZN6PlayerC2Ev_zp_stk+1) ; 1-byte Folded Reload         <---- Why reload? rc3 is unchanged!
    stx __rc3
    sta (__rc2),y
    rts
.Lfunc_end6:
    .size   _ZN6PlayerC2Ev, .Lfunc_end6-_ZN6PlayerC2Ev
                                        ; -- End function
    .section    .text._ZN5ActorC2Ev,"axG",@progbits,_ZN5ActorC2Ev,comdat
    .p2align    1                               ; -- Begin function _ZN5ActorC2Ev
    .type   _ZN5ActorC2Ev,@function
_ZN5ActorC2Ev:                          ; @_ZN5ActorC2Ev
; %bb.0:
    ldy #3
    lda #255
    sta (__rc2),y
    rts

It could be hugely reduced to:

.section    .text._ZN6PlayerC2Ev,"axG",@progbits,_ZN6PlayerC2Ev,comdat
    .p2align    1                               ; -- Begin function _ZN6PlayerC2Ev
    .type   _ZN6PlayerC2Ev,@function
_ZN6PlayerC2Ev:                         ; @_ZN6PlayerC2Ev
; %bb.0:
    jsr _ZN5ActorC2Ev
    ldy #4
    lda #0
    sta (__rc2),y
    rts
.Lfunc_end6:
    .size   _ZN6PlayerC2Ev, .Lfunc_end6-_ZN6PlayerC2Ev
                                        ; -- End function
    .section    .text._ZN5ActorC2Ev,"axG",@progbits,_ZN5ActorC2Ev,comdat
    .p2align    1                               ; -- Begin function _ZN5ActorC2Ev
    .type   _ZN5ActorC2Ev,@function
_ZN5ActorC2Ev:                          ; @_ZN5ActorC2Ev
; %bb.0:
    ldy #3
    lda #255
    sta (__rc2),y
    rts

Is there some way to avoid this?

mysterymath commented 2 months ago

It couldn't be reduced to that, at least not trivially. llvm-mos doesn't have an interprocedural register allocator; the set of registers used by Actor's constructor aren't available when Player's constructor is being compiled. To make that information available is actually pretty tricky if you think about it, since it would mean compiling functions in a very specific order. (There's a way to do this in LLVM, but it isn't very well maintained, and we haven't been able to turn it on yet.)

Accordingly, Player's constructor must assume Actor can clobber any caller-saved register, including rc2. It's thus not legal to refer to rc2 after the call. That's the reason for the spill.

As for why the spill does a redundant copy through rc4 and rc5, I'd suspect it's the usual "register allocation weirdness", but it's hard to tell. I'm admittedly not spending much time on most optimization issues these days; code gen quality isn't the most reported reason why llvm-mos is unsuitable for a given kind of project anymore. But eventually that will once again be the case, so I'll leave this open until I or someone else can get around to cleaning up all of these cases (as far as is possible).