semiversus / jeg

NES emulator with a focus on hardware abstraction, testability and performance
MIT License

It's better to move unchanged if-then clause out of a loop #8

Closed GorgonMeducer closed 6 years ago

GorgonMeducer commented 6 years ago

In the function fetch_sprite_pattern(ppu_t *ppu, int i, int row), there is this loop:

for (int j=0; j<8; j++) {
    int p1, p2;
    if ((attributes&0x40)==0x40) {
        p1=(low_tile_byte&0x01);
        p2=(high_tile_byte&0x01)<<1;
        low_tile_byte>>=1;
        high_tile_byte>>=1;
    } else {
        p1=(low_tile_byte&0x80)>>7;
        p2=(high_tile_byte&0x80)>>6;
        low_tile_byte<<=1;
        high_tile_byte<<=1;
    }
    data<<=4;
    data|=((attributes&3)<<2)|p1|p2;
}

The condition of the if-then clause does not change while the loop runs (attributes is never modified inside it), so evaluating it 8 times is a waste.

This is what I suggest:

int p1, p2;
if (attributes&0x40) {
    for (int j=0; j<8; j++) {
        p1=(low_tile_byte&0x01);
        p2=(high_tile_byte&0x01)<<1;
        low_tile_byte>>=1;
        high_tile_byte>>=1;

        data<<=4;
        data|=((attributes&3)<<2)|p1|p2;
    }
} else {
    for (int j=0; j<8; j++) {
        p1=(low_tile_byte&0x80)>>7;
        p2=(high_tile_byte&0x80)>>6;
        low_tile_byte<<=1;
        high_tile_byte<<=1;

        data<<=4;
        data|=((attributes&3)<<2)|p1|p2;
    }
}
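
To check that hoisting the test does not change the result, the two loop shapes can be compared exhaustively. The following is only a sketch under assumptions: the function names (decode_branch_inside, decode_branch_hoisted, count_mismatches) are hypothetical, and the types are guesses (uint8_t for the tile bytes and attributes, uint32_t for data), since the declarations in fetch_sprite_pattern are not shown in this thread.

```c
#include <stdint.h>

/* Original shape: the flip test (bit 0x40) is evaluated on every iteration.
 * Parameter and return types are assumptions, not taken from the jeg source. */
static uint32_t decode_branch_inside(uint8_t low, uint8_t high, uint8_t attributes)
{
    uint32_t data = 0;
    for (int j = 0; j < 8; j++) {
        int p1, p2;
        if ((attributes & 0x40) == 0x40) {
            p1 = low & 0x01;
            p2 = (high & 0x01) << 1;
            low >>= 1;
            high >>= 1;
        } else {
            p1 = (low & 0x80) >> 7;
            p2 = (high & 0x80) >> 6;
            low <<= 1;
            high <<= 1;
        }
        data <<= 4;
        data |= (uint32_t)(((attributes & 3) << 2) | p1 | p2);
    }
    return data;
}

/* Suggested shape: the flip test is evaluated once, outside the loops. */
static uint32_t decode_branch_hoisted(uint8_t low, uint8_t high, uint8_t attributes)
{
    uint32_t data = 0;
    int p1, p2;
    if (attributes & 0x40) {
        for (int j = 0; j < 8; j++) {
            p1 = low & 0x01;
            p2 = (high & 0x01) << 1;
            low >>= 1;
            high >>= 1;
            data <<= 4;
            data |= (uint32_t)(((attributes & 3) << 2) | p1 | p2);
        }
    } else {
        for (int j = 0; j < 8; j++) {
            p1 = (low & 0x80) >> 7;
            p2 = (high & 0x80) >> 6;
            low <<= 1;
            high <<= 1;
            data <<= 4;
            data |= (uint32_t)(((attributes & 3) << 2) | p1 | p2);
        }
    }
    return data;
}

/* Compares both versions for every tile byte pair, with and without the
 * flip bit, and for every palette value in the low two attribute bits. */
long count_mismatches(void)
{
    static const uint8_t attrs[] = {0x00, 0x01, 0x02, 0x03, 0x40, 0x41, 0x42, 0x43};
    long mismatches = 0;
    for (int low = 0; low < 256; low++)
        for (int high = 0; high < 256; high++)
            for (unsigned a = 0; a < sizeof attrs / sizeof attrs[0]; a++)
                if (decode_branch_inside((uint8_t)low, (uint8_t)high, attrs[a]) !=
                    decode_branch_hoisted((uint8_t)low, (uint8_t)high, attrs[a]))
                    mismatches++;
    return mismatches;
}
```

Calling count_mismatches() from a small test program should return 0 if the two shapes agree. Many compilers can perform this transformation themselves (loop unswitching) at higher optimisation levels, so the measurable gain depends on the build settings.
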
semiversus commented 6 years ago

Yes, you're right. These are the kind of optimisations I'm thinking about. Currently, "everything" gets evaluated for each pixel. I want to reverse the logic and do only the things needed in the current "state" (e.g. when the PPU is not working in the visible area, a lot of evaluation can be skipped).
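
A rough illustration of the "only do what the current state needs" idea, not the jeg implementation: decide once per scanline whether any pixel work is needed, instead of testing on every dot. The sketch assumes NTSC timing (262 scanlines of 341 dots, with pixels produced on scanlines 0 to 239 at dots 1 to 256). The names frame_stats_t, render_pixel and run_frame are hypothetical, and the sketch deliberately ignores work a real PPU still does outside the pixel dots, such as sprite fetches, the pre-render line and vblank flag handling.

```c
/* NTSC PPU frame geometry. */
enum {
    DOTS_PER_SCANLINE   = 341,
    SCANLINES_PER_FRAME = 262,
    VISIBLE_SCANLINES   = 240,  /* scanlines 0..239 produce pixels */
    FIRST_PIXEL_DOT     = 1,
    LAST_PIXEL_DOT      = 256   /* dots 1..256 produce pixels */
};

/* Hypothetical counters standing in for real rendering work. */
typedef struct {
    long pixels_rendered;
    long dots_skipped;
} frame_stats_t;

/* Hypothetical per-pixel routine; a real one would mix background and sprites. */
static void render_pixel(frame_stats_t *stats)
{
    stats->pixels_rendered++;
}

/* Walks one frame and counts how many dots need pixel work and how many do not. */
frame_stats_t run_frame(void)
{
    frame_stats_t stats = {0, 0};
    for (int scanline = 0; scanline < SCANLINES_PER_FRAME; scanline++) {
        if (scanline >= VISIBLE_SCANLINES) {
            /* Decided once per scanline: no pixel is produced on these lines. */
            stats.dots_skipped += DOTS_PER_SCANLINE;
            continue;
        }
        /* Visible scanline: only the pixel dots call the per-pixel routine. */
        for (int dot = FIRST_PIXEL_DOT; dot <= LAST_PIXEL_DOT; dot++)
            render_pixel(&stats);
        stats.dots_skipped += DOTS_PER_SCANLINE - (LAST_PIXEL_DOT - FIRST_PIXEL_DOT + 1);
    }
    return stats;
}
```

Under these assumptions, 61440 of the 89342 dots in a frame call the per-pixel routine, and the remaining 27902 are skipped without any per-dot test.
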

GorgonMeducer commented 6 years ago

Could I keep this change in my next pull request? Looking only at the logic local to this function, I believe it's safe to use the optimised version.

semiversus commented 6 years ago

Yes, we'll keep it

GorgonMeducer commented 6 years ago

Thanks, you should be able to see it in my second pull request