ITotalJustice / notorious_beeg

gba emulator written in c++23
https://notorious-beeg.netlify.app/
GNU General Public License v3.0

optimise ppu rendering #46

Open ITotalJustice opened 2 years ago

ITotalJustice commented 2 years ago

in emerald: without rendering, 1k fps; with rendering, 450-460 fps

that's just over half of my fps gone, just from rendering 160 times a frame (once per visible scanline).

  1. only calculate window tables on window values changing
  2. template rendering to skip blending / windowing checks if disabled
  3. cache decoded screen entries (likely not worth it)
  4. cache decoded oam entries

1: this should give some speed up, but not by much (see the sketch after this list).

2: while this will speed up scenes that don't use windowing and blending, it still isn't ideal, because many scenes do use both, and those scenes will still be super slow.

3: decoding is very fast already.

4: this will be a decent speed up.
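
to make 1 concrete, here's a minimal sketch of the kind of dirty-flag approach this could use; none of these names (Ppu, win_table, write_window_reg, update_win_table) are from the actual emulator, it's just an illustration:

```cpp
#include <array>
#include <cstdint>

// hypothetical sketch: rebuild the per-pixel window table only when a window
// register write actually changes a value, instead of recalculating it on
// every scanline.
struct Ppu
{
    std::array<std::uint8_t, 240> win_table{}; // which window region each x falls in
    bool win_dirty{true};

    // called from the io write handler for WIN0H / WIN1H / WININ / WINOUT
    void write_window_reg(std::uint16_t& reg, std::uint16_t value)
    {
        if (reg != value)
        {
            reg = value;
            win_dirty = true; // only mark dirty on a real change
        }
    }

    void render_scanline()
    {
        if (win_dirty)
        {
            update_win_table(); // rebuilt once, instead of 160 times per frame
            win_dirty = false;
        }
        // note: the vertical bounds (WIN0V / WIN1V) still need a cheap per-line
        // check to see whether each window is active on this scanline.
        // ... render bg / obj for this line using win_table
    }

    void update_win_table()
    {
        // fill win_table from the horizontal window registers (omitted)
    }
};
```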

ITotalJustice commented 2 years ago

another big speedup can come from rendering the bgs from highest priority to lowest.

[image]

as seen here, all 4 bgs are rendered in their entirety. some pixels in bg2 are transparent so that bg3 shows through them, yet every single tile of bg3 still has to be fetched and then checked for transparency.

i could instead render bg2 first, then when rendering bg3, check at the top of the loop whether pixel[x] != transparent, and if so, continue.

another example ([image]): about half of the bg0 and bg1 rendering can be skipped.
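
a minimal sketch of that priority-first loop, assuming a hypothetical per-scanline Pixel buffer (every name here is invented for illustration, not taken from the emulator):

```cpp
#include <array>
#include <cstdint>

// render the enabled bgs from highest priority to lowest, and skip any pixel
// that a higher-priority bg has already filled with an opaque colour.
struct Pixel
{
    std::uint16_t colour{}; // bgr555
    bool opaque{};
};

void render_bg(std::array<Pixel, 240>& line, [[maybe_unused]] int bg)
{
    for (int x = 0; x < 240; x++)
    {
        if (line[x].opaque)
        {
            continue; // already covered, so the tile fetch + decode is skipped entirely
        }
        // ... fetch and decode the tile for (bg, x), write line[x] if not transparent
    }
}

void render_scanline(std::array<Pixel, 240>& line, const std::array<int, 4>& bg_order)
{
    // bg_order holds the enabled bg indices sorted by priority, highest first
    for (const auto bg : bg_order)
    {
        render_bg(line, bg);
    }
}
```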

this introduces a problem for blending, however: what happens if a pixel with a higher priority wants to blend with the pixel below it? well, that's simple:

if (pixel[x].opaque()) {
  // an opaque higher-priority pixel normally means this layer can be skipped,
  // unless that pixel is set to blend with this layer, in which case this
  // layer's pixel is still needed as the blend target.
  if (!pixel[x].can_blend_with_layer(layer_num)) {
    continue;
  }
}

of course, a layer can be enabled to blend with multiple layers, although it can only blend with 1 at a time, so an extra check is needed to see if that pixel has already been blended.
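
a minimal sketch of what that extra check could look like, assuming the written pixel carries some blend state (again, every name here is invented for illustration):

```cpp
#include <cstdint>

// each written pixel remembers which lower layers it may blend with and
// whether it has already been blended, so only the first eligible lower
// layer gets to blend with it.
struct Pixel
{
    std::uint16_t colour{};       // bgr555
    bool opaque{};
    std::uint8_t blend_targets{}; // bitmask of lower layers this pixel is allowed to blend with
    bool already_blended{};       // a pixel can only be blended once
};

// decide, at the top of the lower-priority layer's loop, whether this pixel
// can be skipped outright.
constexpr bool skip_pixel(const Pixel& p, int layer_num)
{
    if (!p.opaque)
    {
        return false; // nothing written yet, this layer has to render the pixel
    }
    if ((p.blend_targets >> layer_num) & 1)
    {
        return p.already_blended; // only the first matching lower layer blends in
    }
    return true; // fully covered by an opaque pixel that doesn't blend with this layer
}
```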


by doing all of this, i get rid of that merge() function i have (which is very slow), along with lots of needless tile fetching and decoding. it also means i can work on 1 pixel buffer, rather than 5 (1 obj, 4 bg) that then have to be merged. and i can do the blending within the render function itself, as one of:

  1. the layer doesn't blend
  2. the layer blends to white / black (self contained)
  3. the layer blends with the pixel above it

the good thing is all of these can be templated like so

enum class Blend
{
    None, // no blending
    Alpha, // blend 2 layers
    White, // fade to white
    Black, // fade to black
};

i would need to test if templating is worth it for this, but i predict that it would be.
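
a rough sketch of how the templated blend could look with if constexpr, repeating the Blend enum above so the snippet stands alone; the exact coefficient handling and rounding here are not verified against hardware:

```cpp
#include <algorithm>
#include <cstdint>

enum class Blend
{
    None,  // no blending
    Alpha, // blend 2 layers
    White, // fade to white
    Black, // fade to black
};

// blend one 5-bit colour channel. eva/evb/evy are the usual 0..16 coefficients
// from BLDALPHA / BLDY. each Blend value gets its own instantiation, so the
// None path compiles down to a plain copy with no runtime branch.
template<Blend blend>
constexpr std::uint8_t blend_channel(std::uint8_t top, std::uint8_t bottom,
                                     int eva, int evb, int evy)
{
    if constexpr (blend == Blend::None)
    {
        return top;
    }
    else if constexpr (blend == Blend::Alpha)
    {
        return static_cast<std::uint8_t>(std::min(31, (top * eva + bottom * evb) / 16));
    }
    else if constexpr (blend == Blend::White)
    {
        return static_cast<std::uint8_t>(top + ((31 - top) * evy) / 16);
    }
    else // Blend::Black
    {
        return static_cast<std::uint8_t>(top - (top * evy) / 16);
    }
}
```

the render loop would then be instantiated once per Blend value, with the mode picked once outside the per-pixel loop.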


i would like to optimise for very common cases like these examples: [image] [image]

where every bg is enabled, but bgX (bg0 in example 1, bg1 in example 2) is entirely empty, yet i still have to fetch and decode 240 tiles! i think the only way to solve this is tile caching.
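
a very rough sketch of what a decoded-tile cache could look like (4bpp only, all names hypothetical), invalidated from the vram write handler:

```cpp
#include <array>
#include <bitset>
#include <cstdint>

// cache of 8x8 tiles expanded to one palette index per pixel, keyed by the
// tile's offset into bg char vram. an entry is invalidated whenever that part
// of vram is written, and re-decoded lazily on the next fetch.
struct TileCache
{
    static constexpr std::size_t MAX_TILES = 64 * 1024 / 32; // 64KiB of char data, 32 bytes per 4bpp tile

    std::array<std::array<std::uint8_t, 64>, MAX_TILES> decoded{};
    std::bitset<MAX_TILES> valid{};

    // call this from the vram write handler
    void invalidate(std::uint32_t vram_offset)
    {
        if (const auto idx = vram_offset / 32; idx < MAX_TILES)
        {
            valid.reset(idx);
        }
    }

    const std::array<std::uint8_t, 64>& get(std::size_t tile_index, const std::uint8_t* vram)
    {
        if (!valid.test(tile_index))
        {
            decode_4bpp(vram + tile_index * 32, decoded[tile_index].data());
            valid.set(tile_index);
        }
        return decoded[tile_index];
    }

    static void decode_4bpp(const std::uint8_t* src, std::uint8_t* dst)
    {
        for (int i = 0; i < 32; i++)
        {
            dst[i * 2 + 0] = src[i] & 0xF; // low nibble = left pixel
            dst[i * 2 + 1] = src[i] >> 4;  // high nibble = right pixel
        }
    }
};
```

this wouldn't remove the screen entry fetches for an empty bg, but it would avoid re-decoding the same tiles line after line.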