indigodarkwolf / box16

A fork of the official X16 emulator, converted to C++20 and with a bunch of features tweaked and added.
MIT License
41 stars, 18 forks

vera rendering: add some low hanging performance fruits #96

Closed pontaoski closed 1 year ago

pontaoski commented 1 year ago

Since there are only 2 layers and 4 colour depths, generating 8 functions (one per layer and colour depth combination) is reasonable in terms of code size and allows some minor performance improvements to the VERA line drawing code.

Most of the performance improvement comes from templating the layer selection, which removes the indirection regarding layers and turns complex address shenanigans into simple offsets.
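To illustrate the layer part, here is a minimal sketch, not the actual box16 code. All names in it (`LayerProps`, `g_layers`, `g_line_buffers`, `render_line`, `render_line_runtime`) are hypothetical, and the per-pixel work is a stand-in for real tile fetching. The point is only that, with the layer as a template parameter, the indices into the per-layer arrays are compile-time constants, so the compiler can typically resolve them to fixed addresses.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical, simplified per-layer state; field names are illustrative only.
struct LayerProps {
    std::uint32_t map_base;
    std::uint32_t tile_base;
};

constexpr std::size_t kNumLayers = 2;
constexpr std::size_t kLineWidth = 8;

static std::array<LayerProps, kNumLayers> g_layers = {{{0x1000, 0x1800}, {0x2000, 0x2800}}};
static std::array<std::array<std::uint8_t, kLineWidth>, kNumLayers> g_line_buffers{};

// Runtime layer selection: every access to the per-layer state goes through
// an index that is only known when the function runs.
void render_line_runtime(std::size_t layer)
{
    assert(layer < kNumLayers);
    const LayerProps &props = g_layers[layer];
    auto &line = g_line_buffers[layer];
    for (std::size_t x = 0; x < kLineWidth; ++x) {
        line[x] = static_cast<std::uint8_t>((props.map_base >> 8) + x);
    }
}

// Compile-time layer selection: Layer is a constant, so g_layers[Layer] and
// g_line_buffers[Layer] refer to fixed locations in each instantiation.
template <std::size_t Layer>
void render_line()
{
    static_assert(Layer < kNumLayers, "VERA has only two layers");
    const LayerProps &props = g_layers[Layer];
    auto &line = g_line_buffers[Layer];
    for (std::size_t x = 0; x < kLineWidth; ++x) {
        line[x] = static_cast<std::uint8_t>((props.map_base >> 8) + x);
    }
}
```

Calling `render_line<0>()` and `render_line<1>()` fills the same buffers as `render_line_runtime(0)` and `render_line_runtime(1)`; only the way the layer state is addressed differs.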

While not as pronounced, templating the bpp code removes some loads in hot loops and is easy enough to do.
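As a rough sketch of the bpp side (again with hypothetical names, not the box16 implementation), the colour depth can be a template parameter so that the mask and pixels-per-byte values become constants instead of values loaded inside the loop. A small table then selects one of the 2 x 4 = 8 instantiations once per line. The sketch assumes the leftmost pixel sits in the most significant bits of each byte and that depth codes 0 to 3 map to 1, 2, 4, and 8 bpp.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical helper: read the colour index of pixel x from packed row data.
// Assumes the leftmost pixel is stored in the most significant bits.
template <unsigned Bpp>
std::uint8_t pixel_at(const std::uint8_t *row, std::size_t x)
{
    static_assert(Bpp == 1 || Bpp == 2 || Bpp == 4 || Bpp == 8, "unsupported depth");
    constexpr unsigned pixels_per_byte = 8 / Bpp;
    constexpr unsigned mask = (1u << Bpp) - 1;
    const unsigned shift =
        static_cast<unsigned>(pixels_per_byte - 1 - (x % pixels_per_byte)) * Bpp;
    return static_cast<std::uint8_t>((row[x / pixels_per_byte] >> shift) & mask);
}

// One instantiation per (layer, depth) pair. Layer is unused by this toy body
// beyond the check; it is here to show the shape of the 8 functions.
template <std::size_t Layer, unsigned Bpp>
void draw_layer_line(const std::uint8_t *row, std::uint8_t *out, std::size_t width)
{
    static_assert(Layer < 2, "VERA has only two layers");
    for (std::size_t x = 0; x < width; ++x) {
        out[x] = pixel_at<Bpp>(row, x);
    }
}

using LineRenderer = void (*)(const std::uint8_t *, std::uint8_t *, std::size_t);

// 2 layers x 4 colour depths = 8 functions, chosen once per line.
constexpr LineRenderer kRenderers[2][4] = {
    {draw_layer_line<0, 1>, draw_layer_line<0, 2>, draw_layer_line<0, 4>, draw_layer_line<0, 8>},
    {draw_layer_line<1, 1>, draw_layer_line<1, 2>, draw_layer_line<1, 4>, draw_layer_line<1, 8>},
};

// depth_code 0..3 selects 1, 2, 4, or 8 bpp (an assumption of this sketch).
void draw_line(std::size_t layer, unsigned depth_code,
               const std::uint8_t *row, std::uint8_t *out, std::size_t width)
{
    assert(layer < 2 && depth_code < 4);
    kRenderers[layer][depth_code](row, out, width);
}
```

The runtime branch on layer and depth happens once in `draw_line`, outside the per-pixel loop, which is where the saved loads would come from.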

Overall, low effort, low but still respectable gains.

indigodarkwolf commented 1 year ago

Yup, I'll take it. Thanks!