ARM-software / Arm-2D

2D Graphic Library optimized for Cortex-M processors
Apache License 2.0

Improve Transform performance #54

Closed GorgonMeducer closed 1 month ago

GorgonMeducer commented 1 month ago

The current transform anti-aliasing algorithm samples the four surrounding points in the source image for each target point. This gives the best result for scaling factors from 0.5 to 2.0. Reducing the number of sampling points from 4 to 3 can improve performance by 50%, and the resulting colour error (<25% in the worst case) is not noticeable to the human eye. The new algorithm is called triangle sampling:

image

NOTE: The source tile's coordinates (x and y) are represented in q15.8 format (stored in a 32-bit integer). The 8-bit fractional part is used as the alpha.
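
For readers without the figure, here is a minimal sketch of how the 8-bit fraction can serve as the alpha, comparing four-point sampling with a three-point variant. It is not the Arm-2D implementation: `gray_tile_t`, `tile_at`, `sample_bilinear4` and `sample_triangle3` are hypothetical names, the tile is assumed to be 8-bit grayscale, and the cell is assumed to be split along the top-left to bottom-right diagonal. The caller is assumed to pass non-negative coordinates with `x + 1 < width` and `y + 1 < height`.

```c
#include <stdint.h>

/* Assumption: 8 fractional bits, as described in the NOTE above. */
#define FRAC_BITS 8
#define FRAC_ONE  (1 << FRAC_BITS)   /* 256 represents a weight of 1.0 */
#define FRAC_MASK (FRAC_ONE - 1)

/* Hypothetical 8-bit grayscale source tile; not an Arm-2D type. */
typedef struct {
    const uint8_t *pixels;
    int32_t width;
    int32_t height;
} gray_tile_t;

/* Fetch one pixel; the caller guarantees the coordinate is in range. */
static inline int32_t tile_at(const gray_tile_t *t, int32_t x, int32_t y)
{
    return t->pixels[y * t->width + x];
}

/* Reference: four-point (bilinear) sampling, fractions used as alphas. */
uint8_t sample_bilinear4(const gray_tile_t *t, int32_t fx_x, int32_t fx_y)
{
    int32_t x  = fx_x >> FRAC_BITS,  y  = fx_y >> FRAC_BITS;  /* integer part */
    int32_t ax = fx_x & FRAC_MASK,   ay = fx_y & FRAC_MASK;   /* alpha 0..255 */

    int32_t tl = tile_at(t, x, y),     tr = tile_at(t, x + 1, y);
    int32_t bl = tile_at(t, x, y + 1), br = tile_at(t, x + 1, y + 1);

    int32_t top = tl * (FRAC_ONE - ax) + tr * ax;   /* scaled by 256 */
    int32_t bot = bl * (FRAC_ONE - ax) + br * ax;   /* scaled by 256 */
    return (uint8_t)((top * (FRAC_ONE - ay) + bot * ay) >> (2 * FRAC_BITS));
}

/* Three-point sampling: pick the triangle that contains the point and
 * interpolate with its barycentric weights (they always sum to 256). */
uint8_t sample_triangle3(const gray_tile_t *t, int32_t fx_x, int32_t fx_y)
{
    int32_t x  = fx_x >> FRAC_BITS,  y  = fx_y >> FRAC_BITS;
    int32_t ax = fx_x & FRAC_MASK,   ay = fx_y & FRAC_MASK;

    int32_t tl = tile_at(t, x, y);
    int32_t br = tile_at(t, x + 1, y + 1);
    int32_t acc;

    if (ax >= ay) {
        /* top-right triangle: TL, TR, BR */
        int32_t tr = tile_at(t, x + 1, y);
        acc = tl * (FRAC_ONE - ax) + tr * (ax - ay) + br * ay;
    } else {
        /* bottom-left triangle: TL, BL, BR */
        int32_t bl = tile_at(t, x, y + 1);
        acc = tl * (FRAC_ONE - ay) + bl * (ay - ax) + br * ax;
    }
    return (uint8_t)(acc >> FRAC_BITS);
}
```

Both functions agree along the cell edges and differ most near the centre of a cell whose corners vary strongly, because the fourth corner is ignored. The three-point version needs three pixel loads and one level of multiplication instead of four loads and two levels.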

If we only sample the top-right corner, we can remove the if-else conditional branch from the hot loop, which improves performance further. The open question is whether the human eye can spot the difference:

image
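
One possible reading of the branch-free variant, sketched below, is to always apply the weights of the top-right triangle (top-left, top-right, bottom-right), even when the point lies in the other half of the cell. This is a guess, since the figure is not reproduced here, and it is not taken from the Arm-2D source. `sample_top_right_only` and `clamp_acc` are hypothetical names, the image is assumed to be 8-bit grayscale with a row stride in pixels, and coordinates are assumed non-negative and in range.

```c
#include <stdint.h>

/* Assumption: 8 fractional bits in each coordinate. */
#define FRAC_BITS 8
#define FRAC_ONE  (1 << FRAC_BITS)   /* 256 represents a weight of 1.0 */
#define FRAC_MASK (FRAC_ONE - 1)

/* Clamp the scaled accumulator to the range of a valid pixel (0..255 * 256).
 * Compilers can typically lower this to conditional-select or saturating
 * instructions instead of branches, but that is not guaranteed. */
static inline int32_t clamp_acc(int32_t acc)
{
    const int32_t max = 255 * FRAC_ONE;
    acc = (acc < 0) ? 0 : acc;
    return (acc > max) ? max : acc;
}

/* Hypothetical single-triangle sampler: no if-else on the fractions. */
uint8_t sample_top_right_only(const uint8_t *pixels, int32_t stride,
                              int32_t fx_x, int32_t fx_y)
{
    int32_t x  = fx_x >> FRAC_BITS,  y  = fx_y >> FRAC_BITS;  /* integer part */
    int32_t ax = fx_x & FRAC_MASK,   ay = fx_y & FRAC_MASK;   /* alpha 0..255 */

    const uint8_t *cell = pixels + y * stride + x;
    int32_t tl = cell[0];
    int32_t tr = cell[1];
    int32_t br = cell[stride + 1];

    /* When ay > ax the weight of TR becomes negative, so the result is an
     * extrapolation rather than an interpolation; hence the clamp. */
    int32_t acc = tl * (FRAC_ONE - ax) + tr * (ax - ay) + br * ay;
    return (uint8_t)(clamp_acc(acc) >> FRAC_BITS);
}
```

Under this reading the result matches triangle sampling whenever the x fraction is at least the y fraction. In the other half of the cell the error depends on how much the bottom-left pixel differs from its neighbours, which is where visible differences would be expected.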

GorgonMeducer commented 1 month ago

Improved the original 4xSSAA algorithm (increased performance by 80%) and introduced 2xSSAA.