Hydr8gon / sm64

A port of Super Mario 64 for the DSi
Creative Commons Zero v1.0 Universal

Performance Optimization: Using Thumb Mode #34

Open AngelTomkins opened 4 months ago

AngelTomkins commented 4 months ago

Currently all the code is stored in main RAM, and the ARM9 core stalls whenever the ARM7 core is reading from main RAM. If the code were compiled in Thumb mode (and possibly with size optimizations, -Os/-Oz), it could need fewer RAM accesses to execute the same code, which would reduce the performance hit from the slow main RAM and from the audio. In my testing, the code size shrinks to about 60% of the original when using Thumb and -Os. I did not notice a significant change in performance, though my tests were not very detailed. Importantly, this did not slow the program down at all. This would be a simple change to the Makefile.
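As a rough illustration of the trade-off, the 60% size figure can be turned into an instruction-count estimate. This is only a back-of-the-envelope sketch: it assumes every Thumb instruction is 16 bits wide (true for the Thumb set on the ARM9, where `BL` is a pair of 16-bit halves), and it assumes the static code size ratio also reflects what is actually executed, which it may not. The function name is made up for the example.

```c
/* Bytes per instruction: ARM is always 4, Thumb (ARMv5TE) is 2. */
enum { ARM_INSN_BYTES = 4, THUMB_INSN_BYTES = 2 };

/* Estimate how many Thumb instructions replace `arm_insns` ARM
 * instructions, given the Thumb/ARM code size ratio in percent.
 * Hypothetical helper, not part of the port. */
static unsigned long thumb_insn_estimate(unsigned long arm_insns,
                                         unsigned size_ratio_pct)
{
    unsigned long arm_bytes   = arm_insns * ARM_INSN_BYTES;
    unsigned long thumb_bytes = arm_bytes * size_ratio_pct / 100;
    return thumb_bytes / THUMB_INSN_BYTES;
}
```

With a 60% ratio, `thumb_insn_estimate(1000, 60)` gives 1200: about 40% fewer bytes fetched from main RAM, but about 20% more instructions to execute. Whether that is a net win depends on how often the core is actually waiting on instruction fetches.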

AngelTomkins commented 4 months ago

An update on Thumb mode: from more thorough testing, compiling in ARM mode is faster by about 1-2 ms per frame. I think Thumb mode might be more worthwhile for some functions than for others, but without a systematic way to test each function, ARM mode is the better choice since it is faster overall. It would be nice to have a tool that compiled all the code and compared, per function, whether ARM or Thumb is faster, but I don't believe this is possible due to inlining and LTO.
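For reference, GCC does allow choosing the instruction set per function through the `target("arm")` and `target("thumb")` function attributes, so mixing modes by hand is possible even without an automated tool. The sketch below is an assumption about how that could look. The macro and function names are hypothetical and not taken from the port, and the macros expand to nothing on non-ARM compilers so the file still builds elsewhere.

```c
#include <stdint.h>

/* Per-function instruction set selection (GCC, 32-bit ARM targets only).
 * noinline keeps each function as a separate unit so it can be timed
 * on its own. */
#if defined(__GNUC__) && !defined(__clang__) && defined(__arm__)
#define ARM_CODE   __attribute__((target("arm"), noinline))
#define THUMB_CODE __attribute__((target("thumb"), noinline))
#else
#define ARM_CODE
#define THUMB_CODE
#endif

/* Hypothetical hot loop: kept in ARM mode for speed. */
ARM_CODE int32_t dot_fixed(const int16_t *a, const int16_t *b, int n)
{
    int32_t sum = 0;
    for (int i = 0; i < n; i++)
        sum += (int32_t)a[i] * b[i];
    return sum;
}

/* Hypothetical cold helper: compiled as Thumb to save space. */
THUMB_CODE int count_below(const int16_t *v, int n, int16_t limit)
{
    int count = 0;
    for (int i = 0; i < n; i++)
        if (v[i] < limit)
            count++;
    return count;
}
```

This does not solve the measurement problem: each candidate function would still have to be timed in both modes, and `noinline` itself changes what the optimizer does.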