On profiling the normal gameplay loop, it was discovered that several pause menu elements had startlingly high performance costs. This PR optimizes them down to reasonable levels.
The bigger offender was the function _get_joypad_buttons in rebind_option.gd, which was profiled as using 11(!) ms of CPU time in each frame. This function returns an array of arrays of button names, indexed by logical button index and by controller brand. On inspection, it was found that this function was called once for every binding of every action, every frame, and doing the same 25 string translations every time (which suggests 25 disk reads). Assuming one binding for every action--which I know is lowballing it--that's 25 * 25 = 625 total translations per frame, literally 98% of which are repeated work.
Optimizing the script thus becomes a matter of reusing work. This can be done:
within _get_joypad_buttons. Because some buttons share names between brands, some translated strings can be cached before constructing the return array, which skips 15 extra translations per function call. This step alone reduced CPU time from 11ms to 4ms.
between actions and bindings. The result of calling _get_joypad_buttons doesn't depend on which action or mapping it's used for--only the selected locale. By storing the result in a static variable, it can be reused between instances, and the disk reads(?) involved in translation lookup can be skipped for all but the first user.
between frames. Locale really only changes when the user sets it in the options menu, so for 99% of frames, _get_joypad_buttons returns the same result as last frame. rebind_option already has machinery to receive the engine's locale-change notification, so it's fairly trivial to only set _get_joypad_buttons when the game starts or the locale changes. Combined with the previous step, this reduced CPU time from 4ms to ~0.3ms, about level with Entity._physics_process and DejitterGroup._physics_process.
Actually, the majority of rebind_option's state doesn't change between frames. By adding (and connecting to) a signal to the reset-mappings button, it becomes possible to completely remove _process and instead update rebind options reactively. Naturally, this means rebind_option is no longer charting at all on most profiler frames.
Some similar string-caching-via-statics work was also done to get_joypad_motion_name. This is mostly for methodological consistency, but if somebody were to make very strange mapping choices ("wassup YouTube, today I'm gonna be playing SM63R using just the left and right analog stick!"), I could see it saving actual performance.
I also chose to optimize level_info.gd. This was much less of a problem script, but it still seemed appropriate to switch it from detecting locale via _process to detecting via engine notification, especially since rebind_option had shown me how to do that. This saves about 0.15ms--not much, but still seemed excessive for a static menu element.
Some mild refactors have also been sprinkled in, including obligatory reordering functions to standard and commenting everything I touch.
Description of changes
On profiling the normal gameplay loop, it was discovered that several pause menu elements had startlingly high performance costs. This PR optimizes them down to reasonable levels.
The bigger offender was the function
_get_joypad_buttons
inrebind_option.gd
, which was profiled as using 11(!) ms of CPU time in each frame. This function returns an array of arrays of button names, indexed by logical button index and by controller brand. On inspection, it was found that this function was called once for every binding of every action, every frame, and doing the same 25 string translations every time (which suggests 25 disk reads). Assuming one binding for every action--which I know is lowballing it--that's25 * 25 = 625
total translations per frame, literally 98% of which are repeated work.Optimizing the script thus becomes a matter of reusing work. This can be done:
_get_joypad_buttons
. Because some buttons share names between brands, some translated strings can be cached before constructing the return array, which skips 15 extra translations per function call. This step alone reduced CPU time from 11ms to 4ms._get_joypad_buttons
doesn't depend on which action or mapping it's used for--only the selected locale. By storing the result in a static variable, it can be reused between instances, and the disk reads(?) involved in translation lookup can be skipped for all but the first user._get_joypad_buttons
returns the same result as last frame.rebind_option
already has machinery to receive the engine's locale-change notification, so it's fairly trivial to only set_get_joypad_buttons
when the game starts or the locale changes. Combined with the previous step, this reduced CPU time from 4ms to ~0.3ms, about level withEntity._physics_process
andDejitterGroup._physics_process
.rebind_option
's state doesn't change between frames. By adding (and connecting to) a signal to the reset-mappings button, it becomes possible to completely remove_process
and instead update rebind options reactively. Naturally, this meansrebind_option
is no longer charting at all on most profiler frames.Some similar string-caching-via-statics work was also done to
get_joypad_motion_name
. This is mostly for methodological consistency, but if somebody were to make very strange mapping choices ("wassup YouTube, today I'm gonna be playing SM63R using just the left and right analog stick!"), I could see it saving actual performance.I also chose to optimize
level_info.gd
. This was much less of a problem script, but it still seemed appropriate to switch it from detecting locale via_process
to detecting via engine notification, especially sincerebind_option
had shown me how to do that. This saves about 0.15ms--not much, but still seemed excessive for a static menu element.Some mild refactors have also been sprinkled in, including obligatory reordering functions to standard and commenting everything I touch.