libretro / gpsp

gpSP for libretro.
GNU General Public License v2.0

Torus Games's GBA games have a noticeable slowdown #234

Open freq-mod opened 6 months ago

freq-mod commented 6 months ago

This affects Doom (II) and Duke Nukem Advance. I haven't tested other Torus games yet.

While the other 3D games (VRally 3, the GBA NFS demakes and such) run perfectly on weak devices, even ones like the Miyoo (ARM926EJ-S), Doom 2 and DNA have significant speed issues, to the point where these games require frameskip. That happens even on mobile devices with 4 Cortex-A7 cores and a Mali GPU. Stronger devices run these games fine, but considering that other 3D GBA games run perfectly fine on weak devices, one would expect Torus' software to run as well, which makes me suspect it's some kind of emulator issue.

Sorry if this is invalid. The same occurs on mGBA, but to a greater degree.

andymcca commented 6 months ago

These particular two games have well-known performance issues in all gpsp forks. TempGBA / ReGBA by Nebuleon attempted to address it by implementing partial dynarec cache flushing, with some success. I've started attempting to port these changes to lr-gpsp in my own fork, but it's all quite time-consuming and I've been busy with other things recently!

I've spoken with David about it in the past as well. If there's any update in the future I'll post it here.

freq-mod commented 6 months ago

I see. Take your time, I'll look forward to having this implemented one day.

andymcca commented 5 months ago

Did a few tests with this over the weekend - we already know that the issue with Doom 2 / Duke Nukem and the Dynarec is that the self-modifying code causes a high volume of translation cache flushing per frame, usually when in ARM mode. So I did a little experiment - added a check in the code before each frame to see whether we are in ARM or Thumb mode, and switch to Interpreter mode for ARM or Dynarec mode for Thumb.
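For illustration only, the per-frame check could look roughly like the sketch below. This is not the actual change from the fork: `exec_mode_t` and `pick_exec_mode` are hypothetical names, and the only hard fact relied on is that bit 5 of the ARM CPSR is the Thumb state (T) flag.

```c
#include <stdint.h>

/* Bit 5 of the ARM CPSR is the T flag: set in Thumb state, clear in ARM state. */
#define CPSR_THUMB_BIT (1u << 5)

/* Hypothetical enum, not a gpsp type: which backend runs the next frame. */
typedef enum {
    EXEC_INTERPRETER,
    EXEC_DYNAREC
} exec_mode_t;

/* Hypothetical helper: pick the backend for the coming frame from the
 * CPU state at the frame boundary. ARM state (where the self-modifying
 * code tends to run) goes to the interpreter, Thumb state to the dynarec. */
static exec_mode_t pick_exec_mode(uint32_t cpsr)
{
    return (cpsr & CPSR_THUMB_BIT) ? EXEC_DYNAREC : EXEC_INTERPRETER;
}
```

The decision is made once per frame, so a frame that starts in ARM state but spends most of its time in Thumb code would still be interpreted in full, which is one plausible reason for the slowdown seen in other games.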

This gave a big performance increase (>10fps) in both games on my ARM32 device! However, it also caused a slight performance decrease in other games and, worse, instability (e.g. Top Gear Rally crashes within 5 seconds of starting a race with this change).

Discussed with David - the instability with this method is because ideally we should always initialise the caches prior to switching into Dynarec mode. The problem with doing that in my experiment is it will likely negate the performance advantage!

But the experiment shows there are gains to be made with this approach (i.e. switching between Interpreter/Dynarec) if we can make the switch more targeted e.g. David said we could possibly specifically interpret certain code blocks which are prone to change via SMC. So it's an interesting area to develop, with the downside being that it's only likely to yield a performance increase in very few games from the GBA library.
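One way the more targeted idea could be sketched: count how often each translated block gets invalidated by a write, and send a block to the interpreter once it crosses a threshold. Everything here is an assumption for illustration (the `smc_*` names, the table size, the threshold of 4, the direct-mapped table); none of it is gpsp or ReGBA code.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define SMC_TABLE_SIZE 256u  /* hypothetical size, must be a power of two */
#define SMC_THRESHOLD  4u    /* hypothetical: invalidations before giving up on translation */

typedef struct {
    uint32_t pc;    /* start address of the tracked block */
    uint16_t hits;  /* times this block was invalidated by a write */
    bool     used;
} smc_entry_t;

typedef struct {
    smc_entry_t slots[SMC_TABLE_SIZE];
} smc_tracker_t;

static void smc_tracker_reset(smc_tracker_t *t)
{
    memset(t, 0, sizeof *t);
}

/* Direct-mapped index: drop bit 0 (Thumb code is 2-byte aligned) and mask. */
static uint32_t smc_slot_index(uint32_t pc)
{
    return (pc >> 1) & (SMC_TABLE_SIZE - 1u);
}

/* Call when a write invalidates the translated block starting at pc. */
static void smc_note_invalidation(smc_tracker_t *t, uint32_t pc)
{
    smc_entry_t *e = &t->slots[smc_slot_index(pc)];
    if (!e->used || e->pc != pc) {
        /* Empty slot or collision: start tracking this block from zero. */
        e->pc = pc;
        e->hits = 0;
        e->used = true;
    }
    if (e->hits < UINT16_MAX)
        e->hits++;
}

/* True once a block has been invalidated often enough that interpreting
 * it is assumed to be cheaper than retranslating it again. */
static bool smc_should_interpret(const smc_tracker_t *t, uint32_t pc)
{
    const smc_entry_t *e = &t->slots[smc_slot_index(pc)];
    return e->used && e->pc == pc && e->hits >= SMC_THRESHOLD;
}
```

In such a scheme the block lookup would call `smc_should_interpret` before translating, so only the blocks that keep getting rewritten pay the interpreter cost. It does not address the cache initialisation concern raised above.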