batocera-linux / batocera.linux

Flycast performance regressions #7214

Closed Hew-ux closed 1 year ago

Hew-ux commented 1 year ago

Issue: Flycast performance has been dropping across recent versions of Batocera (v33, v34, v35). This is global, affecting both x86_64 and RPi 4, and users have been noticing it too. The game I've noticed this on the most is Bangai-O, which used to run at full speed but now slows down during gameplay and sometimes freezes completely.

Reportedly, Crazy Taxi used to run at full speed on the Pi 4 with all default settings. Now it has sporadic slowdowns, and even after lowering other settings like resolution and decorations, it still runs slow at times.

This one is hard to reproduce as you need intentionally weak hardware. On x86_64, hardware varies too much, so it's probably not worth testing there. But the Pi 4 has consistent specs, and you can (reportedly) roll back to v31 and get full performance there.
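To make "full speed" versus "slows down" comparable between two builds, one option is to summarise per-frame timings. The following is a minimal sketch only: the 60 fps target, the `summarize` helper, and the sample numbers are all assumptions for illustration, and nothing here is measured data from this issue.

```python
from statistics import mean

FRAME_BUDGET_MS = 1000 / 60  # assumes a 60 fps target


def summarize(frame_times_ms):
    """Return (average fps, fraction of frames that missed the budget)."""
    avg_fps = 1000 / mean(frame_times_ms)
    missed = sum(1 for t in frame_times_ms if t > FRAME_BUDGET_MS)
    return avg_fps, missed / len(frame_times_ms)


# Made-up samples for illustration only, not measurements from this issue.
steady = [16.0] * 8
stutter = [16.0] * 6 + [40.0, 40.0]

print(summarize(steady))
print(summarize(stutter))
```

The fraction of missed frames is often more telling than the average, since sporadic slowdowns can hide behind a healthy-looking mean.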

Possibly, this could just be a result of the emulator itself becoming more accurate and demanding. It seems a bit extreme to lose this much performance, though.

Possibly related to this issue where the GPU is not properly being utilised: https://github.com/batocera-linux/batocera.linux/issues/6623

dmanlfc commented 1 year ago

> Possibly, this could just be a result of the emulator itself becoming more accurate and demanding. It seems a bit extreme to lose this much performance, though.

likely this, you should report it upstream & see what flyinghead has to say. there is nothing compilation wise we can really optimise.

you could take the old v31 flycast.mk file & compile it into v36 to verify if performance is consistent.

Snorgorama commented 1 year ago

Just to add to this thread, I’m also seeing the same performance issues with flycast on rpi4 (4GB) v35. Bangai-O is completely unplayable and all titles seem to be affected anywhere from very slightly to drastically.

My previous install of v33 did not have these issues (although OP is probably correct that performance dropped from v32 to v33, I just can’t confirm that).

Hew-ux commented 1 year ago

@dmanlfc I tried doing just as you suggested and saw an instant improvement to speed. Bangai-O runs at full speed on my "weak" PC and Pi 4.

For x86_64, I grabbed the libretro core for flycast from Batocera v32 and put it into /usr/lib/libretro, overwriting the old file: flycast_libretro.zip
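The swap described above can be scripted so that the installed core is backed up first. This is a minimal sketch, not a Batocera tool: the core filename `flycast_libretro.so` follows the usual libretro naming and is an assumption here, as is the helper name `swap_core`.

```python
import shutil
from pathlib import Path


def swap_core(replacement: Path, core_dir: Path,
              core_name: str = "flycast_libretro.so") -> Path:
    """Back up the installed core, then copy the replacement over it.

    Returns the path of the backup file.
    """
    target = core_dir / core_name
    backup = core_dir / (core_name + ".bak")
    if not target.exists():
        raise FileNotFoundError(f"no installed core at {target}")
    # Keep the first backup so repeated runs never overwrite the original core.
    if not backup.exists():
        shutil.copy2(target, backup)
    shutil.copy2(replacement, target)
    return backup
```

A call might look like `swap_core(Path("/userdata/flycast_libretro.so"), Path("/usr/lib/libretro"))`, where the first path is a hypothetical location for the extracted older core. On Batocera, changes outside `/userdata` are normally lost on reboot unless the overlay is saved (I believe `batocera-save-overlay` is the command for that).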

Although it doesn't matter as much on PC, we should seriously consider using the older version for the Pi 4 build, as the performance difference is night and day. I severely underestimated how powerful the Pi 4 was because of this.

dmanlfc commented 1 year ago

> @dmanlfc I tried doing just as you suggested and saw an instant improvement to speed. Bangai-O runs at full speed on my "weak" PC and Pi 4.

OK that tells me 3 things:

  1. The older mk file uses an older version of the code, therefore the performance issue could still be with flycast - did you speak to flyinghead?
  2. The older version isn't impacted by the newer v36 infrastructure so it's not batocera's supporting OS packages.
  3. The mk changed from make to cmake which may have lost some performance tuning, we could investigate...

@Hew-ux can you run up another linux distro like manjaro on the pi4 with their packaged version of flycast & test? THAT will tell me to look at 3 or you need to do 1.

Hew-ux commented 1 year ago

I installed Manjaro on the Pi. Went to the package manager to download retroarch (there's no "packaged version of flycast"), and then it crashed. Pretty cool. I'll try again from scratch.

In the meantime, I'm going to make a PR to use the old version of Flycast. It will at least tide over users until we can figure out what's causing this performance regression. I think a good way to do it would be to have libretro-flycast use the old version, and have standalone be the current one. That makes the most sense to me, and is how other libretro + standalone core situations handle it anyway.

Edit: Unfortunately that didn't work. The older libretro-flycast, built using the makefile from v32, does not read any save files, rendering it useless. But I also tried compiling the standalone: rather than using the v32 makefile, the only thing I changed was the version number from v2.0 to v1.0. It still has the slowdown, which suggests that it is the makefile itself that is causing it. I'll try compiling the standalone with the old makefile as well, just to confirm.

Hew-ux commented 1 year ago

Alright, building the standalone with the old makefile also has the slowdown, indicating external reasons. Now I don't know what to think, unless my build process somehow just made Flycast v2.0 again.

urmst commented 1 year ago

Just wanted to speak to this issue. I'm using a Pi 400 with 'insane' overclock at 720p in es.resolution.max. I have no major issue with Dreamcast on Batocera 35, but there are caveats. Firstly, I set the emulator to run at 640x480 render resolution and turn off bezels. I imagine most people do this. Secondly, I set the video mode to 640x480 also. This makes a massive difference in performance imho, with either Flycast option. Bangai-O runs fine, as do many others; even Dead or Alive is impressive. There are other options, such as light frame skipping, that can help, as well as ensuring shaders are disabled etc.

There are outliers. Daytona USA for example still struggles a bit. But for the most part performance seems better to me than previous releases. A good example is Tennis 2k2. On older releases you can see the intro graphics warping and this happens in game too when there is z buffering. In the new versions this has gone, which to me speaks to the fact that Flycast must have seen accuracy improvements in recent development. A system like the Pi will find this a little harder, hence the concessions to be made when emulating.

Personally I find this a better trade-off than reverting to an older version with graphical issues as seen on games like Tennis 2k2, and while I hope Flycast does undergo some optimisation to improve the RPi 4 setup, as it is I would rather Batocera didn't go backwards to the older releases. Just my 2c as a user.

Hew-ux commented 1 year ago

@urmst With custom settings, yes, this can be worked around. I'm talking explicitly about the default settings: performance with them has dropped since v32.

This is an issue on lower-end x86_64 PCs as well, and I have a test unit specifically for that. I assume most x86_64 users are indifferent to it because their machines are powerful enough to power through it.

Optimally, we would want to discover the issue and fix it at its core. But it doesn't make sense to include a core on a particular SBC build if it does not run games well. Batocera has a policy of only including the cores that "matter", meaning the ones that run well (this used to be a hard requirement in the past, but has been loosened up). If the cause cannot be discovered, a rollback for those SBCs would make the most sense, without removing the whole emulator.

There is hope though: some additional modifications have been made to the makefile, and the repository has been updated with them (thanks to dman). I'll be compiling a fresh build of this in order to test it and see what comes of it.

dmanlfc commented 1 year ago

Tagged - cannot fix - this needs to be resolved upstream from the flycast devs.

As per Discord - I built flycast on my pi4 / manjaro OS - same problem with Bangai-O.

For the RPI4 & x86 you have redream as an alternative.

direngrey31 commented 1 year ago

@dmanlfc, hello, I ran tests on Batocera v35 and dev36 with a Pi 4 for Dreamcast and Naomi, using the libretro flycast core from v33! The tests are conclusive: it's super fluid.

dmanlfc commented 1 year ago

@direngrey31 use the energy to advise the developers of flycast, not here. if we go backwards we lose naomi2 support & some compatibility.

direngrey31 commented 1 year ago

@dmanlfc

Hello, it was just a suggestion! You would not lose anything in the exchange, since the Naomi 2 system does not work on the Raspberry Pi 4 anyway. It is better to favor Dreamcast support 🙂

Snorgorama commented 1 year ago

Recognizing that the performance regression appears to come from upstream: is keeping an older version of the flycast core on the rpi builds a possibility, at least until it is addressed upstream (assuming it ever is)? Many systems already have multiple emulator options, so this wouldn’t exactly be out of the ordinary.

While redream is an option, it is far from ideal, particularly as I had a ton of remaps and tweaks made in flycast on a lot of titles. Additionally, I’ve run into some titles that are not running well in redream. In short - more emulator choices is better! :)

wn2000 commented 1 year ago

I noticed the performance regression as well. And I want to add that the old .mk file in batocera actually pointed to a different repo than it does now:

Old mk: https://github.com/libretro/flycast
Current mk: https://github.com/flyinghead/flycast

I see in the libretro repo it was @inactive123 backporting stuff from the upstream from time to time. But it seems the repo has been "inactive" for quite some time.
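One way to check which repository a given .mk file builds from is to read its `_SITE` line. The sketch below assumes the file uses Buildroot's `github` helper; the variable name `LIBRETRO_FLYCAST_SITE` and the sample lines are illustrative and not copied from the Batocera tree, and `upstream_repo` is a hypothetical helper.

```python
import re
from typing import Optional, Tuple

# Buildroot packages usually declare their source with the `github` helper:
#   FOO_SITE = $(call github,<owner>,<repo>,$(FOO_VERSION))
GITHUB_SITE = re.compile(r"_SITE\s*=\s*\$\(call github,([^,\s]+),([^,\s]+),")


def upstream_repo(mk_text: str) -> Optional[Tuple[str, str]]:
    """Return (owner, repo) from a Buildroot .mk file, or None if not found."""
    match = GITHUB_SITE.search(mk_text)
    return (match.group(1), match.group(2)) if match else None


# Illustrative file contents, not copied from the Batocera tree.
old_mk = "LIBRETRO_FLYCAST_SITE = $(call github,libretro,flycast,$(LIBRETRO_FLYCAST_VERSION))"
new_mk = "LIBRETRO_FLYCAST_SITE = $(call github,flyinghead,flycast,$(LIBRETRO_FLYCAST_VERSION))"

print(upstream_repo(old_mk))  # ('libretro', 'flycast')
print(upstream_repo(new_mk))  # ('flyinghead', 'flycast')
```

Comparing the two results makes it clear that a "v32 versus current" test changes the upstream code base as well as the build recipe.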

Hew-ux commented 1 year ago

To keep everyone in the loop, the performance regression was reported to flycast directly here: https://github.com/flyinghead/flycast/issues/818

Flycast dev is aware and has submitted a commit which may address it. So hopefully, it will just be a matter of the package being bumped in Batocera, if that was the true cause of the performance regression. Edit 8/12/22: Unfortunately, that was not the issue, so it's still under investigation.

Diegolgo commented 1 year ago

> @dmanlfc I tried doing just as you suggested and saw an instant improvement to speed. Bangai-O runs at full speed on my "weak" PC and Pi 4.
>
> For x86_64; I grabbed the libretro core for flycast from Batocera v32 and put it into /usr/lib/libretro, overwriting the old file: flycast_libretro.zip
>
> Although it doesn't matter as much on PC, we should seriously consider using the older version for the Pi 4 build, as the performance difference is night and day. I severely underestimated how powerful the Pi 4 was because of this.

This was the solution for me. My C2D E8400 now runs almost as well as it did on 5.29 (not quite the same, but almost...), and now with v36 supporting lightguns... ;) Thanks!