libretro / snes9x2002

Snes9x 2002. Port of SNES9x 1.39 for libretro (was previously called PocketSNES). Heavily optimized for ARM.
37 stars 46 forks source link

Broken on armv7 (rpi2) when enabling ASM_CPU=1 ASM_SPC700=1 #13

Closed joolswills closed 8 years ago

joolswills commented 8 years ago

Recently changes were made to make pocketsnes work on non arm platforms but it looks as through some issues have been introduced with arm.

When enabling ASM_CPU=1 ASM_SPC700=1 on top of ARM_ASM=1 games seems to freeze or have no video. With super metroid it gives a black screen with some audio then some background colours etc.. Super Mario world sometimes shows the "nintendo presents" and then nothing. The sound is also not correct (another issue that needs bisecting)

It used to work on armv7 (the options were not needed as they were enabled by default afaik)

On armv6 (rpi1), video does work with ASM_CPU=1 ASM_SPC700=1 enabled at least on super metroid, but the audio is wrong which is another issue I will bisect separately (some revisions work on armv7, but also have the audio problem)

I am compiling with

make clean; make ARM_ASM=1 ASM_CPU=1 ASM_SPC700=1

It is difficult to bisect as there are some revisions where video is known not to be working, and others where there are build problems.

joolswills commented 8 years ago

I will attempt to provide more information and see if I can debug the issue

joolswills commented 8 years ago

I also see that many gcc flags have been added since 79f6be55360028a68b48d69d7a922da69eef9aa5 (which is before the recent work started).

It used to build with (linux on rpi)

g++ -I. -O3 -DHAVE_STRINGS_H -DHAVE_STDINT_H -DHAVE_INTTYPES_H -D__LIBRETRO__ -DINLINE=inline -DUSE_SA1 -fomit-frame-pointer -Wall -W -Wno-unused-parameter -Wno-parentheses -Wno-write-strings -Wno-comment -DNDEBUG=1 -fPIC

and now builds with

cc -fno-builtin -fno-exceptions -ffunction-sections -fomit-frame-pointer -fgcse-sm -fgcse-las -fgcse-after-reload -fweb -fpeel-loops -O3 -DNDEBUG=1 -DASM_SPC700 -DASMCPU -DARM_ASM -DRIGHTSHIFT_IS_SAR -finline -fsigned-char -fomit-frame-pointer -ffast-math -fstrict-aliasing -ffast-math -funroll-loops -fomit-frame-pointer -DHAVE_STRINGS_H -DHAVE_STDINT_H -DHAVE_INTTYPES_H -D__LIBRETRO__ -DINLINE=inline -DUSE_SA1 -fomit-frame-pointer -Wall -W -Wno-unused-parameter -Wno-parentheses -Wno-write-strings -Wno-comment -Werror=implicit-function-declaration -fPIC -I./libretro -I./src -I.

not sure why we would suddenly need things like -ffast-math for example (and there are a lot of duplicated flags) unroll-loops also may not be an optimisation we want.

aliaspider commented 8 years ago

ASM_CPU and ASM_SPC700 are only kept there for experimentation, and in case someone ever wants to improve the current asm cpu and spc code. the code enabled by those flags was never actually active before. using those will reduce compatibility considerably and will not bring any performance gains, so leave them disabled by default, and just use ARM_ASM=1 which will enable all the stable arm optimizations there is in pocketsnes.

joolswills commented 8 years ago

oh right - I didn't realise that - I had reports of performance decrease on armv6 from a retropie user since the changes, and from a glance I thought the ARM_CPU and ARM_SPC700 used to be switched on. I tested and it seemed quicker with the old code on my rpi1 also with 79f6be55360028a68b48d69d7a922da69eef9aa5

aliaspider commented 8 years ago

maybe enable ARM_SPC700 only ? I think that one was used afterall, I remember sound effects being completely horrible before :P. the difference in performance last time I measured it was less than 5%, so I still think it isn't worth the huge drop in audio quality it introduces. ASM_CPU was definetly off.

joolswills commented 8 years ago

And regarding the flags ? -fweb is enabled by O3 anyway. as is -fgcse-after-reload (but not gcse-sm -fgcse-las). Also the additional of unroll-loops (which could slow things down on cpus with smaller caches) and the addition of fast-math - is there evidence these are needed or they used to be in some old makefile for another platform? (Often I see for devices like gp2x etc people using all sorts of gcc flags which are not needed - because they don't understand they are enabled by default but thought it would help or are plain guessing).

aliaspider commented 8 years ago

what flags ? didn't you remove them already in #12 ?

joolswills commented 8 years ago

I removed some. there are many more in the makefiles that I would question:

-fomit-frame-pointer is included in about 4 places (just cosmetic but one is enough - there is one at the end of Makefile.common that can be left in) -fgcse-sm -fgcse-las -fgcse-after-reload -fweb -fpeel-loops is used for "unix" target. -ffast-math included twice (I dont think it should be added unless there is noticeable improvement as it can cause problems in some cases) and -funroll-loops (I don't think this one should be included - we should leave it to GCC with -O3)

aliaspider commented 8 years ago

well I just used an existing makefile from a different libretro core which had those. the older makefile was archaic and didn't scale well when adding more targets. I don't have an arm dev board to test on, I mainly test the android and 3DS targets, if you think those flags are decreasing performance, just send a PR to remove them. the other flags like -funroll-loops were used in the original makefiles, I just copied everything in there during the refactors.

joolswills commented 8 years ago

will do

with ARM_CPU off and ARM_SPC700 on, video is good - but audio is broken/glitchy (contains whitenoise etc). But previously this sounded ok with the older code.

joolswills commented 8 years ago

will experiment with some flag changes etc.

aliaspider commented 8 years ago

hmm, I don't know then, the older code before the refactors only worked on arm devboards, so it is kinda impossible for me to isolate this one.

I was under the impression that audio was bad before, if it wasn't, then either it wasn't using the asm spc code and something else is causing the performance drop, or it was actually using it but something with it broke during refactors.

joolswills commented 8 years ago

I will bisect the audio issue to get more of an idea. Thanks for the help.

joolswills commented 8 years ago

As you mentioned - it wasn't using the asm spc code before - I bisected it to 39b406dfb5e4a5e64fd790ca6a3f73c5a54b03cb

I will do some more testing re performance. I'll close this. Thanks.

joolswills commented 8 years ago

I think one of the reasons I thought the sound was changed is the addition of the interpolation. not sure if I prefer it with or without - some audio sounds better with but some sounds clearer without. Is it optional ?

aliaspider commented 8 years ago

it can be optional, but I don't see why, things sound too awful without it, and it doesn't justify the performance gains. but yeah it can very well explain why older versions were running a bit faster than now. that settles the mistery then :) test the firing sound in megaman X for example, with and without interpolation. especially the concentrated fire, you'll see why I didn't bother with the core option. if this is really important for you, I can make it a build option if you want.

joolswills commented 8 years ago

agreed - after some more testing it's def better with interpolation.

I created a PR for the gcc flag changes, which I think are more optimal for when you have a variety of targets

aliaspider commented 8 years ago

thx, merged.