hrydgard / ppsspp

A PSP emulator for Android, Windows, Mac and Linux, written in C++. Want to contribute? Join us on Discord at https://discord.gg/5NJB6dD or just send pull requests / issues. For discussion use the forums at forums.ppsspp.org.
https://www.ppsspp.org

[OpenGL] Very slow effects [Grisaia no Kajitsu] #16307

Open Nabokov86 opened 1 year ago

Nabokov86 commented 1 year ago

Game or games this happens in

ULJM06233 - グリザイアの果実 -LE FRUIT DE LA GRISAIA

What area of the game

When the fade transition effect occurs, the game becomes very slow.

Speed seen in PPSSPP

4% (3/60)

GE frame capture and debug statistics

Screenshots: ULJM06233_00006, ULJM06233_00007. GE frame: ULJM06232_0001.zip

Platform

Android

Mobile phone model or graphics card

Mali-G31

PPSSPP version affected

v1.13.2-1705

Last working version

-

Graphics backend (3D API)

OpenGL / GLES

Any other notes or things you've tried

Runs very smoothly in Vulkan rendering mode.


sum2012 commented 1 year ago

Please create a frame dump https://github.com/hrydgard/ppsspp/wiki/How-to-create-a-frame-dump

Nabokov86 commented 1 year ago

GE frame: ULJM06232_0001.zip

@sum2012 I've already done that.

unknownbrackets commented 1 year ago

This scene draws a series of 246 horizontal strips to produce a faded effect. The right way to do this would probably have been using vertex colors (they already used modulate) and simple alpha blending, but alas.

The later draws use FIXED + FIXED blending. This should be translated to ONE + FIXED with a shader uniform... and at least for me, this is exactly what happens. It's definitely not efficient that it does 246 separate draw calls, each with a uniform update in between, but I'm surprised it's so significantly slow.
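As a rough illustration of why that translation is valid, here is a minimal sketch on the CPU. It assumes the blend equation `out = src * fixA + dst * fixB`. All names here (`Color`, `BlendFixedFixed`, `FragmentShader`, `BlendOneFixed`) are hypothetical and are not PPSSPP's actual code. The idea is that the first FIXED factor can be applied in the fragment shader from a uniform, which leaves the hardware blender with ONE as the source factor and a single constant color as the destination factor.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>

// Hypothetical types and helpers for illustration only; not PPSSPP's real code.
struct Color { float r, g, b; };

static Color Mul(Color a, Color b) { return {a.r * b.r, a.g * b.g, a.b * b.b}; }
static Color Add(Color a, Color b) { return {a.r + b.r, a.g + b.g, a.b + b.b}; }
static float Clamp01(float v) { return std::min(1.0f, std::max(0.0f, v)); }
static Color Clamp01(Color c) { return {Clamp01(c.r), Clamp01(c.g), Clamp01(c.b)}; }

// FIXED + FIXED as requested by the game: out = src * fixA + dst * fixB.
Color BlendFixedFixed(Color src, Color dst, Color fixA, Color fixB) {
    return Clamp01(Add(Mul(src, fixA), Mul(dst, fixB)));
}

// Step 1 of the translation: the fragment shader applies fixA from a uniform.
Color FragmentShader(Color src, Color uFixA) { return Mul(src, uFixA); }

// Step 2: the blender uses ONE for the source and one constant color for dst.
Color BlendOneFixed(Color shaded, Color dst, Color blendConstant) {
    return Clamp01(Add(shaded, Mul(dst, blendConstant)));
}

// Checks that both paths agree for one set of sample values.
void CheckEquivalence() {
    Color src{0.8f, 0.4f, 0.2f}, dst{0.1f, 0.5f, 0.9f};
    Color fixA{0.25f, 0.25f, 0.25f}, fixB{0.75f, 0.75f, 0.75f};
    Color a = BlendFixedFixed(src, dst, fixA, fixB);
    Color b = BlendOneFixed(FragmentShader(src, fixA), dst, fixB);
    assert(std::fabs(a.r - b.r) < 1e-6f);
    assert(std::fabs(a.g - b.g) < 1e-6f);
    assert(std::fabs(a.b - b.b) < 1e-6f);
}
```

In this model the uniform (`uFixA`) changes with each strip, which is consistent with the per-draw uniform updates described above.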

-[Unknown]

Nabokov86 commented 1 year ago

It's surprising that there is no slowdown with the Vulkan backend. It runs very smoothly at 100% speed.

(Screenshots: Vulkan, OpenGL)

hrydgard commented 1 year ago

If you load up the GE frame dump yourself as if you loaded a game, does the slowdown reproduce?

Nabokov86 commented 1 year ago

@hrydgard I can reproduce the slowdown when running the dump. But the result is a bit strange. I'm not sure.

For some reason the dump runs very slowly even with the Vulkan backend, and the speed is exactly the same as with OpenGL. It's weird, it shouldn't be like this.

The speed also differs from the game: the dump runs at about 16%, but the game runs at about 3-6%.

unknownbrackets commented 1 year ago

The frame dump has a different performance problem. Even on PC it is slow.

For some reason, it's marking the texture as "dirty" each time, so PPSSPP is recopying the 512x512x4 texture data for every single one of those 246 draws. That's 246 MB of copying per frame, and takes its toll. The texture data is identical so this is some kind of bug.
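The 246 MB figure can be checked with a few compile-time constants. This is only a sketch of the arithmetic: it assumes that the trailing 4 in 512x512x4 means 4 bytes per pixel, and the 60 frames per second rate is an assumption taken from the "3/60" in the report. The constant names are made up for illustration.

```cpp
#include <cstdint>

// Assumed texture layout: 512 x 512 pixels at 4 bytes per pixel.
constexpr std::uint64_t kWidth = 512;
constexpr std::uint64_t kHeight = 512;
constexpr std::uint64_t kBytesPerPixel = 4;
constexpr std::uint64_t kDrawsPerFrame = 246;
constexpr std::uint64_t kMiB = 1024 * 1024;

// One copy of the texture is exactly 1 MiB.
constexpr std::uint64_t kBytesPerTexture = kWidth * kHeight * kBytesPerPixel;
static_assert(kBytesPerTexture == kMiB, "512 * 512 * 4 bytes is 1 MiB");

// Recopying it for every draw gives 246 MiB per frame.
constexpr std::uint64_t kBytesPerFrame = kBytesPerTexture * kDrawsPerFrame;
static_assert(kBytesPerFrame == 246 * kMiB, "246 copies of 1 MiB");

// Assumption: a full-speed target of 60 frames per second.
constexpr std::uint64_t kMiBPerSecondAt60 = kBytesPerFrame / kMiB * 60;
static_assert(kMiBPerSecondAt60 == 14760, "about 14.4 GiB/s at full speed");
```

Under those assumptions, sustaining full speed would mean copying roughly 14.4 GiB of identical texture data every second.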

In any case, the fact that the dump is faster (16%) than the game may indicate that something isn't being captured by the frame dump, and that this is what makes the actual scene slower...

-[Unknown]

unknownbrackets commented 1 year ago

After #16321 is merged, try this frame dump: #16307_ULJM06232_grisaia_slow_edit.zip

I suspect this one will be fast for you in both OpenGL and Vulkan. If that's the case, this might be a performance issue the frame dump didn't capture - there might be a readback or other operation, perhaps? Or something about how it's stalling.

-[Unknown]

Nabokov86 commented 1 year ago

> After #16321 is merged, try this frame dump: #16307_ULJM06232_grisaia_slow_edit.zip

@unknownbrackets No changes. Speed is still about 16%.

But I'm pretty sure the dump didn't capture the issue and it's slow because of something else. It is definitely not right that the speed with OpenGL and Vulkan is exactly the same.

Nabokov86 commented 1 year ago

In case you want to test it in-game, here are the save files. Grisaia savedata (slow effects).zip

Press here to load save data. ULJM06233_00018

Then press the circle button to play the transition effect.

Nabokov86 commented 1 year ago

By the way, I completely forgot to mention that I also tested on AMD graphics and did not notice any significant slowdown.

So maybe the problem is specific to Android, or maybe the desktop GPU is powerful enough to handle it. Not sure.

unknownbrackets commented 1 year ago

> @unknownbrackets No changes. Speed is still about 16%.

Strange. The edited frame dump runs at 1800% (GLES) and 2000% (Vulkan) speed on my phone, though it's Adreno not Mali. That's with the latest git build.

Unfortunately I don't have this particular game. But the fact that it's fine in Vulkan (outside the frame dump) makes it hard to think of any cause other than graphics rendering or the graphics driver. Nothing that takes that long is showing up in the debug statistics either.

-[Unknown]

Nabokov86 commented 1 year ago

@unknownbrackets I'm really sorry for taking up your time. I think I initially tested the version before these changes. I'm so sorry 🙇‍♂️

I can confirm that the edited dump now runs without slowdown: 350% with OpenGL and 500% with Vulkan. The original dump is still slow.

Nabokov86 commented 1 year ago

Just for comparison, the game runs at about 1000% and drops to 3-4% when the transition effects occur. The original dump runs at 16%. The edited dump runs at 350% speed.

hrydgard commented 1 year ago

Does the setting "Skip GPU Readbacks" change performance (or change the graphical results)?

Nabokov86 commented 1 year ago

@hrydgard No change in performance, graphics look correct.

unknownbrackets commented 1 year ago

I guess something different must be happening in the game than in the frame dump, perhaps with stalls/flushing. Maybe the GL driver is building a new pipeline for each constant FIXED value from the shader uniform? Not sure why it wouldn't replicate in the frame dump if it was that simple, though...

-[Unknown]

Nabokov86 commented 1 year ago

I have now tested it with software rendering (OpenGL). There is no slowdown.

The game runs at 50% speed (90% if you hide the text window). During the transition effect the speed increases to 95%, because the text disappears.