Heavy WebGL fragment shaders cause major frame drops in the home scene

MozillaReality / FirefoxReality

INACTIVE - A fast and secure browser for standalone virtual-reality and augmented-reality headsets.

https://mzl.la/reality

Mozilla Public License 2.0


Heavy WebGL fragment shaders cause major frame drops in the home scene #1388

Closed MortimerGoro closed 3 months ago

MortimerGoro commented 5 years ago

STR:

Open a WebGL sample with a heavy fragment shader: https://www.shadertoy.com/view/lt2Gzt
Maximize the window size or make it fullscreen
Result: major frame drops in the home scene. This totally blocks usability and makes it almost impossible to go to another URL.

Oculus Browser has the same problem on both Oculus Go and Quest. It seems we are hitting a max GPU bus threshold, and that affects the Timewarp.

philip-lamb commented 5 years ago

@fernandojsg @takahirox @jdashg Are there any strategic plans for our WebGL support that might help address the issue of fragment shaders maxing out the GPU bus?

kdashg commented 5 years ago

The main issue here is probably "Layers' framerate is capped at the framerate of webgl frames": https://bugzilla.mozilla.org/show_bug.cgi?id=1565360

There's not a lot we can really do to differentiate legit vs. malicious high-demand content. We throttle timers/rAF for background tabs already.

Trying to implement quotas for various GPU resources just isn't really viable.

Decoupling UI framerate from WebGL framerate will do a lot to make this problem better, though. Unfortunately, I think it's tricky to implement.

bluemarvin commented 5 years ago

In FxR the UI is decoupled from the Gecko compositor. So even if the compositor is running at 30Hz, the FxR UI will still run at 72Hz. FxR has its own render loop that runs on a separate thread from the Gecko main thread. We also have a UI thread (which is standard Android). From testing, this issue really seems to be some limit being exceeded that puts the GPU in a bad state.
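To make the 30Hz/72Hz relationship concrete, here is a minimal sketch of the decoupling described above. It is only an arithmetic model, not FxR or Gecko code: the function names are hypothetical, and it assumes the UI loop simply redraws the newest compositor frame available at each of its own ticks.

```typescript
// Illustrative model only: frame indices are computed arithmetically,
// no real threads or compositor are involved. All names are hypothetical.

// Index of the newest compositor frame available at a given UI tick.
function latestCompositorFrame(uiTick: number, uiHz: number, compositorHz: number): number {
  return Math.floor((uiTick * compositorHz) / uiHz);
}

// Count the UI frames drawn and the distinct content frames shown
// over a whole number of seconds.
function simulate(
  uiHz: number,
  compositorHz: number,
  seconds: number
): { uiFrames: number; contentFrames: number } {
  const uiFrames = uiHz * seconds;
  const seen = new Set<number>();
  for (let tick = 0; tick < uiFrames; tick++) {
    seen.add(latestCompositorFrame(tick, uiHz, compositorHz));
  }
  return { uiFrames, contentFrames: seen.size };
}

const result = simulate(72, 30, 1);
console.log(result.uiFrames, result.contentFrames);
```

In this model the UI draws 72 frames in one second while showing 30 distinct content frames, so a slow compositor lowers content freshness without lowering the UI rate. The model says nothing about GPU contention, which is what this issue appears to be about.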

kdashg commented 5 years ago

I don't know if there's much that can be done. The only thing we can do is throttle but we can't tell when it's good or bad content.

bluemarvin commented 5 years ago

Would it be possible to decrease the canvas resolution? Perhaps even capping it?

kdashg commented 5 years ago

That may happen to help this case but not the general case.

kdashg commented 5 years ago

We can't really throttle the output size of the app without cooperation from the app, which if we had, would not be a problem here. :)

kdashg commented 5 years ago

I expect apps to respond poorly if we change their backbuffer size without an app-initiated resize.

bluemarvin commented 5 years ago

Even if it keeps the same aspect ratio? I realize there is no good solution. I'm just trying to find any solutions that might help mitigate the problem.
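As a rough sketch of what an aspect-preserving cap could look like, the function below shrinks a requested canvas size to fit a pixel budget. The function name and the budget value are assumptions for illustration; nothing like this is confirmed to exist in Gecko or FxR.

```typescript
interface Size {
  width: number;
  height: number;
}

// Hypothetical helper: scale a requested size down so that
// width * height fits within maxPixels, keeping the aspect ratio.
function capToPixelBudget(requested: Size, maxPixels: number): Size {
  const pixels = requested.width * requested.height;
  if (pixels <= maxPixels) {
    return { ...requested }; // already within budget, leave untouched
  }
  // Scaling both sides by sqrt(budget / pixels) scales the area by budget / pixels.
  const scale = Math.sqrt(maxPixels / pixels);
  return {
    width: Math.max(1, Math.floor(requested.width * scale)),
    height: Math.max(1, Math.floor(requested.height * scale)),
  };
}

// Example: cap a 1920x1080 request to an assumed 1280x720 pixel budget.
const capped = capToPixelBudget({ width: 1920, height: 1080 }, 1280 * 720);
console.log(capped.width, capped.height);
```

Fragment shader cost scales roughly with the number of pixels drawn, so a cap like this reduces load for fill-bound content only. It does nothing for workloads that are heavy in other ways.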

kdashg commented 5 years ago

Some content will handle it but I don't expect most will. It would have to be written with the idea that the resulting size might not match the requested size, and I think most content is not written with that in mind. Maybe it's ok to break them, I don't know.

Generally speaking, running heavy GPU workloads will do harsh things to the whole system, just like running heavy CPU workloads. This problem is more or less as old as computers. It's just a new way to encounter an old problem. I think this is more or less a Harsh Reality.

I think in particular thinking of this as "can we reduce the number of fragments" is only looking for a band-aid for this particular piece of content, but it won't work for other differently-heavy workloads. As such, I don't see resizing as a good general approach.
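For reference, content that tolerates a backbuffer smaller than it requested typically reads the actual size from the context rather than from `canvas.width` and `canvas.height`. The sketch below uses the standard `drawingBufferWidth`, `drawingBufferHeight`, and `viewport` members of a WebGL context; the `DrawingBufferLike` interface and the function name are stand-ins introduced here so the example runs outside a browser.

```typescript
// Minimal structural stand-in for the parts of a WebGL context used below.
interface DrawingBufferLike {
  readonly drawingBufferWidth: number;
  readonly drawingBufferHeight: number;
  viewport(x: number, y: number, width: number, height: number): void;
}

// Hypothetical helper: size the viewport (and any projection aspect ratio)
// from the actual drawing buffer, which may differ from the requested size.
function syncViewportToDrawingBuffer(
  gl: DrawingBufferLike
): { width: number; height: number; aspect: number } {
  const width = gl.drawingBufferWidth;
  const height = gl.drawingBufferHeight;
  gl.viewport(0, 0, width, height);
  return { width, height, aspect: width / height };
}
```

Content that hard-codes the requested size into its viewport, framebuffers, or shader uniforms is the kind that would be expected to break if the browser shrank the backbuffer on its own.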

bluemarvin commented 5 years ago

So, one last thought. Since jank in VR has a huge impact on user comfort, maybe we can detect when the VR render loop starts janking and pause the session? And also show a pop up saying the content is not optimized for the hardware or something?
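A minimal sketch of the detection half of this idea, assuming jank means "a sizeable share of recent frames missed their budget". The class name, the window size, the 1.5x slow-frame margin, and the 25% threshold are all assumptions chosen for illustration, not values taken from FxR.

```typescript
// Hypothetical jank detector: flags jank when too many of the most
// recent frame intervals exceed the frame budget by a margin.
class JankDetector {
  private readonly slowFrameMs: number;
  private readonly windowSize: number;
  private readonly missedRatioThreshold: number;
  private readonly frameTimesMs: number[] = [];
  private lastTimestampMs: number | null = null;

  constructor(targetHz: number, windowSize: number = 72, missedRatioThreshold: number = 0.25) {
    // Assumption: a frame counts as slow if it takes over 1.5x its budget.
    this.slowFrameMs = (1000 / targetHz) * 1.5;
    this.windowSize = windowSize;
    this.missedRatioThreshold = missedRatioThreshold;
  }

  // Call once per rendered frame with a monotonic timestamp in milliseconds.
  // Returns true when the recent window looks janky.
  recordFrame(timestampMs: number): boolean {
    if (this.lastTimestampMs !== null) {
      this.frameTimesMs.push(timestampMs - this.lastTimestampMs);
      if (this.frameTimesMs.length > this.windowSize) {
        this.frameTimesMs.shift(); // keep only the most recent intervals
      }
    }
    this.lastTimestampMs = timestampMs;
    return this.isJanking();
  }

  isJanking(): boolean {
    if (this.frameTimesMs.length < this.windowSize) {
      return false; // not enough history yet
    }
    const missed = this.frameTimesMs.filter((dt) => dt > this.slowFrameMs).length;
    return missed / this.frameTimesMs.length >= this.missedRatioThreshold;
  }
}
```

The render loop would call `recordFrame` every frame and, once it returns true, pause the offending session and show the warning. How the pause itself is triggered is outside this sketch.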

kdashg commented 5 years ago

You might just choose to lose the WebGL context in that case.
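From the content's side, a forced context loss arrives as the standard `webglcontextlost` event on the canvas, and a later `webglcontextrestored` event if the browser gives the context back. The sketch below shows how a page might handle both. The `CanvasLike` and `EventLike` interfaces and the two hook names are stand-ins introduced here so the example runs outside a browser.

```typescript
// Minimal structural stand-ins for the DOM pieces used below.
interface EventLike {
  preventDefault(): void;
}
interface CanvasLike {
  addEventListener(type: string, listener: (event: EventLike) => void): void;
}

// Hypothetical application hooks.
interface RenderHooks {
  stopRendering(): void;
  rebuildAndResume(): void;
}

function installContextLossHandlers(canvas: CanvasLike, hooks: RenderHooks): void {
  canvas.addEventListener("webglcontextlost", (event) => {
    // Calling preventDefault() tells the browser the page can handle restoration.
    event.preventDefault();
    hooks.stopRendering();
  });
  canvas.addEventListener("webglcontextrestored", () => {
    // All GPU resources (textures, buffers, programs) must be recreated here.
    hooks.rebuildAndResume();
  });
}
```

Pages can exercise this path during development with the `WEBGL_lose_context` extension, whose `loseContext()` and `restoreContext()` methods simulate both events. Content that never registers these handlers simply stops rendering once the context is lost.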

bluemarvin commented 5 years ago

We can pause the compositor for a given session from the render loop, so that might be the easiest thing to try first.

bluemarvin commented 5 years ago

Once we go multi-window, we will probably need a GeckoView API to know if WebGL is running in a session. We might be able to add an API to kill the GL context at the same time.

kdashg commented 5 years ago

For pausing, that should just be jank-based without respect to WebGL; it's just that WebGL is the likely cause. (But eventually it might be WebGPU too, and historically it could have been Canvas2D-on-SkiaGL, though not anymore.)

philip-lamb commented 5 years ago

@kearwood Do we have a convenient way to detect janking in the VR render loop and to signal that e.g. to our Android layer?

bluemarvin commented 5 years ago

Sure, we want to know when a session is doing GPU-related operations. WebRender might add an unanticipated wrinkle, so we should probably test that. But for the most part, our render thread has CPU priority, so while still possible, it is less likely for FxR to jank from session CPU usage.

MortimerGoro commented 5 years ago

> I expect apps to respond poorly if we change their backbuffer size without an app-initiated resize.

An alternative solution would be to reduce the Gecko window size. I think apps that tie canvas size to the window size usually handle the resize event. We could even reduce the window and scale it up in the FxR quad, so the window would have the same size for the user but with worse quality.

This may break CSS element sizes, though (e.g. the Enter VR button).
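The trade-off in this proposal can be worked through with some simple arithmetic. The sketch below is hypothetical: the function names, the example window size, and the 0.5 factor are assumptions, and it models the quad as a plain uniform upscale of the reduced window.

```typescript
interface Size {
  width: number;
  height: number;
}

interface DownscalePlan {
  windowSize: Size; // reduced Gecko window size
  quadScale: number; // upscale applied when drawing the window on the quad
  pixelFraction: number; // share of the original pixel count still rendered
}

// Hypothetical helper: shrink the window by `factor` and compute the
// upscale needed for the quad to keep the same apparent size.
function planWindowDownscale(original: Size, factor: number): DownscalePlan {
  if (!(factor > 0 && factor <= 1)) {
    throw new RangeError("factor must be in (0, 1]");
  }
  const windowSize: Size = {
    width: Math.max(1, Math.round(original.width * factor)),
    height: Math.max(1, Math.round(original.height * factor)),
  };
  return {
    windowSize,
    quadScale: original.width / windowSize.width,
    pixelFraction:
      (windowSize.width * windowSize.height) / (original.width * original.height),
  };
}

// Apparent size of an element with a fixed CSS size after the quad upscale,
// measured in units of the original window's pixels.
function apparentSize(cssPx: number, quadScale: number): number {
  return cssPx * quadScale;
}

const plan = planWindowDownscale({ width: 1280, height: 720 }, 0.5);
console.log(plan.windowSize, plan.quadScale, plan.pixelFraction, apparentSize(100, plan.quadScale));
```

With these example numbers, halving the window to 640x360 renders a quarter of the pixels, but an element with a fixed 100px CSS size would appear as large as a 200px element did before. That is the CSS sizing breakage mentioned above.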

MortimerGoro commented 5 years ago

> @kearwood Do we have a convenient way to detect janking in the VR render loop and to signal that e.g. to our Android layer?

@philip-lamb This problem happens before entering WebVR, when just showing a WebGL scene in a window, so we also need to detect janking outside of the VR render loop.

philip-lamb commented 5 years ago

For now, we have exposed a user-facing performance monitor which detects janking and offers a plausible explanation and user option: #1401