gonetz / GLideN64

A new generation, open-source graphics plugin for N64 emulators.

[QUESTION] How does the performance hack called "Disable Framebuffer" work? #2799

Closed Adamillo closed 8 months ago

Adamillo commented 9 months ago

I've been curious to know how this even works, since most GPUs require a framebuffer to be able to render anything at all.

ghost commented 8 months ago

Turns off render-to-texture emulation, which in the N64's case means emulating the game's manipulation of in-memory screen buffers. Many games use this as the cornerstone of special effects: screen warping, motion blur, shadows, etc. Turning this off is a performance boost since it skips those memory copies. Back in the day, Glide64 took a major performance hit when copying CPU-based framebuffer manipulation effects, because PCs didn't have memory/PCIe buses as fast as they do now; they used AGP, so in some cases there was a massive performance hit when doing N64 emulation work in GL.

It was around when I worked on Glide64's GL wrapper that GL-based framebuffer effects moved to framebuffer objects (basically, a PC-native version of render-to-texture, available as an extension back in the OpenGL 1.5 era) and PBOs (OpenGL 2.1 onwards), where some semblance of speed could be had. It helps that modern APIs like Vulkan and DX12 have finally closed that gap, with Vulkan now offering direct RAM access through special API functions, so you can do things on the CPU and GPU with zero performance hit. The compute-shader-based version of Angrylion using Vulkan 1.3 is a prime example of this.

Almost all modern games still use render-to-texture extensively: for anti-aliasing like TAA/FXAA, shadow mapping, screen-space ambient occlusion/reflections, lens flare, bloom, order-independent transparency, etc. Many games use RTT to composite even the general game image together from multiple offscreen render targets, e.g. for deferred rendering, or for rendering from async Vulkan/DX12 queues like Doom Eternal does.

Adamillo commented 8 months ago

That explains a lot! Thanks for the extensive answer!

Adamillo commented 8 months ago

It helps that modern APIs like Vulkan and DX12 have finally closed that gap, with Vulkan now offering direct RAM access through special API functions, so you can do things on the CPU and GPU with zero performance hit.

Does this mean that you could use Vulkan 1.3 to emulate the unified memory of many consoles without copying things between VRAM and RAM, and without a performance penalty?

Jj0YzL5nvJ commented 8 months ago

It helps that modern APIs like Vulkan and DX12 have finally closed that gap, with Vulkan now offering direct RAM access through special API functions, so you can do things on the CPU and GPU with zero performance hit.

Does this mean that you could use Vulkan 1.3 to emulate the unified memory of many consoles without copying things between VRAM and RAM, and without a performance penalty?

From what I understand, those new APIs are smart enough to know when it is better to use RAM or VRAM. With previous APIs it was extremely easy to screw up and force the use of a context with the worst possible performance with certain GPUs.

See #1561 and #1960 as examples.

Note: @cruduxcru0 refers to ParaLLEl RDP which states that it only needs Vulkan 1.1. The detail is that official support for the VK_KHR_8bit_storage extension is part of Vulkan 1.3. Support for VK_KHR_8bit_storage in implementations prior to Vulkan 1.3 is unofficial. Current versions of ParaLLEl RDP avoid using the VK_KHR_8bit_storage / VK_KHR_16bit_storage extensions on less efficient GPUs, so the original claim is not false either... it can work with Vulkan 1.1, but Vulkan 1.3 is recommended. In layman's terms, just buy NVIDIA.

Adamillo commented 8 months ago

I see, sounds awesome! Thanks for the insight on how things work. I'll close this now, since I'm pretty satisfied with the answers!