drcd1 opened 3 years ago
A better SSAO implementation in 3.x would be very welcome. I'm a little concerned about performance of GTAO though. Could you provide some more detailed information about what you propose in terms of number of samples and type of blurring?
Also, have you considered (or are you using) deinterleaved rendering? Deinterleaved rendering should help you increase your sample count without hurting performance.
Also, a spatially-aware blur is pretty much a necessity. We should err on the side of fast results rather than being accurate.
> A better SSAO implementation in 3.x would be very welcome. I'm a little concerned about performance of GTAO though. Could you provide some more detailed information about what you propose in terms of number of samples and type of blurring?
I have updated my original post with this information, along with a short explanation of GTAO.
> Also, have you considered (or are you using) deinterleaved rendering? Deinterleaved rendering should help you increase your sample count without hurting performance.
I hadn't considered it yet. It might be a bit too tricky for me to implement, but I can give it a shot.
> Also, a spatially-aware blur is pretty much a necessity. We should err on the side of fast results rather than being accurate.
Agreed. I am currently using the same blur SSAO was already using in 3.x. I showed some results without blur just to compare the noise.
> I hadn't considered it yet. It might be a bit too tricky for me to implement, but I can give it a shot.
I can give you a hand, you can also take a look at SSAO in master which uses it.
Adding an option to render SSAO at half resolution would help improve performance significantly on lower-end hardware, both for the current SAO and GTAO. This option is already present in `master`, but it wasn't backported to 3.x yet.
Damn, how'd you find this type of AO?
@drcd1 Could you upload your current work to a branch (or perhaps open a pull request against the `3.x` branch)? In case you no longer have interest or time to work on implementing GTAO, this could help other people continue this work if they need it. Thanks in advance :slightly_smiling_face:
I'm so sorry that I haven't done the PR yet. I seriously mismanaged my time in the last month and haven't had time to improve much upon what I had before :/
I'll do the PR tomorrow first thing in the morning.
Will this be ported to 4.x? Or GTAO is already the default AO implementation in 4.x?
4.x AO doesn't look quite like GTAO in my opinion.
> Will this be ported to 4.x? Or GTAO is already the default AO implementation in 4.x?
The GodotCon 2023 talk about rendering mentions GTAO support could be added in a future release.
> 4.x AO doesn't look quite like GTAO in my opinion.
Indeed, 4.x uses ASSAO because it works well if you don't use TAA. However, if you use TAA, GTAO is generally a better choice as it can look better while also being faster. XeGTAO in particular was optimized to run well enough on integrated graphics (that's what the Xe stands for, it refers to the Iris Xe IGP).
@clayjohn Here's a new, more accurate, and efficient implementation of ambient occlusion made by Mirko. https://x.com/Mirko_Salm https://www.shadertoy.com/view/XXGSDd
I beat you to it and updated our Rendering project with a link to that yesterday :) https://github.com/orgs/godotengine/projects/33?pane=issue&itemId=75874346
For 4.x, I definitely think this is the way to go (or XeGTAO, as it seems to be well-optimized for lower-end devices).
Describe the project you are working on
Any 3D game with ambient occlusion.
Describe the problem or limitation you are having in your project
The current AO implementation in Godot 3.x is severely limited and very noisy. It's only useful when the radius specified is very small.
Describe the feature / enhancement and how it helps to overcome the problem or limitation
GTAO provides an estimator for physically based ambient occlusion, using a closed-form solution to part of the rendering equation to get better results with the same number of samples from the depth buffer (the number of depth-buffer samples is often the bottleneck). More info can be found in the paper "Practical Realtime Strategies for Accurate Indirect Occlusion" by Jimenez et al. (2016).
Most alternatives to GTAO consider a form of occlusion based on volume (also called ambient obscurance), which is less accurate (it doesn't account for visibility or the cosine-weighted distribution found in diffuse materials), or are much more expensive (e.g. screen-space ray-marched AO).
Since the current SSAO implementation works reasonably well for small radii, the idea is to add this as an alternative for people who want higher fidelity rendering.
Here's a comparison between the current AO implementation and my initial GTAO implementation (which doesn't yet account for the fact that the objects' thickness is not infinite). Since the idea is for GTAO to provide higher-fidelity quality on higher-end hardware, I've matched the GTAO low settings with the SSAO high settings (on my machine, the timings between SSAO low and SSAO high are indistinguishable anyway).
Below is a comparison between the various AO modes. This was taken at roughly 840p on a GeForce 1070 Max-Q (i.e. the laptop edition). The timings were measured in-editor (I'm not sure how to accurately profile shaders). I've added a complicated GPU particle system to the scene to make sure I could measure the changes (otherwise the time was not GPU-limited).
You can check in the timings that GTAO provides a similar level of noise to SSAO in same-time comparisons, but scales better and approaches a screen-space raytraced solution (i.e. ground truth).
Here are some extra comparison images:
Describe how your proposal will work, with code, pseudo-code, mock-ups, and/or diagrams
My current implementation adds GTAO as an alternative to SSAO, toggled through `#ifdef` directives in `shaders/ssao.glsl`, and is heavily based on the paper "Practical Realtime Strategies for Accurate Indirect Occlusion" by Jimenez et al. Some modifications may be introduced, namely to control the "thickness" of the scene or the sample distribution, similar to https://github.com/GameTechDev/XeGTAO. We should also introduce a radius falloff, so that, for a small radius, GTAO blends better with the rest of the scene.

The user can then specify whether to use SSAO or GTAO as an option of the environment (see the mockup below). In the editor, the settings shown should depend on the type of AO selected (GTAO has some specific parameters).
In terms of code, this requires creating the extra parameters in `doc/classes/Environment.xml` and `doc/classes/VisualServer.xml`, changing `scene/resources/environment.cpp` to account for the extra parameters, and changing the parameters taken by `environment_set_ssao()` in `RasterizerScene` (`servers/visual/rasterizer.h`) and its derivative classes.

Since we are now passing an increased number of parameters to the AO, we also need to add definitions to be able to bind and create commands with these parameters (unless there's a smarter way to do it that I missed).
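To make the plumbing concrete, here is a minimal sketch of what the extended state and setter could look like. The names, the parameter set, and the `gtao_thickness` field are all hypothetical, chosen only for illustration; the real `environment_set_ssao()` signature in `servers/visual/rasterizer.h` differs.

```cpp
// Hypothetical sketch, for illustration only; the actual
// RasterizerScene::environment_set_ssao() signature differs.
enum AOMode {
    AO_MODE_SSAO, // current 3.x behaviour
    AO_MODE_GTAO, // proposed alternative
};

struct EnvironmentAO {
    AOMode mode = AO_MODE_SSAO;
    float radius = 1.0f;
    float intensity = 1.0f;
    float gtao_thickness = 1.0f; // GTAO-specific (hypothetical parameter)
};

// The VisualServer-side setter forwards the enlarged parameter set to the
// rasterizer; this growth is why the binding/command-queue macros need updating.
void environment_set_ao(EnvironmentAO &env, AOMode mode, float radius,
                        float intensity, float gtao_thickness) {
    env.mode = mode;
    env.radius = radius;
    env.intensity = intensity;
    env.gtao_thickness = gtao_thickness;
}
```

The editor would then show or hide the GTAO-specific fields depending on `mode`.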
Overall, this feature would require changes to the following files:
core/class_db.cpp
core/class_db.h
core/command_queue_mt.h
core/make_binders.py
doc/classes/Environment.xml
doc/classes/VisualServer.xml
drivers/dummy/rasterizer_dummy.h
drivers/gles2/rasterizer_scene_gles2.cpp
drivers/gles2/rasterizer_scene_gles2.h
drivers/gles3/rasterizer_scene_gles3.cpp
drivers/gles3/rasterizer_scene_gles3.h
drivers/gles3/shaders/ssao.glsl
scene/resources/environment.cpp
scene/resources/environment.h
servers/server_wrap_mt_common.h
servers/visual/rasterizer.h
servers/visual/visual_server_raster.h
servers/visual/visual_server_wrap_mt.h
servers/visual_server.cpp
servers/visual_server.h
One thing I am not quite sure how to do is hide, in the editor, the GTAO-specific parameters when GTAO is not enabled.
GTAO algorithm and sampling
Ambient occlusion simplifies the rendering equation (https://en.wikipedia.org/wiki/Rendering_equation) in the following ways: `L(w_i)` is 0 if the direction `w_i` is occluded by the scene, and 1 if it isn't. Furthermore, we suppose that the scene can be represented by a height map (i.e. the depth map).
This allows us to rewrite the rendering equation like so: `theta_min(phi)` and `theta_max(phi)` are the minimum and maximum angles that the surface makes with the horizon (poor sketch done in Paint :) ). In this case, m = 4 (number of red arrows) and n = 8 (number of blue circles x 2).
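For reference, a hedged reconstruction of that rewritten integral, following the slice parameterization in Jimenez et al. (2016):

```latex
A(x) \;=\; \frac{1}{\pi}\int_{0}^{\pi}\int_{\theta_{\min}(\phi)}^{\theta_{\max}(\phi)} \cos\theta \,\lvert\sin\theta\rvert \,\mathrm{d}\theta \,\mathrm{d}\phi
```

With no occlusion, theta spans [-pi/2, pi/2] for every slice and the expression evaluates to 1, as expected for a fully unoccluded point.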
If we know `theta_min(phi)` and `theta_max(phi)`, the inner integral can be computed analytically. The outer integral is computed with Monte Carlo.

In order to do this, for every pixel, we sample the depth map along m lines (to estimate the outer integral), n times along each line (to estimate, for each line, `theta_min(phi)` and `theta_max(phi)`). The total number of samples is `n*m`. Increasing `m` reduces noise, while increasing `n` reduces bias and banding. Here's an example of the sampling pattern; this pattern is randomly rotated for each pixel.

Currently, I've set the following presets:
Low: m = 1, n = 16
Medium: m = 2, n = 20
High: m = 3, n = 24
These will most likely need to be changed (the value for n might be too big).
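To illustrate how the pieces fit together, here is a sketch of the per-slice arc term and the slice-direction generation. The arc formula is my reconstruction of the closed form from Jimenez et al. (2016), not the actual patch, so treat the exact expression as an assumption.

```cpp
#include <cmath>
#include <vector>

const double kPi = 3.14159265358979323846;

// Cosine-weighted arc integral for one horizon angle h and projected normal
// angle gamma. Reconstructed from Jimenez et al. (2016); treat as a sketch.
double arc(double h, double gamma) {
    return 0.25 * (-std::cos(2.0 * h - gamma) + std::cos(gamma) + 2.0 * h * std::sin(gamma));
}

// AO of one slice given its two horizon angles (theta_min, theta_max).
double slice_ao(double h1, double h2, double gamma) {
    return arc(h1, gamma) + arc(h2, gamma);
}

// The m slice directions for one pixel; `rotation` is the per-pixel random
// rotation of the pattern mentioned above.
std::vector<double> slice_directions(int m, double rotation) {
    std::vector<double> phis;
    for (int j = 0; j < m; j++)
        phis.push_back(rotation + kPi * j / m); // slices evenly cover [0, pi)
    return phis;
}
```

With gamma = 0, a fully open slice (h1 = -pi/2, h2 = pi/2) yields 1.0 and a fully closed one (h1 = h2 = 0) yields 0.0; the n depth fetches per line only serve to find the horizon angles, and the theta integral itself costs nothing.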
I am using the same spatial blur that the current 3.x SSAO uses.
If this enhancement will not be used often, can it be worked around with a few lines of script?
This is part of the core rendering pipeline.
Is there a reason why this should be core and not an add-on in the asset library?
This is part of the core rendering pipeline.