Closed yangfei103 closed 1 year ago
Looks really nice, definitely +1. This seems superior to the current SSAO in every way - are there any reasons to keep the old version around long term? Does GTAO require any features that may not exist on all of O3DE's supported hardware? Are there any cases where the current SSAO performs better than GTAO?
Also curious how a low quality GTAO compares to current SSAO in performance and quality - for example, does a low quality GTAO still look better than SSAO and is also cheaper?
Thank you for your comments. GTAO has no special requirements; in fact, it is a screen-space algorithm just like SSAO. GTAO has 5 quality levels, so users can make their own trade-off between performance and quality. At the low quality level, I don't think GTAO looks much better than SSAO in the current version. As far as I know, there is some obvious noise at the low quality level. Of course, we hope the community can help us improve it.
As for performance, GTAO is actually slightly more costly than SSAO, even at the low quality level. Here is a simple performance comparison of SSAO and GTAO on my PC, FYI.
Platform:
SSAO:
GTAO (Low):
The current version does not yet meet our expectations/goals, so we sincerely hope the community can help us improve it.
Thanks for the detailed write-up and all the hard work that went into implementing this! However, the base assumption that GTAO is better than O3DE's SSAO is incorrect. It might be true for old implementations of SSAO, but O3DE uses a custom implementation that is both faster and higher quality than GTAO.
- GTAO is based on HBAO, which is an older technique that assumes each pixel has an infinite wall behind it and tries to calculate the visibility angle. This is a poor assumption in practice, and leads to small geometry casting strong AO, like the chains holding the light. It also creates a dark halo around objects because they cast AO much further behind them than they should, while simultaneously not casting enough AO in front of the object. For example, a pillar touching the ground should cast AO equally all around the pillar, but HBAO casts much more AO behind the pillar and not enough in front. O3DE's SSAO has none of these drawbacks.
- GTAO adds cosine weighting to HBAO for more accurate results. O3DE's SSAO has cosine weighting as well (dot product in the sample accumulation), so both are equal in this regard.
- O3DE's SSAO has a better weighting strategy that preserves detail better for more distant objects.
- O3DE's SSAO does reveal low-res geometry because it calculates normals from the depth buffer (this is faster than writing out normals and then reading them in). This is the biggest drawback of the current implementation, but this can be mitigated by writing out interpolated vertex normals in the pre-depth pass (better quality but less performant) or using the normals from the G-Buffer (means SSAO would be forced to stay after the forward pass, which would limit flexibility. Having SSAO before forward is more flexible because per-material control can be added to reduce SSAO on softer objects like snow).
- Based on the measurements you provided, the low-end version of GTAO is almost double the cost of O3DE SSAO (0.108 ms vs 0.06 ms, about 1.8x).
Here is a comparison using the images you provided: GTAO
O3DE SSAO
@antonmic Thanks for your detailed comments. O3DE's SSAO definitely has its own advantages in some cases. I agree with @invertednormal that no approach is perfect, especially in real-time rendering. Each approach has its own advantages and disadvantages because every algorithm rests on some assumptions; if an assumption does not hold for a scene, artifacts occur. We suggest keeping both approaches and letting users choose. As far as I know, many modern engines offer more than one AO solution; for example, Unreal supports both SSAO and RTAO.
As for the drawbacks of GTAO you mentioned, they are of course real. But I think there are techniques to mitigate them, although we have not applied them yet in this first version. For example:
Thanks for the writeup @antonmic. This makes it a lot clearer that there are pros and cons to each approach, and there are areas where the current SSAO is definitely superior - especially with regards to GTAO applying too much occlusion behind objects. It would be nice to see comparisons against a ray-traced ground truth so we could objectively compare the techniques.
@invertednormal I agree with that. I can provide more comparisons of GTAO and SSAO (including O3DE's version and UE's version). I will also show the RTAO results of both engines (we also developed RTAO for O3DE) as a reference, FYI.
Scene1: Sponza O3DE SSAO: O3DE GTAO: O3DE RTAO: UE SSAO: UE RTAO:
Scene2: Simple Blocks O3DE SSAO: O3DE GTAO: O3DE RTAO: UE SSAO: UE RTAO:
BTW, I am not trying to show off how good our feature is. As I mentioned before, the current version has some problems. All I can do is provide as much detail as I can, so that the community can better understand the feature and properly consider whether it is suitable for O3DE. If so, I think we could make it better together in the future.
Can you make your GTAO implementation into a Gem? Then you can easily submit it and iterate on it, potentially get others to collaborate on some of the missing pieces.
Also worth noting, some of the improvements you mentioned for GTAO can also be applied to O3DE's SSAO, like a multibounce curve and temporal super-sampling/accumulation. Depending on what you're aiming for this might give you the best results.
Of course, it wouldn't take much effort to migrate the GTAO into an independent Gem. I think it's a good idea.
Approved as long as this feature comes from its own Gem, so the users of O3DE can pick from SSAO component or GTAO component.
Since this RFC is accepted please open a PR and move this RFC to this folder - https://github.com/o3de/sig-graphics-audio/tree/main/rfcs where we will track all the new RFCs for book keeping purposes. Thanks.
Roger that! I'm on my way.
Proposed RFC Feature GTAO (Ground-Truth based Ambient Occlusion)
Summary:
Ambient occlusion (AO) is an important feature in photo-realistic rendering. However, O3DE only provides the SSAO (Screen Space Ambient Occlusion) feature, which cannot meet all our needs. To fill this gap, we developed a GTAO (Ground-Truth based Ambient Occlusion) feature for O3DE. The GTAO algorithm, first proposed by Activision Blizzard at SIGGRAPH 2016 (FYI, one can refer to this paper and these slides for more details about the algorithm), can be seen as an enhanced version of SSAO. In short, GTAO can achieve better quality at a performance cost comparable to SSAO. Our GTAO implementation does not cover everything in Activision's version; for example, colored occlusion and temporal denoising are not supported yet. Despite that, it works well according to our tests. We are opening this RFC feature request and intend to contribute our GTAO implementation to the O3DE community.
What is the relevance of this feature?
In photo-realistic/physically-based rendering, a mesh point is shaded by calculating direct and indirect illumination with diffuse and/or glossy BRDFs according to the rendering equation. The diffuse indirect illumination we talk about here is usually known as ambient light. Ambient occlusion can be interpreted as the visibility for the ambient light, which is an essential feature to improve the realism of rendered images.
In real-time rendering, accurate ambient occlusion is impractical to calculate. SSAO, for instance, is a cheap but coarse approximation of accurate ambient occlusion, so it cannot always meet users' needs. To enhance O3DE's AO capability, more advanced features are necessary.
GTAO is one candidate. Although it is also an approximation, its formulation is theoretically closer to the Monte Carlo ground truth. Compared to SSAO, GTAO can achieve better quality at a comparable performance cost. By integrating GTAO into O3DE, users can choose between SSAO and GTAO according to their needs.
Feature design description:
Architecture
The GTAO is integrated into Atom Gem and AtomLyIntegration Gem following the current post-process pipeline. Based on the SSAO component that O3DE already has, a new AO component that contains both SSAO and GTAO is developed. In overview, the architecture of the new AO component is as follows:
Component Panel
The user can specify an `AO type` in the `Ambient Occlusion` panel to enable either SSAO or GTAO.

SSAO:

GTAO:

- `AO type`: choose between `SSAO` and `GTAO` in this drop box.
- `GTAO Strength`
- `Quality`
- `Radius`
- `Thickness`
- `MaxDepth`
- `Enable Blur`
- `Blur Strength`
- `Blur Edge Threshold`
- `Blur Sharpness`
Usage
The `AOParentPass` controls its children: it enables the pass that corresponds to the `AO Type` from `AOSettings` and disables the others. For instance, the user adds an `Ambient Occlusion` component to the scene and then selects `GTAO` from the `AO Type` drop box. This attribute is first saved in the `AOComponentConfig` and then passed to `AOSettings` by the `AOComponentController`. At runtime, the `AOParentPass` gets the `AOSettings` from the `PostProcessingFeatureProcessor`. The `GTAOPass` is enabled because the `AO Type` matches `GTAO`. Correspondingly, the `SSAOPass` is disabled automatically.

Technical design description:
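The flow described under Usage (component config, then settings, then the parent pass toggling its children) can be sketched roughly as follows. This is a minimal Python illustration only, not O3DE code: the class names mirror the names used in this RFC, but their fields and the `apply_settings` method are hypothetical stand-ins for the actual C++ implementation.

```python
from dataclasses import dataclass, field
from enum import Enum


class AOType(Enum):
    SSAO = "SSAO"
    GTAO = "GTAO"


@dataclass
class ChildPass:
    """Hypothetical stand-in for a child pass such as SSAOPass or GTAOPass."""
    name: str
    ao_type: AOType
    enabled: bool = False


@dataclass
class AOSettings:
    """Hypothetical stand-in for the settings filled in from the component config."""
    ao_type: AOType = AOType.SSAO


@dataclass
class AOParentPass:
    children: list = field(default_factory=list)

    def apply_settings(self, settings: AOSettings) -> None:
        # Enable the child whose type matches the setting and disable all others.
        for child in self.children:
            child.enabled = child.ao_type == settings.ao_type


parent = AOParentPass([
    ChildPass("SSAOPass", AOType.SSAO),
    ChildPass("GTAOPass", AOType.GTAO),
])
parent.apply_settings(AOSettings(ao_type=AOType.GTAO))
enabled = [child.name for child in parent.children if child.enabled]
```

After the call, only the child whose type matches the selected `AO Type` remains enabled, which is the behavior the Usage text describes.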
Recap: HBAO & GTAO
Baseline
Before giving implementation details, we briefly introduce the GTAO algorithm, which was first proposed by Activision Blizzard at SIGGRAPH 2016. GTAO is closely related to HBAO (Horizon-based Ambient Occlusion). HBAO assumes a height field around the shading point, in which case the visibility is continuous on the hemisphere, which simplifies the computation. It then integrates over the hemisphere between two horizon lines, which are determined by tracing the height field (depth buffer) in screen space. However, its integrand does not match $k_A$, which is derived from the rendering equation and is guaranteed to be physically correct. Thus, in theory, HBAO cannot reproduce the Monte Carlo ray-traced results. To obtain ground-truth based ambient occlusion, GTAO adds the cosine weight term to the integral:

$$V_d=\frac{1}{\pi}\int_\Omega V(\omega_i)(n\cdot\omega_i)\,\mathrm{d}\omega_i=\frac{1}{\pi}\int_0^\pi\int_{-\pi/2}^{\pi/2}V(\theta,\phi)(n\cdot\omega_i)|\sin(\theta)|\,\mathrm{d}\theta\,\mathrm{d}\phi$$

where the inner integral

$$\int_{-\pi/2}^{\pi/2}V(\theta,\phi)(n\cdot\omega_i)|\sin(\theta)|\,\mathrm{d}\theta=IntegrateArc(h_1,h_2,n)$$

can be solved analytically.
Horizon lines $h_1$ and $h_2$ can be found by searching the height field (depth buffer) in screen space.
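To make the horizon search concrete, here is a small Python sketch on a one-dimensional height field. It is an illustration under simplifying assumptions, not the shader from this RFC: the view direction is assumed to be straight down onto the height field (+z), angles are measured from that view direction, and the function name `find_horizons` and its parameters are hypothetical. A real implementation would march the depth buffer and reconstruct view-space positions.

```python
import math


def find_horizons(heights, center, radius, spacing=1.0):
    """Return (h1, h2) in radians for one slice of a 1D height field.

    Angles are measured from the view direction (+z), so an unoccluded
    flat field gives (-pi/2, +pi/2).
    """
    def search(step):
        horizon = math.pi / 2  # start fully open
        for k in range(1, radius + 1):
            i = center + step * k
            if i < 0 or i >= len(heights):
                break
            dz = heights[i] - heights[center]
            # Angle between +z and the direction from the center to this sample.
            angle = math.atan2(k * spacing, dz)
            # The most occluding sample is the one closest to the view direction.
            horizon = min(horizon, angle)
        return horizon

    return -search(-1), search(+1)


flat = find_horizons([0.0] * 9, center=4, radius=4)
walled = find_horizons([0, 0, 0, 0, 0, 0, 1, 0, 0], center=4, radius=4)
```

On the flat field both horizons stay fully open, while the raised sample two steps to the right pulls `h2` toward the view direction and leaves `h1` unchanged.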
Similar to the HBAO, the inner integration is solved analytically, and the outer one can be numerically solved by sampling a number of directions around the shading pixel in screen space.
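As a sanity check on the analytic/numeric split, the Python sketch below evaluates the inner integral in closed form and compares it with brute-force quadrature. It rests on assumptions that are not spelled out in this RFC: the normal is taken to lie in the slice plane at angle `n` from the view direction, so that $(n\cdot\omega_i)=\cos(\theta-n)$; the horizons satisfy $h_1\le 0\le h_2$ and are already clamped to the normal's hemisphere; and the closed form is obtained by integrating $\cos(\theta-n)|\sin\theta|$ directly. The function names are hypothetical.

```python
import math


def integrate_arc(h1, h2, n):
    """Closed form of the inner integral between horizons h1 <= 0 <= h2."""
    def half(h):
        return 0.25 * (-math.cos(2.0 * h - n) + math.cos(n) + 2.0 * h * math.sin(n))
    return half(h1) + half(h2)


def integrate_arc_numeric(h1, h2, n, steps=20000):
    """Midpoint-rule reference for the same integral."""
    dt = (h2 - h1) / steps
    total = 0.0
    for k in range(steps):
        t = h1 + (k + 0.5) * dt
        total += math.cos(t - n) * abs(math.sin(t)) * dt
    return total


def visibility(slices):
    """Outer integral: average the per-slice arcs over the sampled directions.

    `slices` is a list of (h1, h2, n) tuples, one per direction in [0, pi).
    """
    return sum(integrate_arc(h1, h2, n) for h1, h2, n in slices) / len(slices)


open_arc = integrate_arc(-math.pi / 2, math.pi / 2, 0.0)
analytic = integrate_arc(-1.0, 0.8, 0.2)
numeric = integrate_arc_numeric(-1.0, 0.8, 0.2)
```

Averaging works for the outer integral because $\frac{1}{\pi}\int_0^\pi(\cdot)\,\mathrm{d}\phi$ with $N$ evenly spaced directions reduces to the mean of the per-slice arcs; a fully open slice integrates to 1, so an unoccluded pixel has visibility 1.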
Multi-bounce approximation
To approximate multi-bounce reflection, GTAO models the multi-bounce visibility $V_d^\prime$ as a function of the single-bounce visibility $V_d$ and the (neighboring) albedo $\rho$:

$$V_d^\prime=f(V_d,\rho)$$

The assumption is that the neighboring albedo can be approximated by the albedo of the point being shaded. The function $f$ is then fitted to Monte Carlo ray-traced data for various albedos with a cubic polynomial whose coefficients $a$, $b$, $c$ depend on the albedo:

$$V_d^\prime=f(V_d,\rho)=((aV_d+b)V_d+c)V_d$$
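The cubic fit can be sketched in a few lines of Python. The coefficient values below are an assumption: they are the linear-in-albedo fits as I recall them from the published GTAO material, they do not come from this RFC, and they should be checked against the original paper or slides before use. The final `max` against the single-bounce value is likewise part of that recalled formulation.

```python
def multi_bounce(visibility, albedo):
    """Approximate multi-bounce visibility from single-bounce visibility.

    Coefficients are recalled from the GTAO publication (an assumption, not
    taken from this RFC); each is a linear function of the albedo.
    """
    a = 2.0404 * albedo - 0.3324
    b = -4.7951 * albedo + 0.6417
    c = 2.7552 * albedo + 0.6903
    x = visibility
    # Cubic in Horner form; never darker than the single-bounce result.
    return max(x, ((a * x + b) * x + c) * x)


dark_surface = multi_bounce(0.5, 0.1)
bright_surface = multi_bounce(0.5, 0.9)
```

With these coefficients a fully occluded point stays at 0, a fully visible point stays at approximately 1, and brighter albedo lifts partially occluded values more, which matches the intent of recovering light lost to the single-bounce approximation.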
Implementation details
We implement GTAO in O3DE following the architecture described previously. The `Atom` and `AtomLyIntegration` Gems are involved. The code tree, including both SSAO and GTAO, is as follows:

Code tree
Note that, for brevity, only the core files are listed. The component follows the same design as the SSAO component. `AOParent.pass`, `GTAOParent.pass`, `GTAOCompute.pass`, `GTAOCompute.shader`, `GTAOCompute.azsl`, `GTAOConstant.h`, `GTAOParams.inl`, `GTAOPasses.h` and `GTAOPasses.cpp` are newly added files. In addition, the `SsaoXXX` files are renamed to `AOXXX` respectively.

Pass design
Similar to the SSAO pass, the pass chain for GTAO is: `downsample` -> `compute` -> `blurring` -> `upsample` -> `modulate`.

Shader snippets
What are the advantages of the feature?
- As described in previous sections, compared to SSAO and HBAO, GTAO can produce results comparable to those of Monte Carlo ray tracing.
- It approximates multi-bounce visibility.
- GTAO at the medium quality level performs comparably to SSAO.
- GTAO takes 0.41 ms in the Sponza scene.
O3DE SSAO:
O3DE GTAO:
What are the disadvantages of the feature?
Problem1: Artifacts in some areas.
Problem2: Noise in some areas.
How will this be implemented or integrated into the O3DE environment?
The `Atom` and `AtomLyIntegration` Gems are involved, as described above.

Are there any alternatives to this feature?
How will users learn this feature?
Are there any open questions?