Duplicate render and hidden area mask

Nordskog commented 10 months ago

What

Various minor graphics performance improvements, together saving about 1.0 to 1.5 ms of frame time depending on settings and in-game location. This PR includes an updated SteamVR_Standalone_IL2CPP.dll with changes from https://github.com/DSprtn/SteamVR_Standalone_IL2CPP/pull/7

Fix the Hidden Area mask
- Remove the SteamVR Standalone mask mesh
- Render depth 1 before the GBuffer pass so time isn't wasted drawing other geometry there
- Ensure stencil is empty so deferred shader discards area
- Render deferred shader to Visible Area Mesh instead of quad
Reuse textures rendered for first eye when rendering second eye
- Reuse shadows ( Rendered from point of view of light, unchanged )
- Reuse visor liquid effects ( same for second eye )
- Reuse Clusters ( No idea what it does, but it outputs a ComputeBuffer that doesn't change )
- Reuse Terminal graphics ( Rendered constantly, same for both eyes )
- Reuse WindVolume ( punches a hole in the fog for repellers/turbines, outputs a ComputeBuffer that doesn't change )
- Skip Damage Feedback ( red damage vignette, visor screen deformations when infected ) when in neither of those states.

How

Hidden Area Mask

Existing setup

The SteamVR Standalone library adds a normal MeshRenderer with the Hidden Area Mask Mesh using the steamvr_hiddenarea.shader shader. All this shader normally does is clear SV_TARGET and SV_DEPTH. Since it renders at the very beginning of the GBuffer pass, as a MeshRenderer, all it accomplishes is filling the stencil with the same value ( 0xC0 ) as the other opaque geometry, which is what is ultimately responsible for the pixels being considered during the deferred final pass.

e1ab5c1 attempts to rectify this by drawing the mask a second time AfterGBuffer using the same shader. This clears SV_TARGET and SV_DEPTH. Benchmarking with and without the second draw shows no difference in frame times, suggesting that the stencil being filled with 0xC0 is the only requirement for the deferred final pass to consider the pixels.

mask_old

Changes

SteamVR_VRRendererSetRenderHiddenAreaMask(false) is called to prevent SteamVR Standalone's existing mask from being rendered.

Our manual draw of the mask is moved to BeforeGBuffer using a new shader that

- Fills SV_Target with float4(0, 0, 0, 1)
- Fills SV_Depth with 1

Since it is drawn manually using the CommandBuffer, it will not automatically fill the stencil buffer.

Shader code

```cpp
Shader "GTFOVR/HiddenArea"
{
    SubShader
    {
        Tags { "RenderType" = "Deferred" }
        ZTest Always
        Cull Off

        Pass
        {
            Fog { Mode Off }

            CGPROGRAM
            #pragma vertex vert
            #pragma fragment frag
            #include "UnityCG.cginc"

            struct Attributes
            {
                float4 position : POSITION;
            };

            struct Varyings
            {
                float4 position : SV_POSITION;
            };

            struct FragmentOutput
            {
                float4 diffuse : SV_Target;
                float depth : SV_Depth;
            };

            Varyings vert(Attributes v)
            {
                Varyings o;
                o.position = v.position;
                return o;
            }

            FragmentOutput frag(Varyings i)
            {
                FragmentOutput o;
                o.diffuse = float4(0, 0, 0, 1);
                o.depth = 1;
                return o;
            }
            ENDCG
        }
    }
}
```

Since the mask fills the depth buffer BeforeGBuffer, no time will be wasted drawing world geometry into the masked area, and the stencil being blank prevents the deferred final pass from considering the pixels.

The deferred final pass ( Clustered\ClusterDeferred ) doesn't actually use the depth directly, but masks it by the stencil before mapping it to the red channel of another RenderTexture, which is what it finally consumes, for whatever reason.

Visually, there will be no difference to the output, but performance should improve by about 0.5ms. The masked area will be filled with fog, rendered as if there is no geometry in the area.

Fog is a 3DTexture output from a ComputeShader at an earlier stage, so all the Deferred final pass does is blend between the slices according to the depth. This is fairly cheap, but there is some performance to be gained from skipping this too.

Clustered\ClusterDeferred is rendered onto a quad mesh obtained from RenderUtils.quad, which caches its mesh in RenderUtils.s_quad. Replacing this mesh with the Visible Area Mesh allows us to skip any computation in the hidden area. We obtain the Visible Area Mesh using the newly-added SteamVR_CameraMask.getVisibleAreaMask(SteamVR, EVREye), and flip the mesh back and forth for each eye in InjectDeferredRenderMesh. If the mesh is empty, we leave the original quad mesh in place.

The quad, or our Visible Area Mesh, is rendered onto a temporary RenderTexture. As luck would have it, this same RenderTexture gets reused for both eyes, and later for drawing markers and such. The previous eye lingering is no problem, but in the part of the RenderTexture not covered by either eye, markers will accumulate bright white pixels over time.

For Native PCVR users this is probably not a problem, but when streaming to a Standalone VR device ( e.g. Quest ), local reprojection will often result in the "hidden area" becoming visible.

https://github.com/DSprtn/GTFO_VR_Plugin/assets/8961771/23e02445-60a7-4c6c-a52a-a80410cbe761

The area outside the projected texture is expected to be black, so we want our hidden area to be filled with black too.

Since we have no easy way to blank the texture, and just drawing the mask after FinalPass does not work, the simplest solution ended up being to draw the Hidden Area Mesh again BeforeImageEffects. This uses the same CommandBuffer as before, and is the reason we go to the effort of having the shader output black to SV_TARGET.
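For illustration, hooking one CommandBuffer into both camera events could look roughly like the sketch below. This is not the plugin's actual code: the class name, the `hiddenAreaMesh` field, and loading the shader through `Shader.Find` are assumptions made for the example. Only the shader name and the two camera events come from the description above.

```csharp
using UnityEngine;
using UnityEngine.Rendering;

// Hypothetical helper for illustration only; not the plugin's real class.
[RequireComponent(typeof(Camera))]
public class HiddenAreaMaskSketch : MonoBehaviour
{
    // Assumption: the per-eye Hidden Area Mesh has been obtained from SteamVR elsewhere.
    public Mesh hiddenAreaMesh;

    private Camera cam;
    private CommandBuffer cmd;

    void OnEnable()
    {
        cam = GetComponent<Camera>();

        // Assumption: the shader is available to Shader.Find; a plugin would
        // more likely load it from an asset bundle.
        var material = new Material(Shader.Find("GTFOVR/HiddenArea"));

        cmd = new CommandBuffer { name = "HiddenAreaMask" };

        // The vertex shader passes positions through untouched, so the
        // object matrix has no effect on the result; identity is used here.
        cmd.DrawMesh(hiddenAreaMesh, Matrix4x4.identity, material);

        // First draw: fill depth before any world geometry is rendered.
        cam.AddCommandBuffer(CameraEvent.BeforeGBuffer, cmd);
        // Second draw: paint the hidden area black before image effects run.
        cam.AddCommandBuffer(CameraEvent.BeforeImageEffects, cmd);
    }

    void OnDisable()
    {
        cam.RemoveCommandBuffer(CameraEvent.BeforeGBuffer, cmd);
        cam.RemoveCommandBuffer(CameraEvent.BeforeImageEffects, cmd);
        cmd.Release();
    }
}
```

Attaching the same buffer to two events keeps the mask draw defined in one place, at the cost of both draws always using the same mesh and material.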

mask_new

Texture reuse

A number of textures are generated every frame at runtime, before being slapped onto geometry or blitted onto the screen. These are generally the same for both eyes, and the RenderTexture they output to is kept around, so we can simply skip updating them for the second eye, and it will reuse the previously generated texture.

This applies to

- Visor liquid effects
- Cluster ( Outputs a ComputeBuffer that doesn't change )
- Terminal graphics
- WindVolume ( punches a hole in the fog for repellers/turbines, outputs a ComputeBuffer that doesn't change )

Liquid effects and Terminal graphics are added to the CommandBuffer by us in VRRendering, so we can just skip them as necessary. The ConfigCameraBlood setting can visually remove liquid effects from the output by setting UI_Apply.enabled to false, but this does not prevent the game from going to the effort of generating these textures every frame, so both eyes will now skip them if ConfigCameraBlood is disabled.

Cluster is skipped by setting ClusteredRendering.Current.UpdateCluster to false, and WindVolume is handled by skipping WindVolume.OnPreCull in InjectWindVolumeUpdateSkip.

Shadows are rendered from the point of view of the light, and output the same result for each eye. This is skipped by setting ClusteredRendering.Current.UpdateShadows to false.

When outdoors, ExteriorLight2 acts as a skylight, and unfortunately this continues to do its thing. It renders its shadow to a texture, then executes a CommandBuffer that generates a blurred MipMap, which is what is ultimately used to draw the shadows. This CommandBuffer is stored in ExteriorLight2.Current.m_cmd, so we remove it from the underlying ExteriorLight2.m_light every frame, and only re-add it if we are rendering the first eye. We also swap Light.shadows between LightShadows.Soft and LightShadows.None for each eye, to prevent it from generating shadows that won't be used for anything.
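The per-eye toggling of the skylight could be sketched as follows. This is a rough illustration, not the code in this PR: the class and method names are made up, the light and its CommandBuffer are passed in as parameters instead of being read from ExteriorLight2, and the `LightEvent` the game attaches the buffer to is not stated here, so `LightEvent.AfterShadowMap` is purely an assumption.

```csharp
using UnityEngine;
using UnityEngine.Rendering;

// Hypothetical helper for illustration only.
public static class SkylightSkipSketch
{
    // Assumption: the blur buffer runs after the shadow map is rendered.
    // The actual event used by the game is not known from this description.
    private const LightEvent BlurEvent = LightEvent.AfterShadowMap;

    public static void ConfigureForEye(Light light, CommandBuffer blurCmd, bool isFirstEye)
    {
        // Detach unconditionally so the buffer is never attached twice.
        light.RemoveCommandBuffer(BlurEvent, blurCmd);

        if (isFirstEye)
        {
            // First eye: render the shadow map and run the blur as usual.
            light.AddCommandBuffer(BlurEvent, blurCmd);
            light.shadows = LightShadows.Soft;
        }
        else
        {
            // Second eye: reuse the blurred result from the first eye and
            // stop the light from rendering a shadow map nobody will read.
            light.shadows = LightShadows.None;
        }
    }
}
```

Calling something like this once per eye, before the eye camera renders, keeps the first eye's blurred shadow texture alive for the second eye.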

This light follows the player, so there is actually a very slight difference in perspective in the resulting CascadedShadowMap, but if there is a difference in the final output image, I can't spot it:

mWnMRIQ

Shadows look right to me ¯_(ツ)_/¯

All of this, with the exception of the InjectWindVolumeUpdateSkip patch, is handled in VRRendering.SkipDuplicateRenderTasks().

Skip Damage Feedback

The material responsible for rendering red flashes in response to damage, deforming the image when infected, and some kind of flash appears to take just as long to render even when none of these states are active. InjectSkipDamageResponse checks whether all of the relevant values are zero, and just skips the blit if so, for a fairly minor uplift.

Nordskog commented 10 months ago

Removed drawing the deferred final pass onto the Visible Area Mesh instead of the full-screen quad.
- Some masks have gaps between the hidden and visible area masks ( Quest Pro ), causing bright overlay elements that accumulate to bleed through.
Added config option to toggle use of the hidden area mask
- If you occasionally see the masked area ( Quest Pro ) you are probably better off rendering it.
Removed the Damage Feedback skip; performance impact was negligible.

DSprtn commented 10 months ago

Great work!
