godotengine / godot-proposals

Improvements to multi layered rendering #12572

Open BastiaanOlij opened 1 month ago

BastiaanOlij commented 1 month ago

Describe the project you are working on

XR_VIEW_CONFIGURATION_TYPE_PRIMARY_STEREO_WITH_FOVEATED_INSET / XR_VIEW_CONFIGURATION_TYPE_PRIMARY_QUAD_VARJO support for OpenXR.

This is a good document to understand how quad rendering works.

Describe the problem or limitation you are having in your project

Godot's viewport logic is currently limited to pure stereo rendering and relies fully on support for the Multiview extension. Specific to the use case here, we need to render 4 overlapping views, with the added complexity that the outer views are at a different resolution from the inner views. We thus can't simply increase the number of views rendered to 4; we need a 2 + 2 approach while keeping unified culling and support buffer updates intact.

The changes needed for this use case overlap with a number of other use cases that will benefit from these changes such as:

With some additional changes this improvement could also enable more efficient multimonitor use cases such as are common with sim racing but that falls outside of the scope of this proposal.

Describe the feature / enhancement and how it helps to overcome the problem or limitation

In essence we'll be making two core changes to the rendering engine:

Note that the suggested changes will complement https://github.com/godotengine/godot-proposals/issues/4932 if we decide to implement that at some point (I've made a start and want to find additional time to continue on this).

Describe how your proposal will work, with code, pseudo-code, mock-ups, and/or diagrams

For this we need to look at our 3 changes individually.

For our 2+2 rendering scenario, this is a real-world example on Varjo XR-4 hardware with quality settings set to high:

| Resolution  | Layers | Projections | Description  |
|-------------|--------|-------------|--------------|
| 1920 x 1904 | 2      | 0 + 1       | Context view |
| 2176 x 2512 | 2      | 2 + 3       | Focus view   |
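As a rough illustration only, the table can be modelled as a list of size groups, each with its own resolution and layer count, with projection indices handed out consecutively. `SizeGroup` and `assign_projections` are hypothetical names made up for this sketch; they are not part of Godot's API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SizeGroup:
    """One resolution shared by `layers` texture layers (hypothetical type)."""
    width: int
    height: int
    layers: int
    description: str


def assign_projections(groups):
    """Give each layer the next free projection index, group by group."""
    result = []
    next_index = 0
    for group in groups:
        result.append(list(range(next_index, next_index + group.layers)))
        next_index += group.layers
    return result


# Values taken from the Varjo XR-4 table above.
varjo_xr4 = [
    SizeGroup(1920, 1904, 2, "Context view"),
    SizeGroup(2176, 2512, 2, "Focus view"),
]

projections = assign_projections(varjo_xr4)
total_pixels = sum(g.width * g.height * g.layers for g in varjo_xr4)

for group, indices in zip(varjo_xr4, projections):
    print(f"{group.description}: {group.width} x {group.height}, projections {indices}")
print(f"Total pixels per frame: {total_pixels}")
```

The point of the model is that the number of layers, not the number of viewports, determines how many projections are needed, and that each group can carry its own resolution.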

Projection changes

XRInterface.get_projection_for_view is consumed in RendererSceneCull::render_camera, where the projections are loaded into our RendererSceneRender::CameraData object. This object is currently limited to accepting 2 projections for stereo rendering. The logic needs to be enhanced so it can take any number of projections and create a combined frustum that encompasses all of them. We need to add checks to ensure the number of projections matches our viewport configuration and that the combined frustum does not exceed a FOV of 179 degrees.
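A minimal sketch of the two checks, under heavy simplifying assumptions: each view is described by four signed half-angles (left and down negative, in the style of OpenXR's `XrFovf`), all views share the same orientation, and the positional offset between eyes is ignored. The real engine works with projection matrices and per-view transforms, so `ViewFov` and `combine_fovs` are hypothetical names for illustration, not the proposed implementation.

```python
import math
from dataclasses import dataclass

MAX_FOV_DEGREES = 179.0  # Limit quoted in the proposal.


@dataclass(frozen=True)
class ViewFov:
    """Signed half-angles in radians; left and down are negative."""
    left: float
    right: float
    up: float
    down: float


def combine_fovs(views, expected_count):
    """Return a ViewFov enclosing every view, or raise if a check fails."""
    if len(views) != expected_count:
        raise ValueError(f"expected {expected_count} projections, got {len(views)}")
    # The enclosing frustum takes the most extreme angle on each side.
    combined = ViewFov(
        left=min(v.left for v in views),
        right=max(v.right for v in views),
        up=max(v.up for v in views),
        down=min(v.down for v in views),
    )
    horizontal = math.degrees(combined.right - combined.left)
    vertical = math.degrees(combined.up - combined.down)
    if horizontal > MAX_FOV_DEGREES or vertical > MAX_FOV_DEGREES:
        raise ValueError(
            f"combined FOV {horizontal:.1f} x {vertical:.1f} degrees exceeds {MAX_FOV_DEGREES}"
        )
    return combined


def fov_from_degrees(left, right, up, down):
    """Convenience constructor taking degrees instead of radians."""
    return ViewFov(*(math.radians(a) for a in (left, right, up, down)))


# Made-up angles for two wide context views and two narrow focus views.
views = [
    fov_from_degrees(-55, 45, 45, -45),
    fov_from_degrees(-45, 55, 45, -45),
    fov_from_degrees(-15, 10, 10, -10),
    fov_from_degrees(-10, 15, 10, -10),
]
combined = combine_fovs(views, expected_count=4)
```

Because the focus views sit inside the context views, they do not widen the combined frustum here; only the outermost angles matter for culling.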

Viewport changes

Here we need to make a few changes.

  1. We need to properly support layers by changing RenderingServer.viewport_set_size to the following definition: viewport_set_size(viewport : RID, index : int, width : int, height : int, layers : int). We then ensure this information is correctly stored and that buffers are correctly allocated including properly handling TextureStorage.render_target_set_override (which will require similar enhancements).
  2. We add a deprecated function for viewport_set_size(viewport, width, height) that calls the new implementation with index set to 0 and layers set to 1.
  3. We remove the code from RendererViewport::draw_viewports that updates the viewport size and layers. Instead, this implementation moves to our Viewport class. If use_xr is false we set size and layers from the properties specified (layers is added as a new property); if use_xr is true we hide these properties and obtain the information from the XRInterface.
  4. If use_xr is true we also allow for the implementation of multiple layers.
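To make points 1 and 2 concrete, here is a small stand-in model of how the server could store one size entry per index and how the deprecated call could forward to the new one. It is a sketch, not engine code: `RenderingServerModel`, `SizeEntry` and `viewport_get_sizes` are hypothetical, and since Python has no overloading the deprecated three-argument form is given the made-up name `viewport_set_size_compat`.

```python
import warnings
from dataclasses import dataclass


@dataclass(frozen=True)
class SizeEntry:
    width: int
    height: int
    layers: int


class RenderingServerModel:
    """Toy model of the per-viewport size storage (not the real server)."""

    def __init__(self):
        self._viewports = {}  # viewport id -> {index: SizeEntry}

    def viewport_create(self):
        rid = len(self._viewports) + 1
        self._viewports[rid] = {}
        return rid

    def viewport_set_size(self, viewport, index, width, height, layers):
        """New form: one resolution and layer count per size index."""
        if index < 0 or width < 0 or height < 0 or layers < 1:
            raise ValueError("invalid size entry")
        self._viewports[viewport][index] = SizeEntry(width, height, layers)

    def viewport_set_size_compat(self, viewport, width, height):
        """Deprecated form: forwards with index 0 and a single layer."""
        warnings.warn(
            "use viewport_set_size(viewport, index, width, height, layers)",
            DeprecationWarning,
            stacklevel=2,
        )
        self.viewport_set_size(viewport, 0, width, height, 1)

    def viewport_get_sizes(self, viewport):
        """Return the stored entries ordered by index."""
        entries = self._viewports[viewport]
        return [entries[i] for i in sorted(entries)]


server = RenderingServerModel()
xr_viewport = server.viewport_create()
# The 2 + 2 configuration from the Varjo XR-4 example.
server.viewport_set_size(xr_viewport, 0, 1920, 1904, 2)
server.viewport_set_size(xr_viewport, 1, 2176, 2512, 2)
```

Existing callers keep working through the deprecated form, while an XR viewport can register several size entries on the same viewport.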

There are probably some more tweaks needed around this.

Rendering changes

As mentioned, we need to enhance our implementation in RendererSceneCull::_render_scene. Where we currently call scene_render->render_scene once, this needs to become a loop that iterates over our sizes and then over the layers of each size in steps of our max layer size, calling scene_render->render_scene with the proper offsets each time. It's likely there will be few, possibly no, changes in the render_scene implementation itself.
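The loop could be planned roughly as below. This is only a sketch of the iteration order: `plan_render_calls` and `RenderCall` are hypothetical names, each planned entry stands for one `render_scene` call, and the assumption is that "max layer size" means the number of layers one call can cover (2 with multiview, 1 without).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RenderCall:
    size_index: int    # Which resolution group is rendered.
    layer_offset: int  # First layer within that group.
    layer_count: int   # Number of layers covered by this call.
    view_offset: int   # First projection index used by this call.


def plan_render_calls(layers_per_size, max_layers_per_call):
    """Split every size group into chunks of at most max_layers_per_call."""
    if max_layers_per_call < 1:
        raise ValueError("max_layers_per_call must be at least 1")
    calls = []
    view_offset = 0  # Projections are numbered consecutively across groups.
    for size_index, layers in enumerate(layers_per_size):
        for layer_offset in range(0, layers, max_layers_per_call):
            layer_count = min(max_layers_per_call, layers - layer_offset)
            calls.append(
                RenderCall(size_index, layer_offset, layer_count, view_offset + layer_offset)
            )
        view_offset += layers
    return calls


# 2 + 2 with multiview available: one call per resolution.
multiview_calls = plan_render_calls([2, 2], max_layers_per_call=2)
# Same configuration without multiview: one call per layer.
fallback_calls = plan_render_calls([2, 2], max_layers_per_call=1)

for call in multiview_calls:
    print(call)
```

Culling would still happen once before this loop; only the number of `render_scene` invocations changes with the hardware capabilities.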

If this enhancement will not be used often, can it be worked around with a few lines of script?

No, this can only be solved in core.

Is there a reason why this should be core and not an add-on in the asset library?

No, this can only be solved in core.

dsnopek commented 1 month ago

Thanks!

At a high level this seems good. But there's a couple of details where I'm unsure what you have in mind.

  • The ability to specify more than 2 camera projections. Within the scope of this proposal this is a change purely on the implementation consuming XRInterface.get_projection_for_view to allow for obtaining more than 2 views.

How do the Viewport and XRInterface coordinate on how each view should be used? Would each Viewport specify which views it's meant to be rendering to?

So, like in the case of OpenXR's XR_VIEW_CONFIGURATION_TYPE_PRIMARY_STEREO_WITH_FOVEATED_INSET, would you have two Viewports with one configured to render views 0 and 1, and the other configured to render views 2 and 3?

Since XRInterface is general for all XR systems, I'd be wary of assuming that 4 views automatically means quad view based rendering as specified in OpenXR.

I'm also curious how they would coordinate on the fallback case, where multiview isn't supported. Would this be two Viewports, but both mono viewports, with one configured to render view 0 and the other view 1? As a fallback, multiple viewport is less ideal, since the developer doesn't necessarily know they need the fallback - it may only be required on some systems, but not others. For example, with WebXR, multiview may work fine in some web browsers (Meta Quest, desktop Chrome or Firefox) but not others (Pico and Vision Pro)

  • A change in our RendererSceneCull implementation where we call RendererSceneRender.render_scene, instead of calling this once, we'll loop through our camera and viewport data to call this function multiple times based on our capabilities.

So, would RendererSceneCull::_render_scene still only be run once per Viewport (assuming 2 Viewports when doing OpenXR's "stereo with foveated inset")? Even though we'd need to run it multiple times, I think we still want to run it as few times as possible

BastiaanOlij commented 1 month ago

@dsnopek no, it will be one viewport but supporting additional layers that can be at a different resolution.

Also it's always the XRInterface that steers things. The viewport's responsibility is to simply render the layers it contains; the XRInterface tells it what the layers are, and what projection matrices are used.

With the fallback, still one viewport, but as we now have a loop that calls render_scene, we can make that loop either call that for each layer, or group layers of the same resolution together.

BastiaanOlij commented 1 month ago

Note that my original POC did work with two viewports, and that opened up a world of hurt because of the way our environment and world system works.

dsnopek commented 1 month ago

Ah, ok, thanks!

Also it's always the XRInterface that steers things. The viewport's responsibility is to simply render the layers it contains; the XRInterface tells it what the layers are, and what projection matrices are used.

How does it drive it?

We'd need to support at least these layer configurations: "2" (multiview like we have now), "1+1" (for the non-multiview fallback) and "2+2" (for the foveated inset). And in the "+" cases, we'd need to support different resolutions.

(If we ever wanted to support XR_MSFT_first_person_observer, that could maybe be done as a "2+1"?)

Are you imagining new virtual methods on XRInterface that will return additional information about what layers and resolutions that it wants? Or, maybe it'll be passed the viewport RID and make modifications to it directly?

I'm sure this would be clear on seeing the code, but it's hard for me to picture all the details from the description :-)

BastiaanOlij commented 1 month ago

Yes, likely there will need to be enhancements to XRInterface on obtaining viewport data, and passing through views differently. That's one of the reasons I want to remove some of the logic from the rendering server and move it to Viewport itself. But I can't make those changes until after I make these changes; they'll be a follow-up from this as I pick up https://github.com/godotengine/godot/pull/81505, which this is a prelude to.

First person observer likely would need to be a separate camera though, I'm not entirely sure. Though I don't think that anyone really supports it anymore with WMR being removed from Windows.

dsnopek commented 1 month ago

Yes, likely there will need to be enhancements to XRInterface on obtaining viewport data, and passing through views differently. That's one of the reasons I want to remove some of the logic from the rendering server and move it to Viewport itself. But I can't make those changes until after I make these changes

Ok! I think understanding the XRInterface changes would help me to understand these changes in context, but if you don't want to address that here, that's fine.

First person observer likely would need to be a separate camera though, I'm not entirely sure. Though I don't think that anyone really supports it anymore with WMR being removed from Windows.

The spec says the 3rd view can share culling and render pass with the primary views, so it does seem like it might be a candidate for an approach like this:

Image

But, yeah, I don't know that this is an extension we'd ever want to implement, just using it as an example of another possible layer setup that could exist