Closed toji closed 3 years ago
WebGPU-wise it looks mostly good.
`imageIndex` matches the `arrayLayer` concept in WebGPU, so since `XRGPUSubImage` is more in the WebGPU world, maybe it would make sense to make the concept name match?
Another thing is that `GPUTexture` must have a known format and usage; where is it specified for the textures in `XRGPUSubImage`? The format will be important to create pipelines that render to the textures (WebGPU has a validation rule that the pipeline's color attachment format must match the render pass's color attachment format). And the usage is important if it needs to support more than just `OUTPUT_ATTACHMENT`.
What's the initial content of textures in XRSubImage? Is it fair to assume they are going to start (lazy) zeroed?
> `imageIndex` matches the `arrayLayer` concept in WebGPU, ... maybe it would make sense to make the concept name match?
I'd be fine with that! Also (heh) I just realized that we have the opportunity to make this even simpler by providing a `GPUTextureViewDescriptor` directly. The only thing it'll really have to specify is the `baseArrayLayer` and `arrayLayerCount`, but the fact that we can just say "This is the subresource you want" will make developers' lives easier while not preventing them from inspecting the values and doing their own thing if they really need to.
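To make that concrete, here is a hypothetical sketch of what such a per-view descriptor might contain for a texture-array layer. The field names follow WebGPU's `GPUTextureViewDescriptor` dictionary; the specific values and the helper function are illustrative assumptions, not proposed API:

```javascript
// Hypothetical: the GPUTextureViewDescriptor an XRGPUSubImage might expose
// for view N of a texture-array layer. Only the array-layer fields really
// need to vary per view; everything else selects the whole level-0 subresource.
function subImageViewDescriptor(viewIndex) {
  return {
    dimension: '2d',
    baseArrayLayer: viewIndex, // which eye's layer in the array texture
    arrayLayerCount: 1,        // "this is the subresource you want"
    baseMipLevel: 0,
    mipLevelCount: 1,
  };
}

// An app could pass the descriptor straight through, e.g.:
// colorTexture.createView(subImageViewDescriptor(0));
```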
> `GPUTexture` must have a known format and usage, where is it specified for the textures in `XRGPUSubImage`?
Oh! Good point! A given layer should have the same texture properties for its lifetime, so I'd imagine we'd want to just pass the format and usage in to the `create____Layer` methods. I was originally thinking that usage would have to include `OUTPUT_ATTACHMENT`, but then realized that in some cases it would be perfectly reasonable to only need `COPY_DST`, so we're probably better off just letting the developer give us the full usage flags. (The user agent may need its own usage internally, like `COPY_SRC`, so we'd have to figure out how to handle that too.)
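A quick sketch of how those flags compose. The flag values below match the WebGPU draft of the time (`OUTPUT_ATTACHMENT = 0x10` appears in the IDL later in this thread); the idea of the UA OR-ing in its own internal usage is a guess at the problem described above, not settled behavior:

```javascript
// GPUTextureUsage flag values as they stood in the WebGPU draft at the time.
const GPUTextureUsage = {
  COPY_SRC: 0x01,
  COPY_DST: 0x02,
  SAMPLED: 0x04,
  STORAGE: 0x08,
  OUTPUT_ATTACHMENT: 0x10,
};

// The developer asks only for what they need (e.g. a copy-only layer)...
const requestedUsage = GPUTextureUsage.COPY_DST;

// ...and the UA could OR in whatever it needs internally (hypothetical):
const uaInternalUsage = GPUTextureUsage.COPY_SRC;
const effectiveUsage = requestedUsage | uaInternalUsage; // 0x03
```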
Along a similar line, I've realized that maybe we don't need the `XRTextureType` here like we do in WebGL, since the textures will always have a dimensionality of `2d`, and whether or not you treat it as an array (or a cube map) happens when you create the `GPUTextureView` over the top of it. In fact, given that we can guarantee texture array availability with WebGPU, we could maybe also ensure that only one texture is returned per frame: either a single layer with distinct viewports (largely for pre-rendered side-by-side layout content) or a layer array. That could even allow us to start using `GPUSwapChain`s if we really wanted to:
```js
// Render Loop for a projection layer with a WebGPU texture source.
const xrGpuBinding = new XRGPUBinding(xrSession, gpuDevice);

const layer = xrGpuBinding.createProjectionLayer({
  colorFormat: xrGpuBinding.preferredColorFormat,
  depthStencilFormat: xrGpuBinding.preferredDepthStencilFormat
}, { alpha: false });

const layerColorSwapChain = xrGpuBinding.getColorSwapChain(layer);
const layerDepthSwapChain = xrGpuBinding.getDepthSwapChain(layer);

xrSession.updateRenderState({ layers: [layer] });
xrSession.requestAnimationFrame(onXRFrame);

function onXRFrame(time, xrFrame) {
  xrSession.requestAnimationFrame(onXRFrame);

  const xrViewerPose = xrFrame.getViewerPose(xrRefSpace);
  const commandEncoder = gpuDevice.createCommandEncoder({});

  const colorTexture = layerColorSwapChain.getCurrentTexture();
  const depthTexture = layerDepthSwapChain.getCurrentTexture();

  for (const view of xrViewerPose.views) {
    // Still have to do this to get the region of the texture to render to.
    const subImage = xrGpuBinding.getViewSubImage(layer, view);

    // Render to the color and depth textures
    const passEncoder = commandEncoder.beginRenderPass({
      colorAttachments: [{
        attachment: colorTexture.createView(subImage.viewDescriptor),
        loadValue: 'load',
      }],
      depthStencilAttachment: {
        attachment: depthTexture.createView(subImage.viewDescriptor),
        depthLoadValue: 'load',
        depthStoreOp: 'store',
        stencilLoadValue: 'load',
        stencilStoreOp: 'store',
      }
    });

    const viewport = subImage.viewport;
    passEncoder.setViewport(viewport.x, viewport.y, viewport.width, viewport.height, 0.0, 1.0);

    // Render from the viewpoint of view

    passEncoder.endPass();
  }

  gpuDevice.defaultQueue.submit([commandEncoder.finish()]);
}
```
That's a bit awkward, as far as I'm concerned, so I'd probably avoid it unless there's a compelling current or future reason to use the `GPUSwapChain` mechanism that I'm not aware of.
> What's the initial content of textures in `XRSubImage`? Is it fair to assume they are going to start (lazy) zeroed?
Yeah, that would be the direction I'd like to go.
So given the above (and ignoring the potential `GPUSwapChain` integration for a moment), we could update the proposed IDL to be something like this:
```webidl
[Exposed=Window] interface XRGPUSubImage : XRSubImage {
  [SameObject] readonly attribute GPUTexture colorTexture;
  [SameObject] readonly attribute GPUTexture? depthStencilTexture;
  readonly attribute GPUTextureViewDescriptor viewDescriptor;
  readonly attribute unsigned long textureWidth;
  readonly attribute unsigned long textureHeight;
};

dictionary XRGPULayerTextureDescriptor {
  required GPUTextureFormat colorFormat;
  GPUTextureFormat? depthStencilFormat;
  GPUTextureUsageFlags usage = 0x10; // GPUTextureUsage.OUTPUT_ATTACHMENT
};

[Exposed=Window] interface XRGPUBinding {
  constructor(XRSession session, GPUDevice device);

  readonly attribute double nativeProjectionScaleFactor;
  readonly attribute GPUTextureFormat preferredColorFormat;
  readonly attribute GPUTextureFormat preferredDepthStencilFormat;

  XRProjectionLayer createProjectionLayer(XRGPULayerTextureDescriptor descriptor,
                                          optional XRProjectionLayerInit init);
  XRQuadLayer createQuadLayer(XRGPULayerTextureDescriptor descriptor,
                              optional XRQuadLayerInit init);
  XRCylinderLayer createCylinderLayer(XRGPULayerTextureDescriptor descriptor,
                                      optional XRCylinderLayerInit init);
  XREquirectLayer createEquirectLayer(XRGPULayerTextureDescriptor descriptor,
                                      optional XREquirectLayerInit init);
  XRCubeLayer createCubeLayer(XRGPULayerTextureDescriptor descriptor,
                              optional XRCubeLayerInit init);

  XRGPUSubImage getSubImage(XRCompositionLayer layer, XRFrame frame, optional XREye eye = "none");
  XRGPUSubImage getViewSubImage(XRProjectionLayer layer, XRView view);
};
```
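For illustration, the dictionary defaulting above can be sketched as a plain helper (`applyLayerTextureDefaults` is a hypothetical function for this comment, not part of the proposal):

```javascript
// Hypothetical helper mirroring XRGPULayerTextureDescriptor's defaults:
// colorFormat is required, depthStencilFormat is nullable, and usage
// defaults to GPUTextureUsage.OUTPUT_ATTACHMENT (0x10).
function applyLayerTextureDefaults(desc) {
  if (!desc.colorFormat) throw new TypeError('colorFormat is required');
  return {
    colorFormat: desc.colorFormat,
    depthStencilFormat: desc.depthStencilFormat ?? null,
    usage: desc.usage ?? 0x10, // GPUTextureUsage.OUTPUT_ATTACHMENT
  };
}
```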
> Another thing is that `GPUTexture` must have a known format and usage, where is it specified for the textures in `XRGPUSubImage`? The format will be important to create pipelines that render to the textures (WebGPU has a validation rule that the pipeline's color attachment format must match the render pass's color attachment format).
Is it necessary to allow the author to create any type of format for the swapchain? If we allow this, should there also be a feature to query which formats are supported?
```js
const gpuAdapter = await navigator.gpu.getAdapter({xrCompatible: true});
const gpuDevice = await gpuAdapter.requestDevice();
const xrGpuBinding = new XRGPUBinding(xrSession, gpuDevice);
```
Could this all be collapsed into a single call?
I think this proposal looks very reasonable! If it's accepted, should we merge it into the current layers spec?
In the latest proposal, the `viewDescriptor` seems to be for both the color and the depth. Are there cases where the color and the depth would have different descriptors?
The `XRGPULayerTextureDescriptor` is nice, but are there any constraints on the formats and usages that can be used with the platform APIs? If the platform APIs are very strict, maybe there could be a preferred format (and usage?) exposed, a bit like `GPUCanvasContext.getPreferredFormat`. (Or the `XRGPULayer` could tell the application which format it wants it to use.)
> Is it necessary to allow the author to create any type of format for the swapchain? If we allow this, should there also be a feature to query which formats are supported?
I do have an attribute to get the preferred format, but if we allow developers to specify any format we'll probably need an `xrEnumerateSwapchainFormats` equivalent.
> Could this all be collapsed into a single call?
Not clear on how, or why that would be desirable. (Please note the exact WebGPU initialization sequence is still undergoing some discussion.)
> If it's accepted, should we merge it into the current layers spec?
Given that WebGPU still isn't shipped I'd be hesitant to make it a dependency of the base layers API. I think they can stay separate for now.
> In the latest proposal, the `viewDescriptor` seems to be for both the color and the depth. Are there cases where the color and the depth would have different descriptors?
No, that shouldn't occur in this context. Given that a depth texture requested this way is allowed to be used in compositing, you'll never have anything but a 1:1 relationship between color and depth subresources.
> The `XRGPULayerTextureDescriptor` is nice, but are there any constraints on the formats and usages that can be used with the platform APIs?

There are some limits, as Rik mentioned. (I should have researched a bit more before updating my proposal.) I do have `preferredColorFormat`/`preferredDepthStencilFormat` attributes, but it seems like we'll need a bit more than that in the end. Probably a way to enumerate the supported formats, ordered by preference. We could always just let the UA pick the format, the way we do with WebGL, but I think we want to embrace the increased flexibility of WebGPU where we can.
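A preference-ordered enumeration could be consumed along these lines. This is a sketch of the lookup logic only; the format names and the shape of `supportedColorFormats` are assumptions for illustration, not proposed API:

```javascript
// Hypothetical: pick the first format the device supports from an
// app-supplied preference list, falling back to the UA's preferred format
// (supportedColorFormats is assumed to be ordered by UA preference).
function pickColorFormat(appPreferences, supportedColorFormats) {
  for (const format of appPreferences) {
    if (supportedColorFormats.includes(format)) return format;
  }
  return supportedColorFormats[0];
}

pickColorFormat(['rgba16float', 'rgba8unorm'],
                ['bgra8unorm', 'rgba8unorm']); // → 'rgba8unorm'
```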
Final thought: Just realized that currently the layer init indicates things like whether or not you want alpha or depth buffers, but in this environment that would be implicit in the formats you provide, so we'll want to re-structure that.
> I do have an attribute to get the preferred format, but if we allow developers to specify any format we'll probably need an `xrEnumerateSwapchainFormats` equivalent.
Ah yes, I missed it. The idea for the `GPUSwapChain` is that there will be a small list of allowed formats in the specification in addition to the preferred format, but they might cause an extra conversion copy. (Currently it's only `bgra8unorm`.)
> Probably a way to enumerate the supported formats, ordered by preference.
Or just the preferred format plus a fixed allow-list in the spec. Maybe the usage could allow just `OUTPUT_ATTACHMENT` to start, and we can see if we need to add `COPY_DST` later (we'll need to look at whether the platform APIs allow it).
I've taken the feedback from this thread so far and updated the explainer text I posted above, which I've now pushed to https://github.com/toji/webxr-webgpu-binding/blob/main/explainer.md for the purposes of previewing and discussion.
/agenda to ask about creating an official Immersive Web repo for the feature. Discussion here seems positive and I doubt anyone is against seeing this integration happen at some point.
Why not skip the `GPUTexture` completely, like:

```webidl
[SameObject] readonly attribute GPUTextureView colorView;
[SameObject] readonly attribute GPUTextureView? depthStencilView;
```

This way you don't need a descriptor, and you don't need that awkward view creation on every frame.

This would completely prevent using the textures as `COPY_DST`; maybe that's fine given that it's a rare use case, and giving the views directly would be a good usability improvement.
> Why not skip the `GPUTexture` completely, like:
> `[SameObject] readonly attribute GPUTextureView colorView;`
> `[SameObject] readonly attribute GPUTextureView? depthStencilView;`
Would that work with multiview?
`COPY_DST` usage seems like it would be desirable for a lot of non-projection layer types, which will frequently be populated directly from an `ImageBitmap` or similar. (In fact, we may even want to make `COPY_DST` part of the default usage for those layer types... hm.)
In XR/VR I would expect to see all the work happening inside a single render pass (or one pass per eye, at least). Anything that you'd need to copy to screen would be drawn as quads, so that render pass is not disrupted, and mobile GPUs can do their tiling efficiently.
That's true of anything rendered into what we call "projection layers", which is what's used to render your typical immersive content. Where `COPY_DST` usage comes in is Quad/Cylinder/Equirect/Cube layers, which are frequently updated just once or very infrequently, with positioning handled by the XR compositor after that. For example: loading an equirect image as a skybox. It'll be a pretty natural code path to upload it directly from an image tag/`ImageBitmap` into the layer texture and then never touch it again.
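That upload-once path might look something like the sketch below. The `getSubImage` call is the proposed API from this thread; `copyExternalImageToTexture` is the current WebGPU queue method for this kind of upload (the era's draft used a different name), and the helper assumes the layer was created with `COPY_DST` usage:

```javascript
// Sketch: populate a quad/equirect layer's texture once from an ImageBitmap,
// then never touch it again. xrGpuBinding.getSubImage() is the proposed API;
// the layer must have been created with COPY_DST in its usage flags.
function uploadLayerImage(gpuDevice, xrGpuBinding, layer, frame, imageBitmap) {
  const subImage = xrGpuBinding.getSubImage(layer, frame);
  gpuDevice.queue.copyExternalImageToTexture(
    { source: imageBitmap },
    { texture: subImage.colorTexture },
    [imageBitmap.width, imageBitmap.height]);
  return subImage;
}
```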
Is this moving to the https://github.com/immersive-web/WebXR-WebGPU-Binding/ repository? (housekeeping)
Yes
As WebGPU gets closer to a shippable state, I think it's time we begin looking seriously at what the WebXR/WebGPU interface should be. For anyone that's been following the Layers module work it should be unsurprising that my proposal is to build on those mechanisms with a proposed `XRGPUBinding` interface that mirrors the existing `XRWebGLBinding`.

I don't think anything here is too controversial, but I wanted to put this up in proposals prior to requesting a repo for it to get some preliminary feedback. There are a couple of things worth pointing out:
- `XRGPU___` seems a bit weird, given that the WebGL equivalent is `XRWebGL___`. But WebGPU's naming convention is `GPU___` rather than `WebGPU___`, so I wanted to try to stick with replicating that.
- WebGPU has a `GPUSwapChain` concept that is relatively similar to the pattern that we want here, so it's tempting to use that. The reason I didn't initially is that it doesn't offer a way to specify arguments like `XRView` or `XRFrame`, doesn't allow for multiple textures (whereas we want to support both a color and a depth/stencil texture), and doesn't allow the texture dimensions or viewport to be reported (not a problem for regular WebGPU usage, because those values come from the canvas).

Below is a first pass at explainer text for the proposed module, which was relatively simple to produce given that it borrows so much from the Layers explainer.
WebXR/WebGPU binding
WebXR is well understood to be a demanding API in terms of graphics rendering performance, a task that has previously fallen entirely to WebGL. The WebGL API, while capable, is based on the relatively outdated native APIs which have recently been overtaken by more modern equivalents. As a result, it can sometimes be a struggle to implement various recommended XR rendering techniques in a performant way.
The WebGPU API is an upcoming API for utilizing the graphics and compute capabilities of a device's GPU more efficiently than WebGL allows, with an API that better matches both GPU hardware architecture and the modern native APIs that interface with it, such as Vulkan, Direct3D 12, and Metal. As a result, it offers the potential for developers to get significantly better performance in their WebXR applications.
This module aims to allow the existing WebXR Layers module to interface with WebGPU by providing WebGPU swap chains for each layer type.
WebGPU binding
As with the existing WebGL path described in the Layers module, all WebGPU resources required by WebXR would be supplied by an `XRGPUBinding` instance, created with an `XRSession` and `GPUDevice` like so:

Note that the `GPUAdapter` must be requested with the `xrCompatible` option set to `true`. This mirrors the WebGL context creation arg of the same name, and ensures that the returned adapter will be one that is compatible with the UA's selected XR device.

Once the `XRGPUBinding` instance has been created, it can be used to create the various `XRCompositionLayer`s, just like `XRWebGLBinding`:

This allocates a layer that supplies a `2d-array` `GPUTexture` as its output surface.

As with the base XR Layers module,
`XRGPUBinding` is only required to support `XRProjectionLayer`s unless the `layers` feature descriptor is supplied at session creation and supported by the UA/device. If the `layers` feature descriptor is requested and supported, however, all other `XRCompositionLayer` types must be supported. Layers are still set via `XRSession`'s `updateRenderState` method, as usual:

Rendering
During `XRFrame` processing each layer can be updated with new imagery. Calling `getViewSubImage()` with a view from the `XRFrame` will return an `XRGPUSubImage` indicating the textures to use as the render target and what portion of the texture will be presented to the `XRView`'s associated physical display.

WebGPU layers allocated with the
`'texture'` type will provide sub images with a `viewport` and an `imageIndex` of `0` for each `XRView`. Note that the `colorTexture` and `depthStencilTexture` can be different between the views.

WebGPU layers allocated with the
`'texture-array'` type will provide sub images with the same `viewport` and a unique `imageIndex` indicating the texture layer to render to for each `XRView`. Note that the `colorTexture` and `depthStencilTexture` are the same between views; just the `imageIndex` is different.

Non-projection layers, such as
`XRQuadLayer`, may only have 1 sub image for `'mono'` layers and 2 sub images for `'stereo'` layers, which may not align exactly with the number of `XRView`s reported by the device. To avoid rendering the same view multiple times in these scenarios, non-projection layers must use `XRGPUBinding`'s `getSubImage()` method to get the `XRSubImage` to render to.

For mono textures the
`XRSubImage` can be queried using just the layer and `XRFrame`:

For stereo textures the target `XREye` must be given to `getSubImage()` as well:

Proposed IDL