gkjohnson / three-gpu-pathtracer

Path tracing renderer and utilities for three.js built on top of three-mesh-bvh.
https://gkjohnson.github.io/three-gpu-pathtracer/example/bundle/index.html
MIT License

Support for WebGPU #547

Open jellychen opened 7 months ago

jellychen commented 7 months ago

In a simple normals-rendering test, WebGPU compute shaders turned out to be about 20 times faster than WebGL's GPGPU approach, so rendering with WebGPU would be much quicker.

I discovered the following project: https://lgltracer.com/editor/index.html. It renders somewhat faster than three-gpu-pathtracer, but it is not open source, which makes it difficult for us to make custom modifications when needed.

Therefore, I think WebGPU support would have significant value.

So, is there any plan for three-gpu-pathtracer to support WebGPU?

jellychen commented 7 months ago

I am considering building on the three-mesh-bvh project, but the data format and layout used by three-mesh-bvh are not well documented. Could more detailed documentation be provided for this?

Thank you for this project. @gkjohnson

gkjohnson commented 7 months ago

Hello! Yes, a compute shader is much better suited for this kind of rendering. If you'd like to help make a WebGPU compute shader version of the path tracer in this project that would be great, and I can help that along, as well. As you've mentioned, adding support for raytracing against the three-mesh-bvh structure in a compute shader would be the first step.

It would be best to create an issue in the three-mesh-bvh repo for that change so we can track progress there, but here's a breakdown of the flattened bvh data. One node in the BVH structure is stored using 32 bytes with the following layout:

Common Data

| Bytes | Name | Data |
| --- | --- | --- |
| 3 floats (12 bytes) | bounds min | The vector representing the minimum values for the bounding box |
| 3 floats (12 bytes) | bounds max | The vector representing the maximum values for the bounding box |

If the node is a parent node then the final 8 bytes are laid out as follows. The left child is assumed to be the node immediately after the parent node:

| Bytes | Name | Data |
| --- | --- | --- |
| UInt32 (4 bytes) | right child index | The node index of the right child |
| UInt32 (4 bytes) | split axis | The axis the parent node was split on: x, y, or z (0, 1, or 2) |

If the node is a leaf (ie it directly contains triangles) then the final 8 bytes are laid out as follows. The "leaf node" flag must be checked first to determine which kind of node has been encountered:

| Bytes | Name | Data |
| --- | --- | --- |
| UInt32 (4 bytes) | offset | The offset into the index or indirect buffer for the triangles stored in this node |
| UInt16 (2 bytes) | count | The number of triangles stored in this node |
| UInt16 (2 bytes) | is leaf node | All bits set to 1 (0xFFFF) to indicate this is a leaf node |
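Putting the layout above together, a minimal decoding sketch in plain JavaScript. This is a hypothetical helper for illustration, not part of the three-mesh-bvh API, and it assumes the 32-byte CPU node layout described in the tables:

```javascript
const BYTES_PER_NODE = 32;
const IS_LEAFNODE_FLAG = 0xFFFF;

// Decode one node from the flattened BVH buffer described above.
function decodeNode( arrayBuffer, nodeIndex ) {

	const byteOffset = nodeIndex * BYTES_PER_NODE;
	const float32 = new Float32Array( arrayBuffer, byteOffset, 6 );
	const uint32 = new Uint32Array( arrayBuffer, byteOffset + 24, 2 );
	const uint16 = new Uint16Array( arrayBuffer, byteOffset + 28, 2 );

	const node = {
		boundsMin: [ float32[ 0 ], float32[ 1 ], float32[ 2 ] ],
		boundsMax: [ float32[ 3 ], float32[ 4 ], float32[ 5 ] ],
	};

	// check the leaf flag in the last two bytes first
	if ( uint16[ 1 ] === IS_LEAFNODE_FLAG ) {

		// leaf node: UInt32 triangle offset, UInt16 triangle count
		node.isLeaf = true;
		node.offset = uint32[ 0 ];
		node.count = uint16[ 0 ];

	} else {

		// parent node: right child index and split axis,
		// with the left child stored immediately after this node
		node.isLeaf = false;
		node.rightChildIndex = uint32[ 0 ];
		node.splitAxis = uint32[ 1 ];
		node.leftChildIndex = nodeIndex + 1;

	}

	return node;

}
```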
jellychen commented 7 months ago

I studied the following parts:

1. BVH ray intersection
2. Monte Carlo integration
3. Multiple importance sampling
4. BRDFs
5. Post-processing

What other parts need to be added? The workload is not small.

gkjohnson commented 7 months ago

It's not a small amount of work but I think taking this one step at a time is best and perhaps others will start helping, as well. The first step will be getting raytracing working on WebGPU compute shaders in a simple case. Something like this three-mesh-bvh normals path tracing example or this lambert one but with compute shaders. Once that's prepared we can start looking into adding a new WebGPU path tracer in this project and work up to all the same features in the WebGL version.

If you're looking for more references on path tracing in general I can recommend Peter Shirley's Ray Tracing in One Weekend online book series, and the PBR Book for more advanced concepts.

jellychen commented 7 months ago

It seems like there are some issues with the translation. Regarding the data structure, it should probably be like this:

| Bytes | Name | Data |
| --- | --- | --- |
| UInt32 (4 bytes) | offset | The offset into the index or indirect buffer for the triangles stored in this node |
| UInt16 (2 bytes) | is leaf node | All bits set to 1 (0xFFFF) to indicate this is a leaf node |
| UInt16 (2 bytes) | count | The number of triangles stored in this node |

If a geometry is non-indexed, then what does the "offset" refer to?

gkjohnson commented 7 months ago

> Regarding the data structure, it should probably be like this:

The data order stored in the CPU array buffer is offset, count, leaf node. You can see that data set here:

```js
const offset = node.offset;
const count = node.count;

uint32Array[ stride4Offset + 6 ] = offset;
uint16Array[ stride2Offset + 14 ] = count;
uint16Array[ stride2Offset + 15 ] = IS_LEAFNODE_FLAG;
return byteOffset + BYTES_PER_NODE;
```

This is the representation of the BVH used for storing and traversing the tree on the CPU.

When the data is packed into a texture it is stored as you have laid it out - in offset, leaf node, count. You can see that here:

```js
const count = COUNT( nodeIndex16, uint16Array );
const offset = OFFSET( nodeIndex32, uint32Array );

const mergedLeafCount = 0xffff0000 | count;
contentsArray[ i * 2 + 0 ] = mergedLeafCount;
contentsArray[ i * 2 + 1 ] = offset;
```

This is done because we cannot read UInt32s and UInt16s directly from the same texture, so the leaf node flag and count values are manually packed into a single UInt32. "count" is stored in the latter two bytes so no extra bit shifting is required to read it.
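The packing described above can be sketched in plain JavaScript. These are hypothetical helpers for illustration that mirror the merged UInt32 layout, not functions from either library:

```javascript
// Pack the leaf flag and triangle count into one UInt32:
// upper 16 bits all set marks a leaf, lower 16 bits hold the count.
function packLeaf( count ) {

	return ( 0xffff0000 | count ) >>> 0;

}

// Unpack on the reading side; no shift is needed for the count
// because it lives in the low bytes.
function unpackLeaf( merged ) {

	const isLeaf = ( merged >>> 16 ) === 0xffff;
	const count = merged & 0xffff;
	return { isLeaf, count };

}
```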

> If a geometry is non-indexed, then what does the "offset" refer to?

There are two scenarios. First, if a BVH is generated for geometry that has no index, an index is automatically generated so the triangles can be rearranged. Second, if the indirect option is set then no index is generated; instead a separate triangle buffer is generated and that is sorted. The indirect option works for both indexed and non-indexed geometry, though the indices must be converted if using them directly. You can read more on the indirect option in the options for MeshBVH.
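For the simple indexed case, the way a leaf's offset and count address triangles can be sketched like this. This is a hypothetical helper in plain JavaScript rather than shader code, assuming three floats per vertex in the position attribute:

```javascript
// Visit each triangle referenced by a leaf node. "offset" is the first
// triangle slot in the (BVH-sorted) index buffer, "count" how many follow.
function forEachLeafTriangle( indexArray, positionArray, offset, count, callback ) {

	for ( let i = 0; i < count; i ++ ) {

		const tri = offset + i;
		const ia = indexArray[ tri * 3 + 0 ];
		const ib = indexArray[ tri * 3 + 1 ];
		const ic = indexArray[ tri * 3 + 2 ];

		// each vertex is three consecutive floats in the position buffer
		callback(
			positionArray.subarray( ia * 3, ia * 3 + 3 ),
			positionArray.subarray( ib * 3, ib * 3 + 3 ),
			positionArray.subarray( ic * 3, ic * 3 + 3 )
		);

	}

}
```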

When rendering this on the GPU the "indirect" buffer is converted to a regular index buffer (see here). For the initial implementation of WebGPU support I think it's okay to only handle the simple case and not worry about the "indirect" flag.

Feel free to ask other questions. I know some of these things are not obvious. If there are things you think could be more clear in the code please feel free to make a PR to add some comments, as well.

jellychen commented 7 months ago

I am considering building a two-level BVH. Is there code for this in three-mesh-bvh? Or should I build a BVH for each mesh myself, and then construct a dedicated top-level BVH over the AABBs of those BVHs?

gkjohnson commented 7 months ago

> I am considering building a two-level BVH. Is there such code for three-mesh-bvh?

I assume you mean something like a scene-level BVH or a top level acceleration structure (TLAS)? This is not currently available in three-mesh-bvh but it would be a good addition to the project. Long term I wanted to create an abstraction of the MeshBVH class that would enable a BVH of scene objects to be queried (see https://github.com/gkjohnson/three-mesh-bvh/issues/388) but a simple implementation will be good for now.

I just ask that the first contributions to three-mesh-bvh be reasonably sized - so focusing on a single feature first would be best.
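The simple two-level idea discussed here could be sketched like this. These are hypothetical helpers, not three-mesh-bvh API; a real TLAS would build a BVH over the per-mesh boxes rather than iterating them, but a brute-force top level captures the structure:

```javascript
// Top level: test the ray against each mesh's world-space AABB and only
// descend into that mesh's own BVH (the bottom level) on a hit.
// Each entry is assumed to hold { boundsMin, boundsMax, ... } in world space.
function intersectTopLevel( meshEntries, ray, intersectBLAS ) {

	let best = null;
	for ( const entry of meshEntries ) {

		// cheap reject against the mesh bounds first
		if ( ! rayIntersectsBox( ray, entry.boundsMin, entry.boundsMax ) ) continue;

		const hit = intersectBLAS( entry, ray );
		if ( hit && ( ! best || hit.distance < best.distance ) ) best = hit;

	}

	return best;

}

// standard slab test against an axis-aligned box
function rayIntersectsBox( ray, min, max ) {

	let tmin = - Infinity, tmax = Infinity;
	for ( let a = 0; a < 3; a ++ ) {

		const inv = 1 / ray.direction[ a ];
		let t0 = ( min[ a ] - ray.origin[ a ] ) * inv;
		let t1 = ( max[ a ] - ray.origin[ a ] ) * inv;
		if ( inv < 0 ) [ t0, t1 ] = [ t1, t0 ];
		tmin = Math.max( tmin, t0 );
		tmax = Math.min( tmax, t1 );

	}

	return tmax >= Math.max( tmin, 0 );

}
```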

apurvazaveri commented 4 months ago

Just curious, is there any update on this?