NVIDIAGameWorks / Displacement-MicroMap-Toolkit

Regenerating a micromesh every frame from depth buffer data? #9

myaaaaaaaaa commented 3 months ago

Some quick calculations: a 1920×1088 depth buffer split into 16×16-pixel tiles, at two base triangles per tile, gives 120 × 68 × 2 = 16,320 base triangles; at subdivision level 4, each base triangle carries 256 microtriangles, which works out to two microtriangles per pixel.

Theoretically, 16,320 base triangles is few enough that it should be fast to update (or even fully rebuild) the micromesh's acceleration structure every frame, although the process of generating those base triangles from depth buffer tiles may be more involved.
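As a rough illustration of that tiling step, here is a CPU-side sketch; the `Vertex`/`Triangle` types are hypothetical, and a real implementation would presumably run as a compute shader writing straight into the acceleration structure build inputs:

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical output types, for illustration only.
struct Vertex   { float x, y, depth; };   // tile-corner position plus sampled depth
struct Triangle { uint32_t v0, v1, v2; };

// Split a width x height depth buffer into tileSize x tileSize tiles and
// emit two base triangles per tile. A 1920x1088 buffer with 16-pixel tiles
// yields 120 * 68 * 2 = 16,320 base triangles.
void buildBaseTriangles(const float* depth, uint32_t width, uint32_t height,
                        uint32_t tileSize,
                        std::vector<Vertex>& verts, std::vector<Triangle>& tris)
{
    const uint32_t tx = width / tileSize, ty = height / tileSize;

    // One shared vertex per tile corner, sampling depth at (clamped) corners.
    for (uint32_t y = 0; y <= ty; y++)
        for (uint32_t x = 0; x <= tx; x++) {
            const uint32_t px = std::min(x * tileSize, width - 1);
            const uint32_t py = std::min(y * tileSize, height - 1);
            verts.push_back({float(px), float(py), depth[size_t(py) * width + px]});
        }

    // Two triangles per tile, indexing into the (tx+1) x (ty+1) corner grid.
    for (uint32_t y = 0; y < ty; y++)
        for (uint32_t x = 0; x < tx; x++) {
            const uint32_t i = y * (tx + 1) + x;
            tris.push_back({i, i + 1, i + tx + 1});
            tris.push_back({i + 1, i + tx + 2, i + tx + 1});
        }
}
```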

I think that if this use case were officially supported, it would make a much more compelling case for doing raytracing in screen space. At the very least, it should be a direct improvement over raymarching, the current standard practice for screen space effects.

pixeljetstream commented 3 months ago

Interesting idea, though a few caveats:

  • the unorm11 precision wouldn't make it straightforward to actually match depth values.
  • computing a mip map chain for depth buffers is really quick and typically accelerates screen space ray tracing very well (see the sketch after this list).
  • the micromesh representation is slower to ray trace than traditional triangles.

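Regarding the depth mip chain, a minimal sketch of one reduction level, written CPU-side for clarity; a real implementation would use a compute shader, and whether min or max is the conservative bound depends on the depth convention:

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Build one level of a min-depth mip chain: each output texel is the minimum
// of a 2x2 block of input texels. A screen-space raymarcher can then step
// through coarse levels and skip regions the ray is guaranteed to pass above.
std::vector<float> downsampleMinDepth(const std::vector<float>& src,
                                      int width, int height)
{
    const int w = std::max(width / 2, 1), h = std::max(height / 2, 1);
    std::vector<float> dst(size_t(w) * h);
    for (int y = 0; y < h; y++)
        for (int x = 0; x < w; x++) {
            const int sx  = std::min(2 * x, width - 1),  sy  = std::min(2 * y, height - 1);
            const int sx1 = std::min(sx + 1, width - 1), sy1 = std::min(sy + 1, height - 1);
            const float d0 = src[size_t(sy)  * width + sx];
            const float d1 = src[size_t(sy)  * width + sx1];
            const float d2 = src[size_t(sy1) * width + sx];
            const float d3 = src[size_t(sy1) * width + sx1];
            dst[size_t(y) * w + x] = std::min({d0, d1, d2, d3});
        }
    return dst;
}
```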
myaaaaaaaaa commented 2 months ago

Thanks for the feedback!

Note that with proper raytracing, it's possible to unproject the screen space depth buffer back into a world space heightmap, and have it share a TLAS with the rest of the scene.

This would provide vastly more flexibility than raymarching, where rays must always be tested in screen space first, with a separate ray query issued only if the raymarch misses the depth buffer.
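For reference, the unprojection itself is just an inverse view-projection transform; a minimal sketch, assuming GLM and Vulkan's [0,1] depth range:

```cpp
#include <glm/glm.hpp>

// Unproject one depth-buffer texel back to a world-space position.
// uv is the texel center in [0,1]^2, depth is the sampled value in [0,1]
// (the Vulkan convention), and invViewProj is the inverse view-projection
// matrix. Vulkan NDC has +y pointing down, matching texture space; adjust
// the sign if the projection matrix already flips y.
glm::vec3 unprojectDepth(glm::vec2 uv, float depth, const glm::mat4& invViewProj)
{
    const glm::vec4 ndc(uv.x * 2.0f - 1.0f, uv.y * 2.0f - 1.0f, depth, 1.0f);
    const glm::vec4 world = invViewProj * ndc;
    return glm::vec3(world) / world.w;  // perspective divide
}
```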

> the unorm11 precision wouldn't make it straightforward to actually match depth values.

As an example of that extra flexibility: depth buffer pixels can simply be masked out, rather than having to work around redundant geometry by trying to match depth values.
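A minimal sketch of what that masking might look like, assuming a per-pixel mask and a hypothetical sentinel value that the base-triangle builder culls:

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical sentinel understood by the base-triangle builder: tiles whose
// corners all carry it are simply not emitted, instead of producing geometry
// that later has to be identified by matching depth values.
constexpr float kMaskedDepth = -1.0f;

// Overwrite masked pixels (e.g. a first-person weapon or UI elements) before
// the depth buffer is converted into base triangles.
void maskDepthPixels(float* depth, const uint8_t* mask, size_t count)
{
    for (size_t i = 0; i < count; i++)
        if (mask[i] != 0)
            depth[i] = kMaskedDepth;
}
```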

> computing a mip map chain for depth buffers is really quick and typically accelerates screen space ray tracing very well.
> the micromesh representation is slower to ray trace than traditional triangles.

On the other hand, the ability to trace a single TLAS for both screen space and world space geometry, as well as the ability to reduce subdivision levels for low-detail regions of the depth buffer during micromesh conversion, would provide its own performance benefits.
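A sketch of how per-tile subdivision levels might be chosen, using depth range as a crude flatness metric; the thresholds and names are purely illustrative:

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>

// Choose a micromesh subdivision level for one tile from its depth range:
// flat regions get few microtriangles, depth-discontinuous regions keep full
// resolution. At level 4 each base triangle carries 256 microtriangles, i.e.
// two per pixel for a 16x16 tile's two triangles.
uint32_t chooseSubdivLevel(const float* depth, uint32_t width,
                           uint32_t x0, uint32_t y0, uint32_t tileSize)
{
    float dMin = 1.0f, dMax = 0.0f;
    for (uint32_t y = y0; y < y0 + tileSize; y++)
        for (uint32_t x = x0; x < x0 + tileSize; x++) {
            const float d = depth[size_t(y) * width + x];
            dMin = std::min(dMin, d);
            dMax = std::max(dMax, d);
        }
    const float range = dMax - dMin;      // crude flatness metric
    if (range < 0.0005f) return 0;        // nearly flat: 1 microtriangle
    if (range < 0.005f)  return 2;        // gentle slope: 16 microtriangles
    return 4;                             // discontinuity: 256 microtriangles
}
```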

In practice, I suspect the two approaches would have different tradeoffs but end up with comparable real-world performance.

There's also something to be said for the ease of getting good real-world performance when raytracing is fully optimized by GPU vendors at the driver and hardware level, as opposed to raymarching in "software", where the burden is on engine developers to optimize for every single target platform (on top of raymarching already being difficult to implement in the first place).