KhronosGroup / Vulkan-Docs

The Vulkan API Specification and related tools

`VK_NV_ray_time_interval`/`VK_KHR_ray_time_interval` or something similar for Motion Blur / Time Interval Tracing #1429

Closed: devshgraphicsprogramming closed this issue 3 years ago

devshgraphicsprogramming commented 3 years ago

OptiX and Radeon Rays have supported motion blur (putting transformation keyframes in the Acceleration Structures and time variables on rays) for the past 3-4 years.

The only way to achieve this in Vulkan RT is the same awful hack one needs to render hair primitives: compute an OBB for each triangle over the time interval (for hair it's just the OBB of the hair primitive), always use a custom intersection shader and redo the work that the hardware units could have done (plus the context switch), and then make every triangle into an instance and abuse the TLAS (which also kills rebuild times, because Vulkan only supports a 2-level AS).
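For concreteness, a minimal sketch of the bounding step that hack requires on the host, assuming the standard `VK_KHR_acceleration_structure` headers (and using a conservative AABB for simplicity rather than the tighter OBB mentioned above): each triangle's keyframe vertices are collapsed into one box that is fed to Vulkan as `VK_GEOMETRY_TYPE_AABBS_KHR` custom geometry, and the actual time-dependent ray/triangle test then has to be redone in an intersection shader.

```cpp
// Hypothetical host-side helper: conservative per-triangle AABB over the
// whole exposure interval [0,1], assuming linear vertex motion between two
// keyframes. The result is fed to Vulkan as custom AABB geometry; the real
// intersection against the time-interpolated triangle happens in an
// intersection shader.
#include <vulkan/vulkan.h>
#include <algorithm>
#include <array>

using Vec3 = std::array<float, 3>;

VkAabbPositionsKHR sweptTriangleAabb(const Vec3 v0[3], const Vec3 v1[3]) {
    VkAabbPositionsKHR box{
        +1e30f, +1e30f, +1e30f,   // minX, minY, minZ
        -1e30f, -1e30f, -1e30f};  // maxX, maxY, maxZ
    auto grow = [&box](const Vec3& p) {
        box.minX = std::min(box.minX, p[0]); box.maxX = std::max(box.maxX, p[0]);
        box.minY = std::min(box.minY, p[1]); box.maxY = std::max(box.maxY, p[1]);
        box.minZ = std::min(box.minZ, p[2]); box.maxZ = std::max(box.maxZ, p[2]);
    };
    // Because the vertex motion is linear, the union of both keyframes also
    // bounds the triangle at every intermediate time.
    for (int i = 0; i < 3; ++i) { grow(v0[i]); grow(v1[i]); }
    return box;
}
```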

Given that Ampere has hardware acceleration for this, it would be nice if I could get some feature parity with Radeon Rays and OptiX.

AndrzejEndrju commented 3 years ago

true

dgkoch commented 3 years ago

It's definitely something we have on our future features roadmap.

Degerz commented 3 years ago

This issue can be closed now with the latest update to the Vulkan specification.

devshgraphicsprogramming commented 3 years ago

glorious

natevm commented 1 year ago

I would argue this issue should be reopened until Vulkan ray tracing adds support for motion blurred AABBs. Currently, only motion blurred triangles are supported.

It is still impractical (impossible?) to handle motion blurred curves in Vulkan like you can in OptiX or Radeon Rays.

JoshuaBarczak-Intel commented 1 year ago

In the short term, motion blurred AABBs can be emulated with intersection shaders (which can read the ray time). Is this sufficient?

natevm commented 1 year ago

> In the short term, motion blurred AABBs can be emulated with intersection shaders (which can read the ray time). Is this sufficient?

No, they cannot be emulated the way you describe, at least not without a severe performance penalty.

If I have an object composed of AABBs (say, a character with fur) that at time t = 0 is beyond the left edge of the screen, and at t = 1 has moved beyond the right edge, every ray will intersect every box of that object, resulting in an exhaustive linear search at the leaves and zero benefit from hardware ray tracing. Nearly every box will be hit by any view-aligned ray, because the static AABB of each motion-blurred procedural covers the entire screen, yet rays will hit few if any primitives inside those boxes due to the fast movement. Cases of bounding-box overlap over time like this are very common, especially when large motion is present.

This is why OptiX and NVIDIA GPUs support AABB interpolation in hardware. The internal nodes of the tree as well as the leaves of the tree must be interpolated by the hardware to the current ray time to achieve acceptable culling performance.
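To make the culling argument concrete, here is a minimal plain-C++ sketch (hypothetical helper names, not any Vulkan or OptiX API) comparing the static union box a software emulation is stuck with against the box an interpolating traversal would test at a given ray time:

```cpp
// Hypothetical illustration: static "union" AABB over the whole interval vs.
// the AABB linearly interpolated to a specific ray time.
#include <algorithm>
#include <cstdio>

struct Aabb { float min[3], max[3]; };

// Linearly interpolate an AABB between its t=0 and t=1 keyframes.
Aabb lerpAabb(const Aabb& a, const Aabb& b, float t) {
    Aabb r;
    for (int i = 0; i < 3; ++i) {
        r.min[i] = (1.0f - t) * a.min[i] + t * b.min[i];
        r.max[i] = (1.0f - t) * a.max[i] + t * b.max[i];
    }
    return r;
}

// Union of the two keyframe boxes: the only conservative static box available
// when the acceleration structure cannot interpolate AABBs itself.
Aabb unionAabb(const Aabb& a, const Aabb& b) {
    Aabb r;
    for (int i = 0; i < 3; ++i) {
        r.min[i] = std::min(a.min[i], b.min[i]);
        r.max[i] = std::max(a.max[i], b.max[i]);
    }
    return r;
}

float volume(const Aabb& b) {
    return (b.max[0] - b.min[0]) * (b.max[1] - b.min[1]) * (b.max[2] - b.min[2]);
}

int main() {
    // A small primitive that sweeps 100 units in x over the exposure interval.
    Aabb t0 = {{  0, 0, 0}, {  1, 1, 1}};
    Aabb t1 = {{100, 0, 0}, {101, 1, 1}};
    Aabb sweep  = unionAabb(t0, t1);          // what a static BVH must store
    Aabb atTime = lerpAabb(t0, t1, 0.37f);    // what interpolating hardware tests
    std::printf("static union volume: %.1f\n", volume(sweep));   // ~101
    std::printf("interpolated volume: %.1f\n", volume(atTime));  // ~1
    return 0;
}
```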

We also have some research demonstrating that hardware culling with AABBs is very useful for using RT cores for range queries (with each ray picking a unique "time" and having boxes expand to achieve an adjustable range query per ray). This is very useful for scientific visualization using ray tracing, but at the moment we are forced to use OptiX because Vulkan lacks motion blurred bounding boxes.

dgkoch commented 1 year ago

Thank you for the feedback. We'll add this to the feature list for potential inclusion in a future multi-vendor extension if/when that happens.

natevm commented 1 year ago

> Thank you for the feedback. We'll add this to the feature list for potential inclusion in a future multi-vendor extension if/when that happens.

Just to be a bit more clear, this idea is why I'd like motion blurred bounding boxes to be added to Vulkan. Motion blurred boxes can be used for nearest neighbor queries, which would be useful for things like signed distance fields, photon mapping, and potentially collision detection:

https://patents.google.com/patent/US20230206378A1/en

devshgraphicsprogramming commented 1 year ago

Well, if you've just filed a patent in 2021, it's unfortunate for the adoption of such methods.

Linearly interpolating AABBs makes little sense, unlike vertices in the NVIDIA motion blur acceleration structure extension, because AABBs are used to bound complex non-linear geometry (unlike triangles, which are 100% linear), so whatever underlying animation you might have there (like hair represented by Beziers) will not be correctly bounded over the time interval 0 < t < 1 99% of the time.
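As a small hypothetical illustration of that point (plain C++, not tied to any API): animate a point along a quadratic Bezier path and lerp its keyframe boxes; at intermediate times the actual position can escape the interpolated box.

```cpp
// Hypothetical illustration: a point animated along a quadratic Bezier path.
// Lerping its t=0 and t=1 bounding boxes does NOT bound it at t=0.5.
#include <cstdio>

struct Vec3 { float x, y, z; };

Vec3 bezier(Vec3 p0, Vec3 p1, Vec3 p2, float t) {
    float a = (1 - t) * (1 - t), b = 2 * (1 - t) * t, c = t * t;
    return { a * p0.x + b * p1.x + c * p2.x,
             a * p0.y + b * p1.y + c * p2.y,
             a * p0.z + b * p1.z + c * p2.z };
}

int main() {
    Vec3 p0{0, 0, 0}, p1{0, 10, 0}, p2{2, 0, 0};  // strongly curved motion
    // Keyframe boxes: tight boxes of half-extent 0.1 around the endpoints.
    // The linearly interpolated box at t = 0.5 is centred halfway between them.
    Vec3 centre{ (p0.x + p2.x) * 0.5f, (p0.y + p2.y) * 0.5f, (p0.z + p2.z) * 0.5f };
    Vec3 actual = bezier(p0, p1, p2, 0.5f);        // y = 5
    std::printf("interpolated box centre y: %.1f +/- 0.1\n", centre.y);  // 0.0
    std::printf("actual position y at t=0.5: %.1f\n", actual.y);         // 5.0
    // The animated point sits 5 units outside the lerped box: nonlinear
    // motion is not conservatively bounded by linearly interpolated AABBs.
    return 0;
}
```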

It makes more sense to use intersection shaders or multiple BLASes & TLASes (think of them like "keyframes" in an animation) in the case you've described. Remember that motion blur does not change the topology of the BLAS, so large transformations/movements are likely to have the same detrimental effect on performance there as updating instead of rebuilding.

Triangles are different, because doing the ray-triangle intersection "post time interpolation" is pretty much required to achieve anything; also, because triangles are linear, you can probably interpolate their aggregate AABBs (even higher in the hierarchy) linearly too.
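For what it's worth, a quick numeric check of that last claim (plain C++, purely illustrative): for vertices that move linearly between two keyframes, the linearly interpolated keyframe bounds do conservatively contain the interpolated vertices at every intermediate time.

```cpp
// Numeric check: for linear vertex motion, lerping the keyframe AABB bounds
// gives a box that still contains every lerped vertex at any intermediate time.
#include <algorithm>
#include <cassert>
#include <cstdlib>

int main() {
    std::srand(42);
    auto rnd = [] { return static_cast<float>(std::rand()) / RAND_MAX * 20.f - 10.f; };
    const int kVerts = 64;
    float v0[kVerts], v1[kVerts];                    // one coordinate axis suffices
    for (int i = 0; i < kVerts; ++i) { v0[i] = rnd(); v1[i] = rnd(); }

    float min0 = *std::min_element(v0, v0 + kVerts), max0 = *std::max_element(v0, v0 + kVerts);
    float min1 = *std::min_element(v1, v1 + kVerts), max1 = *std::max_element(v1, v1 + kVerts);

    for (float t = 0.f; t <= 1.f; t += 0.01f) {
        float minT = (1 - t) * min0 + t * min1;      // lerped box bounds
        float maxT = (1 - t) * max0 + t * max1;
        for (int i = 0; i < kVerts; ++i) {
            float v = (1 - t) * v0[i] + t * v1[i];   // lerped vertex
            assert(v >= minT - 1e-4f && v <= maxT + 1e-4f);
        }
    }
    return 0;                                        // no assertion fires
}
```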

natevm commented 1 year ago

(sorry for all the edits, still kinda processing…)

@devshgraphicsprogramming just because this one idea was patented doesn’t mean that motion blurred bounding boxes have no other general purpose applications…

What makes methods like these hard to adopt is the inability to use the hardware from Vulkan; it has little to nothing to do with the patent. I'd also argue that locking such ideas down to just NVIDIA's OptiX platform by not adding MB AABBs is bad for the open source ray tracing ecosystem. There is a reason why NVIDIA scientists went through the trouble of making the MB AABB hardware in the first place. I'm also a bit frustrated, because adding this feature would be trivially simple with MB triangles already in place.

The topology of the tree refers to which nodes parent which other nodes. Also remember that parent/internal bounding boxes also update with the motion blur's refitting operation. So long as movement is relatively coherent, and the child nodes of a parent move in roughly the same direction, the parent box after refitting will stay useful with respect to culling. And with motion blurred boxes, this is almost always the case (e.g., air and wind turbulence tends to be locally uniform, so curves blow in roughly but not exactly the same direction).

Using the intersection program to handle this motion is a bad idea, because even with coherent linear motion, if that motion is large, you end up with large and overlapping bounding boxes both at the leaves and internally in the tree as these curves sweep over a linear motion over time. With MB AABBs and roughly linear/coherent motion (the most common case), both internal and leaf boxes stay tight for a fixed point in time, and do not overlap, resulting in far fewer software intersection tests and less ray traversal.

Re BLAS and TLASes, yes, I was giving an example to give some intuition. I also couldn’t talk about this nearest neighbors query at the time.

The big advantage I have with MB AABBs is in non-curve-based applications. For nearest neighbor queries, we do a breadth-first search over time, with boxes at t0 tightly fitting the primitive, and at t1 expanding around the primitive to the full search range. Ray time is used to exploit hardware-accelerated culling with motion blurred boxes by facilitating a range search to collect the K nearest neighbors and nothing more. (With a zero-length ray at t0, your ray origin will hit no boxes, meaning none are in range. At t1, the origin of the ray will hit all boxes, as every primitive is in range at t1.) But that idea doesn't work if the leaves of the tree all extend to the full t0->t1 range. If the leaves extend to the full range as you propose, then the "range" would be the entire scene, and all boxes would overlap. Indeed, this is what the fast radius search paper by Evangelou et al. does, and they suffer from poor culling and exhaustive traversal for large ranges as a result.
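A minimal sketch of that radius-to-time mapping, under the assumptions described above (plain C++ with hypothetical helper names, standing in for what the hardware box test would do at a given ray time): each data point gets a tight box at t0 and a box grown to the maximum search range at t1, so a query that wants radius r traces at time t = r / rMax and lets the interpolated boxes do the culling.

```cpp
// Hypothetical sketch of per-ray range queries via motion-blurred AABBs.
// Box keyframes: tight around the point at t=0, grown to the maximum search
// radius at t=1. A query with radius r traces a zero-length ray at
// time t = r / rMax; a point is a candidate exactly when the query origin
// lies inside its interpolated box.
#include <cmath>
#include <cstdio>

struct Vec3 { float x, y, z; };

// Interpolated half-extent of a point's box at ray time t.
float halfExtentAt(float t, float rMax) { return t * rMax; }

// Would the (hardware) box test at time t accept this point for this query?
bool boxHitAt(const Vec3& queryOrigin, const Vec3& point, float t, float rMax) {
    float h = halfExtentAt(t, rMax);
    return std::fabs(queryOrigin.x - point.x) <= h &&
           std::fabs(queryOrigin.y - point.y) <= h &&
           std::fabs(queryOrigin.z - point.z) <= h;
}

int main() {
    const float rMax = 8.0f;          // t=1 boxes are grown by this much
    Vec3 query{0, 0, 0};
    Vec3 points[] = {{1, 1, 0}, {3, 0, 0}, {6, 6, 0}};
    float radius = 3.0f;              // this query's desired search range
    float t = radius / rMax;          // encode the radius as the ray time
    for (const Vec3& p : points) {
        bool candidate = boxHitAt(query, p, t, rMax);
        // Candidates still need an exact Euclidean check in an any-hit /
        // intersection shader, since the box test is only conservative.
        std::printf("point (%g,%g,%g): %s\n", p.x, p.y, p.z,
                    candidate ? "candidate" : "culled");
    }
    return 0;
}
```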

Ultimately, the assumption that curves are the only user geometry to be used is just wrong. MB AABBs have clear advantages for curves, but in general, we use AABBs for finite elements, for voxel volumes, for particle volumes, double precision triangles, and many many more things than curves…

As I said before, there are many general purpose applications of motion blurred AABBs for even linear primitives (points, edges, and even triangles). Outside of the patent, we use motion blurred boxes to enable per-ray range queries with a customizable range, using time for the range parameter. Eg, with this MB AABB extension I could do the broad phase of a rigid body physics engine in a hardware accelerated approach. But I’d need to have per-query customizable ranges, and the only way to get that on a per-primitive basis is with MB AABBs… I’ve thought about instead using MB instances, but this ends up being memory prohibitive, and requires extensive updates to the shader binding table every frame…

natevm commented 1 year ago

I recently had another paper accepted to IEEE VIS 2023, where we did some experimenting with how ray tracing cores handle divergent tasks when used to render particle volumes, using this exact range-query idea.

This philosophy of locking divergent work to SIMT cores resulted in AMD's RT platform falling behind both NVIDIA and Intel Arc when comparing a full software traversal to a hardware-assisted traversal (speedup here meaning X times the original performance; I have the raw ms/frame numbers too, but the TL;DR is that AMD is slowest due to SIMT-based software traversal, while Intel Arc and NVIDIA are about neck and neck due to MIMT RT-core traversal over the same tree).


Note, this method is not patented. This is also using RT-core range queries, and it would be very convenient not to have to rebuild the entire tree when the particle smoothing radius changes. But Vulkan lacking somewhat basic functionality like motion blurred bounding boxes makes implementing techniques like this pointlessly more difficult, based on flawed assumptions about how motion blurred boxes actually work...

devshgraphicsprogramming commented 1 year ago


btw what's stopping you from using a triangle instead of an AABB in Vulkan as the primitive used to fit a point, and reusing the method of casting ray time to search radius?

natevm commented 1 year ago

> btw what's stopping you from using a triangle instead of an AABB in Vulkan as the primitive used to fit a point, and reusing the method of casting ray time to search radius?

I did think about using triangles before, but unfortunately triangles can only be used to model the shell surface of a spherical range, and not the actual volumetric ball itself. For photon mapping, you often have millions of points, so modeling each point as a triangulated sphere or even as an instance of one takes too much memory.

RT cores can be used for point containment queries rather than just visibility queries by setting the tmin and tmax values of the ray to 0. Triangles cannot be intersected by a point, so we instead use axis aligned bounding boxes, where the ray origin reports an intersection when it’s contained by an AABB. For range queries, you want to know what points are within range of the query origin. Range queries can be transformed into a point location problem by surrounding every data point with a radius range of influence, which can then be hardware accelerated.
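A small host-side sketch of the fixed-radius version of that setup, assuming the standard `VK_KHR_acceleration_structure` types (the per-query-radius version is exactly what needs motion-blurred AABBs): each photon gets a box grown by the search radius, and the query shader would then trace a ray with tMin = tMax = 0 from the query point.

```cpp
// Sketch of fixed-radius range-query setup using standard Vulkan AABB
// geometry (assumes the VK_KHR_acceleration_structure headers). Each data
// point gets a box grown by the search radius; a ray traced from the query
// point with tMin = tMax = 0 then reports exactly the boxes containing it.
#include <vulkan/vulkan.h>
#include <array>
#include <vector>

using Vec3 = std::array<float, 3>;

std::vector<VkAabbPositionsKHR> buildRangeQueryAabbs(
        const std::vector<Vec3>& points, float radius) {
    std::vector<VkAabbPositionsKHR> aabbs;
    aabbs.reserve(points.size());
    for (const Vec3& p : points) {
        aabbs.push_back(VkAabbPositionsKHR{
            p[0] - radius, p[1] - radius, p[2] - radius,
            p[0] + radius, p[1] + radius, p[2] + radius});
    }
    // These boxes are uploaded to a buffer and referenced from a
    // VkAccelerationStructureGeometryKHR with geometryType =
    // VK_GEOMETRY_TYPE_AABBS_KHR when building the BLAS; the exact Euclidean
    // distance test still happens in the intersection shader.
    return aabbs;
}
```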

For what it’s worth, here are some relevant papers using this technique:

https://jcgt.org/published/0010/01/02/paper-lowres.pdf

https://arxiv.org/abs/2008.11235

I want to be able to control this range on a per-query basis, which requires motion blurred bounding boxes. One of my applications is hardware accelerated rigid body collision detection, which requires a unique range per rigid body pair, but the details require an understanding of the GJK algorithm which might take a while to explain here.

devshgraphicsprogramming commented 1 year ago

Hmm, anyway, it seems that the extension is VK_NV_ray_tracing_motion_blur, and nobody else supports anything similar.

So I guess the right people to ask for a VK_NV_aabb_motionblur would be your NVIDIA colleagues, @natevm.