lighttransport / nanort

NanoRT, single header only modern ray tracing kernel.
MIT License
1.07k stars 89 forks source link

Fix the binning algorithm used in nanort #55

Closed madmann91 closed 1 year ago

madmann91 commented 4 years ago

The binning algorithm used in the current version of the library is not nearly as precise as it could be. Right now, it does evaluate the SAH by just cutting the current node bounding box at regular positions. This behavior is suboptimal, because the bounding boxes of the left and right child might actually be smaller. This pull requests implements binning by storing a bounding box in each bin, and by choosing the partition by sweeping the bins to find the best cost (as described in On fast Construction of SAH-based Bounding Volume Hierarchies).

The impact of this change on trace times is largely beneficial (up to 1.93x faster in my test scenes), but at the cost of an increase in build times (but please check that on your end, so that you do not merge this if there is a corner case that does not benefit from this change). I believe this increase in build times can be more than mitigated by precomputing the primitive centers (which are now also necessary for binning) and bounding boxes before construction. Right now, the centers and bounding boxes are always recomputed, which results in many incoherent memory accesses and redundant computations.

syoyo commented 4 years ago

@madmann91 Super awesome!

We'll test this change and how it goes well. At least in most case the change will make traversal speed faster, since current implementation of SAH BVH binning algorithm is rather old and not good as you pointed out.

syoyo commented 4 years ago

We've tested the change and got 5~10% improvement(both for BVH build and intersection test) for Digital Emily model(260K tris) on TR 1950X.

We plan to merge this PR after:

madmann91 commented 4 years ago

That's pretty nice! Note that you can diminish the number of bins to 32, that'll give you a small build performance boost, at a small cost in traversal time. Another option to speed up the build times is to use the bounding box center instead of the primitive center. That has a small impact on traversal performance too, but it removes the need to recompute centers all the time. If you go that route it is important that you use the bounding box center both during binning and partitioning the primitives.

syoyo commented 1 year ago

@madmann91 Sorry for taking time to merge the PR.

I have implemented BoundingBoxAndCenter to each custom geometries(e.g. Sphere, Curves) and do some code cleaups. Now probably most of examples build finely.

Thanks for the contribution!