svenstaro / bvh

A fast BVH using SAH in rust
https://docs.rs/bvh
MIT License
225 stars 37 forks source link

add ability to incrementally add/remove nodes from bvh #99

Closed dbenson24 closed 4 months ago

dbenson24 commented 1 year ago

leaving an initial draft of this PR here. It looks like there might be a perf regression with when I originally wrote this so I'm going to take a look at that. (post optimizaiton intersection of the 120k benchmark is massively improved, intersection with sponza isn't doing as well)

dbenson24 commented 1 year ago
// PR
test bvh::optimization::bench::bench_intersect_120k_after_optimize_00p   ... bench:         853 ns/iter (+/- 15)
test bvh::optimization::bench::bench_intersect_120k_after_optimize_01p   ... bench:         920 ns/iter (+/- 25)
test bvh::optimization::bench::bench_intersect_120k_after_optimize_10p   ... bench:       2,607 ns/iter (+/- 488)
test bvh::optimization::bench::bench_intersect_120k_after_optimize_50p   ... bench:       2,941 ns/iter (+/- 572)
test bvh::optimization::bench::bench_intersect_120k_with_rebuild_00p     ... bench:         853 ns/iter (+/- 21)
test bvh::optimization::bench::bench_intersect_120k_with_rebuild_01p     ... bench:         914 ns/iter (+/- 17)
test bvh::optimization::bench::bench_intersect_120k_with_rebuild_10p     ... bench:       1,745 ns/iter (+/- 243)
test bvh::optimization::bench::bench_intersect_120k_with_rebuild_50p     ... bench:       1,985 ns/iter (+/- 356)
test bvh::optimization::bench::bench_intersect_sponza_after_optimize_00p ... bench:       1,366 ns/iter (+/- 42)
test bvh::optimization::bench::bench_intersect_sponza_after_optimize_01p ... bench:       3,098 ns/iter (+/- 139)
test bvh::optimization::bench::bench_intersect_sponza_after_optimize_10p ... bench:      15,347 ns/iter (+/- 1,580)
test bvh::optimization::bench::bench_intersect_sponza_after_optimize_50p ... bench:       8,568 ns/iter (+/- 1,122)
test bvh::optimization::bench::bench_intersect_sponza_with_rebuild_00p   ... bench:       1,359 ns/iter (+/- 33)
test bvh::optimization::bench::bench_intersect_sponza_with_rebuild_01p   ... bench:       1,496 ns/iter (+/- 59)
test bvh::optimization::bench::bench_intersect_sponza_with_rebuild_10p   ... bench:       1,906 ns/iter (+/- 105)
test bvh::optimization::bench::bench_intersect_sponza_with_rebuild_50p   ... bench:       2,472 ns/iter (+/- 193)
test bvh::optimization::bench::bench_optimize_bvh_120k_00p               ... bench:   1,064,110 ns/iter (+/- 6,733)
test bvh::optimization::bench::bench_optimize_bvh_120k_01p               ... bench:   2,357,540 ns/iter (+/- 366,656)
test bvh::optimization::bench::bench_optimize_bvh_120k_10p               ... bench:  17,270,170 ns/iter (+/- 2,828,226)
test bvh::optimization::bench::bench_optimize_bvh_120k_50p               ... bench:  72,724,610 ns/iter (+/- 7,787,654)
test bvh::optimization::bench::bench_randomize_120k_50p                  ... bench:   4,599,345 ns/iter (+/- 1,906,880)
// Base
test bvh::optimization::bench::bench_intersect_120k_after_optimize_00p   ... bench:         855 ns/iter (+/- 24)
test bvh::optimization::bench::bench_intersect_120k_after_optimize_01p   ... bench:     143,784 ns/iter (+/- 12,246)
test bvh::optimization::bench::bench_intersect_120k_after_optimize_10p   ... bench:   1,297,513 ns/iter (+/- 622,827)
test bvh::optimization::bench::bench_intersect_120k_after_optimize_50p   ... bench:   2,366,247 ns/iter (+/- 1,195,041)
test bvh::optimization::bench::bench_intersect_120k_with_rebuild_00p     ... bench:         851 ns/iter (+/- 15)
test bvh::optimization::bench::bench_intersect_120k_with_rebuild_01p     ... bench:         919 ns/iter (+/- 18)
test bvh::optimization::bench::bench_intersect_120k_with_rebuild_10p     ... bench:       1,772 ns/iter (+/- 251)
test bvh::optimization::bench::bench_intersect_120k_with_rebuild_50p     ... bench:       2,058 ns/iter (+/- 434)
test bvh::optimization::bench::bench_intersect_sponza_after_optimize_00p ... bench:       1,367 ns/iter (+/- 50)
test bvh::optimization::bench::bench_intersect_sponza_after_optimize_01p ... bench:       2,873 ns/iter (+/- 197)
test bvh::optimization::bench::bench_intersect_sponza_after_optimize_10p ... bench:       4,147 ns/iter (+/- 492)
test bvh::optimization::bench::bench_intersect_sponza_after_optimize_50p ... bench:       5,918 ns/iter (+/- 695)
test bvh::optimization::bench::bench_intersect_sponza_with_rebuild_00p   ... bench:       1,363 ns/iter (+/- 51)
test bvh::optimization::bench::bench_intersect_sponza_with_rebuild_01p   ... bench:       1,498 ns/iter (+/- 67)
test bvh::optimization::bench::bench_intersect_sponza_with_rebuild_10p   ... bench:       1,898 ns/iter (+/- 133)
test bvh::optimization::bench::bench_intersect_sponza_with_rebuild_50p   ... bench:       2,484 ns/iter (+/- 216)
test bvh::optimization::bench::bench_optimize_bvh_120k_00p               ... bench:     984,960 ns/iter (+/- 83,377)
test bvh::optimization::bench::bench_optimize_bvh_120k_01p               ... bench:   2,079,140 ns/iter (+/- 245,858)
test bvh::optimization::bench::bench_optimize_bvh_120k_10p               ... bench:  12,014,960 ns/iter (+/- 1,541,641)
test bvh::optimization::bench::bench_optimize_bvh_120k_50p               ... bench:  53,112,990 ns/iter (+/- 4,470,184)
test bvh::optimization::bench::bench_randomize_120k_50p                  ... bench:   4,634,985 ns/iter (+/- 2,040,404)
dbenson24 commented 1 year ago

Just gonna jot down my general arguments for this method over the current optimize.

The ability to dynamically add/remove individual nodes is a pretty major feature that you don't get from the other rust bvh (Parry's QBvh). This approach also eliminates the need to store the depth of each node so the nodes are smaller in memory (not massively smaller but it's still something). Keeping depth actually dramatically slows this approach down because it needs to update the depths during each operation.

All of this assumes that this approach also gets better/similar performance to the current one. Once I add the parallel rebuild, optimize/add/remove should be most relevant for changes of less than ~10% of the Bvh due to lack of parallelism.

There are still some optimizations left on the table. Right now add/remove doesn't leave the bvh in a contiguous state in memory. I want to test adding a relayout operation that puts all the nodes back in order. There's probably more operations

svenstaro commented 1 year ago

These are some incredible numbers! I might not get to review this but I hope @marstaik could take a look. Please write a CHANGELOG entry for this change.

svenstaro commented 1 year ago

@dbenson24 lil ping :)

svenstaro commented 10 months ago

@dbenson24 are you interested in continuing this?

dbenson24 commented 10 months ago

Hey yes sorry, recently had a baby and haven't had the time. I do want to get this cleaned up soon

svenstaro commented 10 months ago

Oh shit, congratz! Take all the time you need. :)

dbenson24 commented 9 months ago

Ok I think I've addressed everything except for the Tombstoning. It's definitely a nice to have, but I think in practicality it would rarely be used. Because of how efficiently we can traverse the bvh in parallel to do a build, analysis that goes over the whole tree to do something like reordering it would also have to be made similarly parallel. I just realized I hadn't ported the parallel stuff over to this yet so that'll be the next PR.

svenstaro commented 9 months ago

Could you also write a CHANGELOG entry for changes that people should be aware of?

dbenson24 commented 9 months ago

Can we do the changelog with my next PR and hold off on release until after that? I'll have it up shortly after this one is submitted. That will be the PR that builds the bvh in parallel

svenstaro commented 9 months ago

Oh ok, great.

svenstaro commented 9 months ago

@marstaik feel like taking another look at this one? It looks to me like all your last comments were addressed.

marstaik commented 8 months ago

@svenstaro Ill take a look soon, I was on a very long vacation :) I am back now.

svenstaro commented 7 months ago

@marstaik Friendly ping :)

svenstaro commented 7 months ago

@marstaik Happy to have you back! Should we maybe just merge this and you apply your changes on top? It might be the more efficient approach.

dbenson24 commented 7 months ago

No need, I think that is pretty much everything except for the tombstoning. I definitely don't want to do that in this PR. I want to get to the multithreading PR

svenstaro commented 7 months ago

@dbenson24 Great to see you're still around as well! :) Would be great if we could get this PR in soon.

dbenson24 commented 7 months ago

I'm pretty sure this is ready to merge

svenstaro commented 7 months ago

Some more clippy stuff to fix :)

dbenson24 commented 7 months ago

Lol these aren't even my clippy errors, clippy has added new lints that affect existing code. I have been missing them because I haven't been updating my rust nightly every time

svenstaro commented 7 months ago

Weird, did we just spot a flaky test?

svenstaro commented 6 months ago

@marstaik What do you think?

svenstaro commented 5 months ago

@marstaik friendly ping :)

svenstaro commented 4 months ago

If I hear nothing for a week, I'll just merge this.

svenstaro commented 4 months ago

Finally merging this, thanks for you work @dbenson24!

dbenson24 commented 4 months ago

I will get the parallelization PR up soon

On Sat, Feb 17, 2024, 4:56 AM Sven-Hendrik Haase @.***> wrote:

Finally merging this, thanks for you work @dbenson24 https://github.com/dbenson24!

— Reply to this email directly, view it on GitHub https://github.com/svenstaro/bvh/pull/99#issuecomment-1949920581, or unsubscribe https://github.com/notifications/unsubscribe-auth/ACFEVAZGHGAINP2YZDIEWGLYUB5GBAVCNFSM6AAAAAAXYCHKZSVHI2DSMVQWIX3LMV43OSLTON2WKQ3PNVWWK3TUHMYTSNBZHEZDANJYGE . You are receiving this because you were mentioned.Message ID: @.***>