GPUOpen-LibrariesAndSDKs / Capsaicin

AMD ARR team rendering framework
MIT License

added `probe_spawn_tile_count_buffer_` & removed CompactScreenProbes … #7

Closed GuDuJian-J-Zhang closed 1 year ago

GuDuJian-J-Zhang commented 1 year ago

Added `probe_spawn_tile_count_buffer_` to hold the spawned tile count, and marked it as overrideable at the end of the SpawnScreenProbes pass, so that we can remove CompactScreenProbes, which, from my understanding, is not necessary for the GI pipeline.

Pipeline screenshots before and after this change:

[image: pipeline screenshots before and after the change]

gboisse commented 1 year ago

Thanks for the PR!

What you've implemented is essentially compaction using an atomic add operation, which is totally valid here. What we have instead is a "stable" compaction (i.e., preserves the ordering of the input elements) using a regular sum scan (also known as prefix sum).

I expect both approaches are valid and performance should be similar (which seems to be confirmed by the performance numbers visible in your screenshots), so feel free to use this approach, although we probably won't merge the changes, as they are functionally equivalent to what's already in place.
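To illustrate the distinction drawn above, here is a minimal CPU-side sketch of the two compaction strategies. It is not the Capsaicin shader code: the function names, the `tiles` and `keep` inputs, and the serial loops are all hypothetical stand-ins chosen for illustration. On a GPU the atomic variant would run across many threads, so its output order would not be guaranteed, whereas the scan variant always preserves input order.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Unordered compaction: every surviving element claims the next free slot
// with an atomic add. This loop is serial, so it happens to keep the input
// order; a parallel dispatch would not guarantee that.
std::vector<uint32_t> compact_atomic(const std::vector<uint32_t>& tiles,
                                     const std::vector<uint8_t>& keep,
                                     uint32_t& out_count)
{
    assert(tiles.size() == keep.size());
    std::vector<uint32_t> out(tiles.size());
    std::atomic<uint32_t> counter{0};  // plays the role of a tile-count buffer
    for (std::size_t i = 0; i < tiles.size(); ++i) {
        if (keep[i]) {
            uint32_t slot = counter.fetch_add(1, std::memory_order_relaxed);
            out[slot] = tiles[i];
        }
    }
    out_count = counter.load();
    out.resize(out_count);
    return out;
}

// Stable compaction: an exclusive prefix sum over the keep flags gives each
// surviving element its output index, so input order is preserved.
std::vector<uint32_t> compact_scan(const std::vector<uint32_t>& tiles,
                                   const std::vector<uint8_t>& keep,
                                   uint32_t& out_count)
{
    assert(tiles.size() == keep.size());
    std::vector<uint32_t> offsets(keep.size());
    uint32_t running = 0;
    for (std::size_t i = 0; i < keep.size(); ++i) {
        offsets[i] = running;          // number of survivors before element i
        running += keep[i] ? 1u : 0u;
    }
    std::vector<uint32_t> out(running);
    for (std::size_t i = 0; i < tiles.size(); ++i) {
        if (keep[i]) {
            out[offsets[i]] = tiles[i];
        }
    }
    out_count = running;
    return out;
}
```

Both functions return the same number of elements for the same inputs. They differ only in whether the output order is guaranteed, which is why the two approaches can be treated as functionally equivalent when downstream passes do not depend on ordering.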

GuDuJian-J-Zhang commented 1 year ago

ok, got it. thanks for your reply