trixi-framework / PointNeighbors.jl

PointNeighbors.jl: Neighborhood search with fixed search radius in Julia
https://trixi-framework.github.io/PointNeighbors.jl/
MIT License

Improve GPU performance by changing the offset computation of the `FullGridCellList` #50

Closed: efaulhaber closed this 3 months ago

efaulhaber commented 3 months ago

Based on #49.

The change is what the commit message says: "Subtract min_corner in cell_coords instead of subtracting min_cell in getindex".

Previously, cell_coords returned the absolute cell coordinates. The neighbor cells around these coordinates were then computed, and the offset min_cell had to be subtracted for each of those neighbor cells. Now, min_corner is subtracted once in cell_coords, so the subtraction no longer has to be repeated for every neighboring cell.
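The difference between the two approaches can be sketched with a toy grid. This is only an illustration under simplifying assumptions, not the actual PointNeighbors.jl code: `ToyGrid`, `cell_coords_old`, `cell_coords_new`, `lookup_old`, `lookup_new` and `neighbors` are hypothetical names, and the cell size is chosen so that `min_corner == min_cell * cell_size` holds exactly.

```julia
# Hypothetical, simplified stand-ins; NOT the actual PointNeighbors.jl code.
struct ToyGrid{N}
    cells::Array{Int, N}            # one payload value per cell
    min_corner::NTuple{N, Float64}  # lower corner of the bounding box
    min_cell::NTuple{N, Int}        # absolute integer coordinates of the first cell
    cell_size::Float64
end

# Old approach: absolute cell coordinates, so every lookup has to remove the offset.
cell_coords_old(x, g) = map(xi -> floor(Int, xi / g.cell_size), x)
lookup_old(g, cell) = g.cells[(cell .- g.min_cell .+ 1)...]

# New approach: shift by min_corner once, so lookups can index directly.
cell_coords_new(x, g) = map((xi, mi) -> floor(Int, (xi - mi) / g.cell_size) + 1,
                            x, g.min_corner)
lookup_new(g, cell) = g.cells[cell...]

# The 3^N cells surrounding (and including) `cell`.
neighbors(cell) = (cell .+ Tuple(d)
                   for d in CartesianIndices(ntuple(_ -> -1:1, length(cell))))

# 6x6 grid whose first cell has the absolute coordinates (-2, -2).
g = ToyGrid(collect(reshape(1:36, 6, 6)), (-2.0, -2.0), (-2, -2), 1.0)
x = (0.5, 1.25)

# Old: one offset subtraction per neighbor cell (9 in 2D, 27 in 3D).
old_sum = sum(lookup_old(g, c) for c in neighbors(cell_coords_old(x, g)))
# New: a single subtraction per point, none per neighbor cell.
new_sum = sum(lookup_new(g, c) for c in neighbors(cell_coords_new(x, g)))

@assert old_sum == new_sum
```

Both variants visit the same cells. The only difference is whether the offset is applied once per point or once per neighbor cell, and the latter sits in the innermost loop.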

On the CPU, the difference is small (the new version is roughly 5 to 12% faster in the runs below):

julia> plot_benchmarks(benchmark_wcsph, (118, 118, 118), 3)
new version
with 118x118x118 = 1643032 particles finished in 243.722 ms

first version
with 118x118x118 = 1643032 particles finished in 262.962 ms

new version
with 187x187x187 = 6539203 particles finished in 1.003 s

first version
with 187x187x187 = 6539203 particles finished in 1.062 s

new version
with 297x297x297 = 26198073 particles finished in 4.070 s

first version
with 297x297x297 = 26198073 particles finished in 4.284 s

julia> plot_benchmarks(benchmark_n_body, (118, 118, 118), 3)
new version
with 118x118x118 = 1643032 particles finished in 103.480 ms

first version
with 118x118x118 = 1643032 particles finished in 116.125 ms

new version
with 187x187x187 = 6539203 particles finished in 424.271 ms

first version
with 187x187x187 = 6539203 particles finished in 471.688 ms

new version
with 297x297x297 = 26198073 particles finished in 1.709 s

first version
with 297x297x297 = 26198073 particles finished in 1.897 s

On the GPU (RTX 3090), however, the difference is huge: the new version is roughly 1.6x faster, even for the real-life WCSPH benchmark:

julia> plot_benchmarks(benchmark_wcsph_gpu, (118, 118, 118), 3)
new version
with 118x118x118 = 1643032 particles finished in 252.205 ms

first version
with 118x118x118 = 1643032 particles finished in 405.049 ms

new version
with 187x187x187 = 6539203 particles finished in 1.003 s

first version
with 187x187x187 = 6539203 particles finished in 1.643 s

new version
with 297x297x297 = 26198073 particles finished in 3.972 s

first version
with 297x297x297 = 26198073 particles finished in 6.602 s

julia> plot_benchmarks(benchmark_n_body_gpu, (118, 118, 118), 3)
new version
with 118x118x118 = 1643032 particles finished in 85.513 ms

first version
with 118x118x118 = 1643032 particles finished in 135.293 ms

new version
with 187x187x187 = 6539203 particles finished in 338.926 ms

first version
with 187x187x187 = 6539203 particles finished in 536.936 ms

new version
with 297x297x297 = 26198073 particles finished in 1.352 s

first version
with 297x297x297 = 26198073 particles finished in 2.177 s
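
For reference, the speedups implied by the timings above can be computed with a few lines of Julia. This is just arithmetic on the reported numbers (converted to seconds); the names `timings` and `speedups` are made up for this sketch and are not part of the benchmark code.

```julia
# (first version, new version) timings in seconds, copied from the output above.
timings = Dict(
    "WCSPH CPU"  => [(0.262962, 0.243722), (1.062, 1.003), (4.284, 4.070)],
    "n-body CPU" => [(0.116125, 0.103480), (0.471688, 0.424271), (1.897, 1.709)],
    "WCSPH GPU"  => [(0.405049, 0.252205), (1.643, 1.003), (6.602, 3.972)],
    "n-body GPU" => [(0.135293, 0.085513), (0.536936, 0.338926), (2.177, 1.352)],
)

# Speedup of the new version relative to the first version, one entry per problem size.
speedups(runs) = [old / new for (old, new) in runs]

for (name, runs) in sort(collect(timings); by = first)
    println(rpad(name, 12), round.(speedups(runs); digits = 2))
end
```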

codecov[bot] commented 3 months ago

Codecov Report

Attention: Patch coverage is 85.00000% with 3 lines in your changes missing coverage. Please review.

Project coverage is 89.83%. Comparing base (35ed828) to head (d8ff817).

| Files | Patch % | Lines |
|---|---|---|
| src/cell_lists/full_grid.jl | 76.92% | 3 Missing :warning: |
Additional details and impacted files

```diff
@@            Coverage Diff             @@
##             main      #50      +/-   ##
==========================================
+ Coverage   89.81%   89.83%   +0.02%
==========================================
  Files          16       16
  Lines         481      482       +1
==========================================
+ Hits          432      433       +1
  Misses         49       49
```

| [Flag](https://app.codecov.io/gh/trixi-framework/PointNeighbors.jl/pull/50/flags?src=pr&el=flags&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=trixi-framework) | Coverage Δ | |
|---|---|---|
| [unit](https://app.codecov.io/gh/trixi-framework/PointNeighbors.jl/pull/50/flags?src=pr&el=flag&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=trixi-framework) | `89.83% <85.00%> (+0.02%)` | :arrow_up: |

Flags with carried forward coverage won't be shown. [Click here](https://docs.codecov.io/docs/carryforward-flags?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=trixi-framework#carryforward-flags-in-the-pull-request-comment) to find out more.
