Open efaulhaber opened 2 months ago
All modified and coverable lines are covered by tests :white_check_mark:
Project coverage is 87.60%. Comparing base (989d0c0) to head (ac18b79).
Based on #9.
For cheap artificial benchmarks like `count_neighbors` (see #18), the difference is huge. But it's more interesting to see real-life benchmarks, so I ran a 2D TLSPH benchmark (of `interact!`) on a perturbed rectangular point cloud. The difference on a single thread is <1%. On 64 threads, we're talking about a 14% speedup for 6.5 million particles and 44% for 26 million particles.
Interestingly, there is absolutely no speedup in 3D. I assume this is because there is more computation per particle-neighbor pair and the neighbor lists are >3x larger, so cache misses in the neighbor lists are insignificant. Note that this memory layout is also GPU-compatible, as opposed to the `Vector` of `Vector`s.
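
For readers who want a picture of the two layouts being compared, here is a minimal sketch. It assumes the contiguous layout is a single matrix with one column per particle plus a vector of per-column lengths. The description above does not spell out the actual data structure, so `ContiguousNeighborLists`, `add_neighbor!`, and `neighbors` are hypothetical names for illustration only, not the API of this PR.

```julia
# Hypothetical illustration; none of these names are taken from the PR.
struct ContiguousNeighborLists
    backend::Matrix{Int}  # column i stores the neighbors of particle i
    lengths::Vector{Int}  # number of valid entries in each column
end

# Allocate storage for `n_particles` lists with at most `max_neighbors` entries each
function ContiguousNeighborLists(max_neighbors::Int, n_particles::Int)
    return ContiguousNeighborLists(zeros(Int, max_neighbors, n_particles),
                                   zeros(Int, n_particles))
end

# Append `neighbor` to the list of `particle`
function add_neighbor!(lists::ContiguousNeighborLists, particle, neighbor)
    n = lists.lengths[particle] + 1
    n <= size(lists.backend, 1) || error("neighbor list of particle $particle is full")
    lists.backend[n, particle] = neighbor
    lists.lengths[particle] = n
    return lists
end

# Non-allocating view of the valid neighbors of `particle`
function neighbors(lists::ContiguousNeighborLists, particle)
    return view(lists.backend, 1:lists.lengths[particle], particle)
end

# Nested layout for comparison: every inner `Vector` is its own heap allocation
nested = [[2, 3], [1], [1, 2]]

# Copy the same lists into the contiguous layout
lists = ContiguousNeighborLists(4, 3)
for (particle, list) in enumerate(nested), neighbor in list
    add_neighbor!(lists, particle, neighbor)
end
```

In this sketch, all lists live in one allocation and the list of a particle is a contiguous column, whereas the nested layout requires following a reference to a separately allocated inner `Vector` for each particle. The price is a fixed upper bound on the list length and unused slots in short lists.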