espressomd / espresso

The ESPResSo package
https://espressomd.org
GNU General Public License v3.0

LB GPU communicator #4919

Closed jngrad closed 1 month ago

jngrad commented 1 month ago

Description of changes:

jngrad commented 1 month ago

Weak scaling:

mpiexec -n 1 ./pypresso ../maintainer/benchmarks/lb.py --particles_per_core 1000 --lb_sites_per_particle 64
CPU: 9.4ms/loop
GPU before PR: 4.5ms/loop
GPU after PR: 4.5ms/loop

mpiexec -n 2 ./pypresso ../maintainer/benchmarks/lb.py --particles_per_core 1000 --lb_sites_per_particle 64
CPU: 14.9ms/loop
GPU before PR: 63.6ms/loop
GPU after PR: 13.6ms/loop

mpiexec -n 4 ./pypresso ../maintainer/benchmarks/lb.py --particles_per_core 1000 --lb_sites_per_particle 64
CPU: 17.2ms/loop
GPU before PR: 138.7ms/loop
GPU after PR: 22.2ms/loop

The speed remains unchanged on 1 MPI rank because GPUPackInfo already implemented a bufferless device-to-device copy operation for the case where the send and receive blocks belong to the same MPI rank.
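
To make the numbers above easier to compare, here is a minimal sketch that derives the per-rank speedup of the PR and a weak-scaling efficiency from the reported timings. The timings are copied from the benchmark output in this comment. The helper names (`speedup`, `weak_scaling_efficiency`, `summarize`) are hypothetical and not part of ESPResSo or its benchmark script, and the efficiency is assumed to follow the usual weak-scaling definition (time on 1 rank divided by time on N ranks).

```python
# Timings in ms/loop, copied from the benchmark output above.
TIMINGS = {
    1: {"cpu": 9.4, "gpu_before": 4.5, "gpu_after": 4.5},
    2: {"cpu": 14.9, "gpu_before": 63.6, "gpu_after": 13.6},
    4: {"cpu": 17.2, "gpu_before": 138.7, "gpu_after": 22.2},
}


def speedup(before, after):
    """Return how many times faster `after` is compared to `before`."""
    return before / after


def weak_scaling_efficiency(time_one_rank, time_n_ranks):
    """Return T(1) / T(N); 1.0 means perfect weak scaling (assumed definition)."""
    return time_one_rank / time_n_ranks


def summarize(timings):
    """Build one summary row per MPI rank count (hypothetical helper)."""
    base = timings[1]
    rows = []
    for ranks in sorted(timings):
        t = timings[ranks]
        rows.append({
            "ranks": ranks,
            "pr_speedup": speedup(t["gpu_before"], t["gpu_after"]),
            "gpu_efficiency": weak_scaling_efficiency(base["gpu_after"], t["gpu_after"]),
            "cpu_efficiency": weak_scaling_efficiency(base["cpu"], t["cpu"]),
        })
    return rows


if __name__ == "__main__":
    for row in summarize(TIMINGS):
        print(
            f"{row['ranks']} rank(s): "
            f"PR speedup {row['pr_speedup']:.1f}x, "
            f"GPU efficiency {row['gpu_efficiency']:.2f}, "
            f"CPU efficiency {row['cpu_efficiency']:.2f}"
        )
```

With these timings, the PR speedup on the GPU path is about 4.7x on 2 MPI ranks and about 6.2x on 4 MPI ranks, and 1.0x on 1 MPI rank.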