cp2k / dbcsr

DBCSR: Distributed Block Compressed Sparse Row matrix library
https://cp2k.github.io/dbcsr/
GNU General Public License v2.0

Use memory pool also for Cannon send buffers #654

Closed oschuett closed 1 year ago

oschuett commented 1 year ago

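Some MPI implementations incur high overhead the first time they encounter a new buffer. Taking the Cannon send buffers from the memory pool as well means they are reused rather than freshly allocated.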

For the H2O-DFT-LS benchmark this change yields up to 8% speedup with OpenMPI.
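
As an illustration of the general idea only, here is a minimal Python sketch of buffer reuse through a pool. It is not DBCSR's implementation: `BufferPool`, `acquire`, and `release` are hypothetical names, and the actual MPI send is left as a comment. It only shows that repeated steps hand the same buffer back to the communication layer instead of a new one each time.

```python
class BufferPool:
    """Hypothetical pool that hands out reusable byte buffers."""

    def __init__(self):
        self._free = []        # released buffers that are available for reuse
        self.allocations = 0   # number of buffers actually allocated

    def acquire(self, nbytes):
        # Reuse the first free buffer that is large enough.
        for i, buf in enumerate(self._free):
            if len(buf) >= nbytes:
                return self._free.pop(i)
        # No suitable buffer is free, so allocate a new one.
        self.allocations += 1
        return bytearray(nbytes)

    def release(self, buf):
        # Return the buffer to the pool instead of discarding it.
        self._free.append(buf)


pool = BufferPool()
for step in range(4):
    send_buf = pool.acquire(1024)
    # ... fill send_buf and pass it to the MPI send here (assumed, not shown) ...
    pool.release(send_buf)
```

With this pattern the four steps share one allocation, so an MPI library that does extra work for every previously unseen buffer would pay that cost once rather than on every step.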

oschuett commented 1 year ago

@alazzaro, what do you think? I don't want to merge without your approval.

hfp commented 1 year ago

I guess this patch has no impact on MPICH-derived implementations such as plain MPICH, Cray MPI, or Intel MPI? I think confirming this for just one of the MPICH implementations is sufficient.

alazzaro commented 1 year ago

Yes, it can be merged. I've tested it. It seems strange to me that we always overlooked this... thanks @oschuett for pointing it out!