plumed / plumed2

Development version of plumed 2
https://www.plumed.org
GNU Lesser General Public License v3.0

Small optimization in CoordinationBase #1096

Closed Iximiel closed 3 months ago

Iximiel commented 4 months ago
Description

I changed how the atom pairs are traversed in the CoordinationBase calculate() method, switching from a "striped" traversal (each worker takes every n-th pair) to a "block" one (each worker takes one contiguous chunk of pairs). I get a 3%-5% speedup on my local machine.
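
To make the difference concrete, here is a minimal sketch of the two ways of splitting a list of pairs among workers (MPI ranks or threads). It is not the actual CoordinationBase code: the helper names `stripedIndices` and `blockRange`, and the parameters `rank`, `nworkers` and `npairs`, are hypothetical and chosen only for illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical helper (not PLUMED API): indices visited by worker `rank`
// with a striped traversal: rank, rank + nworkers, rank + 2*nworkers, ...
std::vector<std::size_t> stripedIndices(std::size_t rank,
                                        std::size_t nworkers,
                                        std::size_t npairs) {
  assert(nworkers > 0 && rank < nworkers);
  std::vector<std::size_t> out;
  for (std::size_t i = rank; i < npairs; i += nworkers) {
    out.push_back(i);
  }
  return out;
}

// Hypothetical helper (not PLUMED API): half-open range [begin, end) visited
// by worker `rank` with a block traversal. When npairs is not a multiple of
// nworkers, the first (npairs % nworkers) workers get one extra pair.
std::pair<std::size_t, std::size_t> blockRange(std::size_t rank,
                                               std::size_t nworkers,
                                               std::size_t npairs) {
  assert(nworkers > 0 && rank < nworkers);
  const std::size_t base = npairs / nworkers;
  const std::size_t extra = npairs % nworkers;
  const std::size_t begin = rank * base + std::min(rank, extra);
  const std::size_t end = begin + base + (rank < extra ? 1 : 0);
  return {begin, end};
}
```

With 10 pairs and 4 workers, worker 1 visits pairs 1, 5 and 9 in the striped scheme, but the contiguous range [3, 6) in the block scheme. Both schemes assign every pair to exactly one worker; only the order in which memory is touched changes.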


Here are some results from my workstation. The graphs show the time spent in calculate() relative to the time spent in calculate() on the master branch.

For 4 MPI processes with 2 threads each: [image]

For 1 process with 8 threads (here I did not expect a performance improvement): [image]


Before removing the draft status, I'd like to see whether it is possible to avoid storing the pairs in the neighbor list when the user asks for a fixed list.

Target release

I would like my code to appear in release 2.10 (but I think this can be rebased on 2.9)

GiovanniBussi commented 4 months ago

I think you also get an improvement with pure OpenMP, because in that case too, access to contiguous memory is faster.

Iximiel commented 3 months ago

> Before removing the draft status, I'd like to see whether it is possible to avoid storing the pairs in the neighbor list when the user asks for a fixed list.

I think it may be a better idea to do that in a separate PR.

If you think another benchmark would be useful, I'll run it; otherwise, I think this can be merged or rebased on 2.9.

GiovanniBussi commented 3 months ago

I think we can merge it. It can even go to v2.9 because the change is minimal.

Iximiel commented 3 months ago

OK, then I'm rebasing this onto v2.9.

Iximiel commented 3 months ago

I changed the base and pushed to trigger the CI. While the CI is running, I'm also comparing the branch with v2.9.