Open GiovanniBussi opened 7 months ago
Hello
I had a look at this, and if you are using domain decomposition it is necessary to make that call twice per step.
There is a difference between what needs to be in unique in DomainDecomposition::share() and in DomainDecomposition::reset(). In particular:
- In DomainDecomposition::share(), unique needs to contain only the atoms in the local domain.
- In DomainDecomposition::reset(), unique needs to contain all the atoms that plumed may have added a force on (which is all the atoms in all the domains).
If DomainDecomposition is off, then you can avoid calling getAllActiveAtoms in reset(). If that is not the case, then I think there is no way to avoid the second call.
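To make the distinction concrete, here is a minimal toy sketch of the idea, not the actual PLUMED code. The names ToyDecomposition, mergeSorted, activeAtoms, ddOn, requested, isLocal and mergeCalls are all hypothetical and only mimic the behaviour described above: the merge runs once in share(), and a second time in reset() only when domain decomposition is on.

```cpp
#include <algorithm>
#include <iterator>
#include <utility>
#include <vector>

// Hypothetical stand-in for the merge step: the union of several sorted,
// duplicate-free lists of atom indices.
std::vector<int> mergeSorted(const std::vector<std::vector<int>>& lists) {
  std::vector<int> out;
  for (const auto& l : lists) {
    std::vector<int> tmp;
    std::set_union(out.begin(), out.end(), l.begin(), l.end(),
                   std::back_inserter(tmp));
    out.swap(tmp);
  }
  return out;
}

// Toy model (an assumption, not the real class): tracks which atoms end up
// in `unique` and how many times the merge is executed.
struct ToyDecomposition {
  bool ddOn;                                // is domain decomposition active?
  std::vector<std::vector<int>> requested;  // sorted indices requested by each action
  std::vector<bool> isLocal;                // isLocal[i]: atom i is in the local domain
  std::vector<int> unique;                  // atoms currently tracked
  int mergeCalls = 0;                       // number of merges performed

  ToyDecomposition(bool on, std::vector<std::vector<int>> req,
                   std::vector<bool> local)
      : ddOn(on), requested(std::move(req)), isLocal(std::move(local)) {}

  // Merge all requested atoms, optionally keeping only the local ones.
  std::vector<int> activeAtoms(bool localOnly) {
    ++mergeCalls;
    std::vector<int> all = mergeSorted(requested);
    if (!localOnly) return all;
    std::vector<int> local;
    std::copy_if(all.begin(), all.end(), std::back_inserter(local),
                 [this](int i) { return static_cast<bool>(isLocal[i]); });
    return local;
  }

  // share(): with domain decomposition only local atoms are needed;
  // without it, every atom is local, so nothing is filtered out.
  void share() { unique = activeAtoms(ddOn); }

  // reset(): with domain decomposition, forces may have been added on atoms
  // in any domain, so the full list is rebuilt. Without it, the list built
  // in share() is already complete and the second merge is skipped.
  void reset() {
    if (ddOn) unique = activeAtoms(false);
  }
};
```

In this toy model, calling share() and then reset() performs one merge when ddOn is false and two merges when it is true, which matches the conclusion above that the second call can be skipped only without domain decomposition.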
Addressed in #1027
I am leaving this open as a reminder that we should check the impact in MPI runs and, if necessary, address it.
@gtribello I am looking for hot spots here and there, and I discovered a potentially expensive operation that is done twice per step with htt (it was done once per step before).
In particular, in DomainDecomposition.cpp, the operation mergeSortedVectors is done:
- in DomainDecomposition::share() (equivalent to pre-htt), for local (in the MPI sense) atoms
- in DomainDecomposition::reset(), for all (non-local) atoms
Is this expected? If I just remove the call to getAllActiveAtoms in reset(), I get a speed-up of ~10% with this input:
However, the code no longer works correctly when we use domain decomposition. What is the correct way of removing this unnecessary calculation? Why do we need to set to zero forces that are expected never to be used on the local processor?