plumed / plumed2

Development version of plumed 2
https://www.plumed.org
GNU Lesser General Public License v3.0
324 stars 269 forks source link

Added code to ensure that only those atoms that are used by actions t… #1029

Closed gtribello closed 3 months ago

gtribello commented 4 months ago
Description

I have added some code in DomainDecomposition::update that checks which actions apply forces to the atoms. A list is then built that ensures that atoms that are only used by actions with no forces added to not modify the force. If the number of active actions equals the number of actions that have forces then this rebuilding of the lists is not done. I checked the timings with this input:

WHOLEMOLECULES ENTITY0=1-10000
pos: POSITION ATOM=1
RESTRAINT ARG=pos.x AT=0.0 KAPPA=1

These are the timings before this change:

PLUMED:                                               Cycles        Total      Average      Minimum      Maximum
PLUMED:                                                    1     0.418250     0.418250     0.418250     0.418250
PLUMED: 1 Prepare dependencies                          2000     0.000198     0.000000     0.000000     0.000001
PLUMED: 2 Sharing data                                  2000     0.094563     0.000047     0.000046     0.000208
PLUMED: 3 Waiting for data                              2000     0.000508     0.000000     0.000000     0.000005
PLUMED: 4 Calculating (forward loop)                    2000     0.246271     0.000123     0.000120     0.000226
PLUMED: 5 Applying (backward loop)                      2000     0.048991     0.000024     0.000023     0.000207
PLUMED: 6 Update                                        2000     0.000083     0.000000     0.000000     0.000001

These are the timings after the change:

PLUMED:                                               Cycles        Total      Average      Minimum      Maximum
PLUMED:                                                    1     0.351232     0.351232     0.351232     0.351232
PLUMED: 1 Prepare dependencies                          2000     0.000170     0.000000     0.000000     0.000005
PLUMED: 2 Sharing data                                  2000     0.094641     0.000047     0.000046     0.000215
PLUMED: 3 Waiting for data                              2000     0.000486     0.000000     0.000000     0.000011
PLUMED: 4 Calculating (forward loop)                    2000     0.246078     0.000123     0.000120     0.000225
PLUMED: 5 Applying (backward loop)                      2000     0.001564     0.000001     0.000000     0.000203
PLUMED: 6 Update                                        2000     0.000056     0.000000     0.000000     0.000002

It is not a big difference but it helps a bit in this case. I think there are arguments both for and against doing this. I am not sure what to do.

Target release

I would like my code to appear in release XXXXX

Type of contribution
Copyright
Tests
codecov-commenter commented 4 months ago

Codecov Report

All modified and coverable lines are covered by tests :white_check_mark:

Project coverage is 83.34%. Comparing base (4ba422e) to head (24d4619).

:exclamation: Current head 24d4619 differs from pull request most recent head 79c9486. Consider uploading reports for the commit 79c9486 to get more accurate results

:exclamation: Your organization needs to install the Codecov GitHub app to enable full functionality.

Additional details and impacted files ```diff @@ Coverage Diff @@ ## master #1029 +/- ## ========================================== - Coverage 83.36% 83.34% -0.02% ========================================== Files 618 618 Lines 59103 59123 +20 ========================================== + Hits 49273 49278 +5 - Misses 9830 9845 +15 ```

:umbrella: View full report in Codecov by Sentry.
:loudspeaker: Have feedback on the report? Share it here.

GiovanniBussi commented 4 months ago

Personally I think it's useful, because it makes the calculation faster. Perhaps this is an extreme case. However, it's common to have WHOLEMOLECULES applied to 100-1000 atoms and then only a few atoms (or tens of) with forces. How much would be the gain with something less extreme? Say 1000 atoms in WHOLEMOLECULES 10 of which biases

What happens if instead you have a comaparable number?

10000 atoms in WHOLEMOLECULES 10000 atoms biased

Do you have some heuristic to check when it's worth making the list?

In the normal list construction I was using some criterion like comparing the size of the largest group with the total number of atoms and skip doing the list if this is more than some percentage.

Then I have a little concern on code duplication. It might be nice if you can at least partly merge the construction of this list with the construction of true others. But maybe it's not necessary

gtribello commented 4 months ago

It does the following check

unsigned nforced_actions=0, nactive_actions=0;
      for(unsigned i=0; i<actions.size(); ++i) {
        if( actions[i]->isActive() ) {
          nactive_actions++;
          if( actions[i]->hasForce ) nforced_actions++;
        }
      }
      if( nforced_actions<nactive_actions ) {
           // REBUILD UNIQUE
      }

So it doesn't have a heuristic for checking it based on number of atoms. I guess you could do something like the following instead:

unsigned nforced_actions=0, nactive_actions=0;
      for(unsigned i=0; i<actions.size(); ++i) {
        if( actions[i]->isActive() ) {
          nactive_actions += actions[i]->getUniqueLocal().size();
          if( actions[i]->hasForce ) nforced_actions += actions[i]->getUniqueLocal();
        }
      }

The question then would be how to do the comparison to decide whether to rebuild unique or not. Any thoughts?

gtribello commented 3 months ago

HI @GiovanniBussi

I've updated this using the Heuristic you suggested on Wednesday. The speedup is the same as it was before this update. I hope that the decision as to whether to use this code is now made more sensibly, though. What do you think?

GiovanniBussi commented 3 months ago

@gtribello I would then merge the previous commit, without this unnecessary complication...

gtribello commented 3 months ago

HI @GiovanniBussi

I'm not totally convinced by any of this. I also don't think I explained very well in my previous message. I would like to explain how it works now and how it worked before. We can then make a decision on what to do. So here is the code after that last commit.

      unsigned nforced_atoms=0, nactive_atoms=0, nforced_actions=0;
      for(unsigned i=0; i<actions.size(); ++i) {
        if( actions[i]->isActive() ) {
          nactive_atoms += actions[i]->getUniqueLocal().size();;
          if( actions[i]->hasForce ) { nforced_actions++; nforced_atoms += actions[i]->getUniqueLocal().size(); }
        }
      }
      if( 2*nforced_atoms<nactive_atoms ) {
        std::vector<const std::vector<AtomNumber>*> vectors;
        vectors.reserve(nforced_actions);
        for(unsigned i=0; i<actions.size(); i++) {
          if( actions[i]->isActive() && actions[i]->hasForce && !actions[i]->getUnique().empty() ) {
            vectors.push_back(&actions[i]->getUniqueLocal());
            if( actions[i]->unique_local_needs_update ) actions[i]->updateUniqueLocal( !(dd && shuffledAtoms>0), g2l );
          }
        }
        if(vectors.empty()) return;
        unique.clear(); Tools::mergeSortedVectors(vectors,unique,getenvMergeVectorsPriorityQueue());
        uniq_index.resize(unique.size());
        if(!(int(gatindex.size())==getNumberOfAtoms() && shuffledAtoms==0)) {
          for(unsigned i=0; i<unique.size(); i++) uniq_index[i]=g2l[unique[i].index()];
        } else {
          for(unsigned i=0; i<unique.size(); i++) uniq_index[i]=unique[i].index();
        }
      }

This new version does:

The old version did:

The problem I have is suppose you have an input like this:

c1: ACTION_ONE ATOMS=1-100 ...
c2: ACTION_TWO ATOMS=1-100 ...
BIASVALUE ARG=c1

Here you want to avoid rebuilding the list to get optimum speed. The first version of the code above does not rebuild the list in this case (but that is lucky). The second version would rebuild the list. You are thus adding computational cost for this input unnecessarily. However, both version work do the right thing for the WHOLEMOLECULES case.

In short, I am still not convinced that either option is the right one. The first option is simple, though. What about doing it the first way and providing an option through an environment variable like PLUMED_FORCE_UNIQUE that allows folks to turn this off?

GiovanniBussi commented 3 months ago

OK I see, maybe we can discuss it tomorrow? Meanwhile, notice that I merged #1037, which makes merging vectors slightly faster. The syntax is modified (I moved the tool to a different file to avoid recompiling all PLUMED at every change), but you should just replace:

Tools::mergeSortedVectors

with

mergeVectorTools::mergeSortedVectors

and there's no option to switch on or off the priority queue implementation