Optimized getOutputQuantity to avoid string copies

plumed / plumed2

Development version of plumed 2

https://www.plumed.org

GNU Lesser General Public License v3.0

357 stars 284 forks source link

Optimized getOutputQuantity to avoid string copies #1016

Closed GiovanniBussi closed 8 months ago

GiovanniBussi commented 8 months ago

This change leads to a 4% speedup on my laptop with the small benchmark (20 atoms, 4 of which biased)

@gtribello can you confirm this is correct? Can I correctly assume that, if the values[i]->name is longer than getLabe(), than it starts with getLabel()+'.'? In this way I avoid creating a new std::string and directly compare the remaining part.

If not I can add a separate check on the first part of the string as well. In any case this will allow me to save the allocation.

codecov-commenter commented 8 months ago

Codecov Report

All modified and coverable lines are covered by tests :white_check_mark:

Comparison is base (2a3717a) 84.15% compared to head (7330913) 84.18%.

:exclamation: Current head 7330913 differs from pull request most recent head a8c16a0. Consider uploading reports for the commit a8c16a0 to get more accurate results

:exclamation: Your organization needs to install the Codecov GitHub app to enable full functionality.

Additional details and impacted files

```diff @@ Coverage Diff @@ ## master #1016 +/- ## ========================================== + Coverage 84.15% 84.18% +0.02% ========================================== Files 612 612 Lines 56496 56497 +1 ========================================== + Hits 47547 47563 +16 + Misses 8949 8934 -15 ```

:umbrella: View full report in Codecov by Sentry.
:loudspeaker: Have feedback on the report? Share it here.

carlocamilloni commented 8 months ago

@GiovanniBussi this is interesting because it has an impact also on the large simulation (110,000 atoms, saxs on ~6,000 atoms), it recovers the 2.5% loss but not speeding up the waiting data that is unchanged but speeding up quite a bit the update section, while the impact on alaDP is tiny, ~1%

GiovanniBussi commented 8 months ago

Unexpected. Maybe you have many actions? My understanding is that this function, with a string argument, is called at each step over all actions to check if they contain a "work" or "bias" named value.

carlocamilloni commented 8 months ago

I have many components, this is SAXS+Metainference so I have ~30 components from SAXS and BIASVALUE applied on one of these many

GiovanniBussi commented 8 months ago

So I think I can optimize this further. The search is done with a loop now, but I think we should store a unordered_map with all value names if we really want to retrieve them by name during the simulation

gtribello commented 8 months ago

@GiovanniBussi

I think this is OK. Could there be an easier (and possibly faster) way to do this as follows by always creating a combine action in PLUMED after all the actions are read in like this:

total_bias: COMBINE ARG=*.bias PERIODIC=NO

You then set the value of bias equal to the total bias.

You could perhaps do the the first time getBias is called. If you did this then you can use a GET command to pass the final bias from plumed back to the MD code. Also if the MD code doesn't take back the bias then the total bias is not even calculated.

The only problem with this would be if you have biases that are not calculated on every step. It would be easy to solve that one by adding a flag in ActionPilot that ensures that the GET action pilot doesn't activate its dependencies. For all the non-active bias values you would thus be adding zeros.

GiovanniBussi commented 8 months ago

@gtribello I am not sure what you propose is optimal because it would not allow adding new actions after the simulation is started.

What about keeping track of which actions have such components by something similar to the pilots member of PlumedMain? Which is the best way to inquire if a given OutputComponent is defined?

Meanwhile I will merge this, that in any case speeds up the search for the proper component