plumed / plumed2

Development version of plumed 2
https://www.plumed.org
GNU Lesser General Public License v3.0
324 stars 269 forks source link

Optimize get words #1015

Closed GiovanniBussi closed 4 months ago

GiovanniBussi commented 5 months ago

I managed to optimize getWords.

@carlocamilloni can you also check the performance in this branch?

In short, there's a new getWords written ad hoc for plumed_cmd strings, which is significantly faster for the following two reasons:

  1. I removed the possibility to use complex syntax (parenthesis, tabs etc) in cmd string. I think this is fine, please let me know if anyone has concerns.
  2. I used a small_vector<std::string_view> to avoid memory allocations. std::string_view is c++17, small_vector is an external library that I included here, similarly to how we include lepton and other libraries. Notice that the small_vector library triggers a bug with gcc 8 that I fixed here.

Before merging this branch I want to check with the author of the small_vector library if my fix is ok. To me, small_vector is a very useful tool that might allow optimizations in many places in PLUMED (basically, whenever we waste time allocating small temporary vectors). If I cannot sort this out, I can easily reimplement the optimization of getWords using a fixed size std::array (say with a maximum size of 4 words), that would work certainly in case of interpreting plumed_cmd strings.

This PR closes #1011

codecov-commenter commented 5 months ago

Codecov Report

Attention: 3 lines in your changes are missing coverage. Please review.

Comparison is base (f210467) 84.18% compared to head (89193cd) 84.17%.

:exclamation: Current head 89193cd differs from pull request most recent head 0329172. Consider uploading reports for the commit 0329172 to get more accurate results

Files Patch % Lines
src/core/PlumedMain.cpp 90.90% 2 Missing :warning:
src/tools/Tools.cpp 91.66% 1 Missing :warning:

:exclamation: Your organization needs to install the Codecov GitHub app to enable full functionality.

Additional details and impacted files ```diff @@ Coverage Diff @@ ## master #1015 +/- ## ========================================== - Coverage 84.18% 84.17% -0.02% ========================================== Files 612 613 +1 Lines 56497 56568 +71 ========================================== + Hits 47563 47614 +51 - Misses 8934 8954 +20 ```

:umbrella: View full report in Codecov by Sentry.
:loudspeaker: Have feedback on the report? Share it here.

carlocamilloni commented 5 months ago

@GiovanniBussi confirmed, with 11,500 ns/day vs 10,600 ns/day without this and 10,700 ns/day before htt. Here the gain appears in the total time (~9 s vs 10.6 s) but not in the detailed timers.

carlocamilloni commented 5 months ago

as a comparison without plumed we are 25,000 ns/day

GiovanniBussi commented 5 months ago

Here the gain appears in the total time (~9 s vs 10.6 s) but not in the detailed timers

This is expected. I think we are not including in the detailed timers what happens between when we call plumed_cmd and we start doing something. We always assumed this part of the code is not impacting performances.

I think this mostly relevant for very small (and irrelevant :-)) systems, but still it's a good exercise in hunting for bottlenecks.

carlocamilloni commented 5 months ago

i agree I tested a 110,000 atom using saxs (the one for which we had 2.5% loss of performances by moving to htt) and the loss of performance is still exactly 2.5% irrespectively of all the optimisations done from that time. The main difference is in the "waiting data" (left is today, middle is before htt, with the htt at that time)

Screenshot 2024-02-02 at 18 27 41

but this test is short (250 ps using 4 openMP threads)