flux-framework / flux-core

core services for the Flux resource management framework
GNU Lesser General Public License v3.0

speed up flux overlay status on a big system #6030

Closed garlick closed 3 weeks ago

garlick commented 3 weeks ago

Problem: flux overlay status makes some unnecessary RPCs that slow things way down on el cap. Specifically, the overlay.health RPC is made to leaf nodes despite the fact that the parent already knows the status of its children.

Eliminating those leaf RPCs gets us about a 4x speedup on el cap: wall-clock time drops from 1m28s to about 20s in the runs below.
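
To make the problem statement above concrete, here is a toy Python model of the traversal. It is only a sketch under stated assumptions, not the code in `src/cmd/builtin/overlay.c`: the `Broker` class, `build_tree`, `health_rpc`, and `walk` are hypothetical names, the k-ary rank layout is assumed for illustration, and the simulated reply format is invented. It assumes the client already knows the tree topology, so it can tell which ranks are leaves.

```python
from dataclasses import dataclass, field


@dataclass
class Broker:
    """Hypothetical stand-in for one broker rank in the overlay tree."""
    rank: int
    status: str = "full"
    children: list = field(default_factory=list)


def build_tree(size, fanout):
    """Build a k-ary tree; the parent of rank r (r > 0) is (r - 1) // fanout."""
    brokers = [Broker(rank) for rank in range(size)]
    for r in range(1, size):
        brokers[(r - 1) // fanout].children.append(brokers[r])
    return brokers[0]


def health_rpc(broker, counter):
    """Simulated health request: the reply carries the broker's own status
    plus the status it has recorded for each direct child."""
    counter[0] += 1
    return {
        "status": broker.status,
        "children": {c.rank: c.status for c in broker.children},
    }


def walk(broker, counter, skip_leaves, reported=None):
    """Collect {rank: status}. With skip_leaves, a leaf whose status was
    already reported by its parent is not contacted at all."""
    if skip_leaves and not broker.children and reported is not None:
        return {broker.rank: reported}
    reply = health_rpc(broker, counter)
    statuses = {broker.rank: reply["status"]}
    for child in broker.children:
        statuses.update(
            walk(child, counter, skip_leaves, reply["children"][child.rank])
        )
    return statuses


def count_rpcs(size, fanout, skip_leaves):
    """Return (number of simulated RPCs, collected statuses)."""
    counter = [0]
    statuses = walk(build_tree(size, fanout), counter, skip_leaves)
    return counter[0], statuses


if __name__ == "__main__":
    for skip in (False, True):
        rpcs, _ = count_rpcs(size=13, fanout=3, skip_leaves=skip)
        print(f"skip_leaves={skip}: {rpcs} RPCs")
```

In this model, skipping leaves reduces the RPC count from one per rank to one per interior rank while returning the same statuses. The ratio in the toy model depends only on tree shape, so it should not be read as a prediction of the measured speedup.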

Before

[garlick@elcap1:cmd]$ time /usr/bin/flux overlay status >/dev/null

real    1m28.377s
user    0m56.726s
sys     0m0.905s

After

[garlick@elcap1:cmd]$ time ./flux overlay status >/dev/null

real    0m19.892s
user    0m18.275s
sys     0m0.209s
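
As a quick check of the speedup figure against the two timings above, the following Python snippet parses the `time` values and computes the ratios. The helper name `parse_time` is hypothetical, and the snippet assumes the `NmS.SSSs` format printed by the shell's `time` builtin.

```python
import re


def parse_time(text):
    """Convert a shell `time` value such as '1m28.377s' to seconds."""
    match = re.fullmatch(r"(\d+)m(\d+(?:\.\d+)?)s", text)
    if match is None:
        raise ValueError(f"unrecognized time value: {text!r}")
    return int(match.group(1)) * 60 + float(match.group(2))


# Values copied from the before/after runs in this PR.
real_before = parse_time("1m28.377s")
real_after = parse_time("0m19.892s")
user_before = parse_time("0m56.726s")
user_after = parse_time("0m18.275s")

real_speedup = real_before / real_after
user_speedup = user_before / user_after

print(f"real: {real_before:.3f}s -> {real_after:.3f}s ({real_speedup:.2f}x)")
print(f"user: {user_before:.3f}s -> {user_after:.3f}s ({user_speedup:.2f}x)")
```

The wall-clock ratio comes out a little above 4x and the user CPU ratio a little above 3x, consistent with the "about a 4x" figure quoted for a single pair of runs.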

garlick commented 3 weeks ago

Fixed that test and I'll set MWP (merge-when-passing). Thanks!

codecov[bot] commented 3 weeks ago

Codecov Report

All modified and coverable lines are covered by tests :white_check_mark:

Project coverage is 83.29%. Comparing base (0175d34) to head (919e71d).

Additional details and impacted files

```diff
@@             Coverage Diff             @@
##           master    #6030       +/-   ##
===========================================
+ Coverage   54.39%   83.29%   +28.89%
===========================================
  Files         471      519       +48
  Lines       76251    83654     +7403
===========================================
+ Hits        41476    69676    +28200
+ Misses      34775    13978    -20797
```

| [Files](https://app.codecov.io/gh/flux-framework/flux-core/pull/6030?dropdown=coverage&src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=flux-framework) | Coverage Δ | |
|---|---|---|
| [src/cmd/builtin/overlay.c](https://app.codecov.io/gh/flux-framework/flux-core/pull/6030?src=pr&el=tree&filepath=src%2Fcmd%2Fbuiltin%2Foverlay.c&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=flux-framework#diff-c3JjL2NtZC9idWlsdGluL292ZXJsYXkuYw==) | `91.78% <100.00%> (+90.31%)` | :arrow_up: |

... and [438 files with indirect coverage changes](https://app.codecov.io/gh/flux-framework/flux-core/pull/6030/indirect-changes?src=pr&el=tree-more&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=flux-framework)