flux-framework / flux-core

core services for the Flux resource management framework
GNU Lesser General Public License v3.0

broker: detect mismatched bootstrap.hosts configuration #6393

Closed: garlick closed this 3 weeks ago

garlick commented 4 weeks ago

Problem: as noted in #6389, a mismatched bootstrap.hosts config can result in two hosts ping-ponging back and forth, each connecting and causing the other to be disconnected with a misleading message.

This adds the hostname to the overlay.hello RPC. The rank was already there. If the hostlist broker attribute does not confirm the rank to host mapping, deny the connection.
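
To illustrate the check described above, here is a minimal sketch of validating a (rank, hostname) pair against a rank-ordered host list. It is not the code from this PR: `struct rank_map` and `hello_host_ok()` are hypothetical names, and the array of strings merely stands in for the expanded hostlist broker attribute.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for the expanded hostlist attribute:
 * hosts[i] is the hostname this broker expects at rank i.
 */
struct rank_map {
    const char **hosts;
    size_t count;
};

/* Return true only if the rank and hostname claimed by a peer in its
 * hello request agree with the local rank-to-host mapping.  An
 * out-of-range rank or a missing hostname is treated as a mismatch,
 * so the caller would deny the connection.
 */
bool hello_host_ok (const struct rank_map *map,
                    size_t rank,
                    const char *hostname)
{
    if (!map || !hostname || rank >= map->count)
        return false;
    return strcmp (map->hosts[rank], hostname) == 0;
}
```

With a map of `node0`, `node1`, `node2`, a peer claiming rank 1 as `node1` would be accepted, while a peer claiming rank 1 as `node2` would be refused instead of displacing the existing connection.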

Marking as a WIP for now pending inspiration on how to adapt the issue test for #4182 (resource module re-ranking).

garlick commented 3 weeks ago

I pushed a change to that failing test so I think this is ready for a review.

garlick commented 3 weeks ago

Rebased on current master, addressed the codeql warning, and fixed another test that I'm not sure how I missed before.

codecov[bot] commented 3 weeks ago

Codecov Report

Attention: Patch coverage is 83.33333% with 4 lines in your changes missing coverage. Please review.

Project coverage is 53.93%. Comparing base (13fb49a) to head (3c9da5c).

| Files with missing lines | Patch % | Lines |
|---|---|---|
| src/broker/broker.c | 75.00% | 2 Missing :warning: |
| src/broker/overlay.c | 88.88% | 1 Missing :warning: |
| src/cmd/flux-start.c | 0.00% | 1 Missing :warning: |

:exclamation: There is a different number of reports uploaded between BASE (13fb49a) and HEAD (3c9da5c).

HEAD has 3 uploads less than BASE

| Flag | BASE (13fb49a) | HEAD (3c9da5c) |
|------|------|------|
|ci-basic|3|0|

Additional details and impacted files

```diff
@@             Coverage Diff             @@
##           master    #6393       +/-   ##
===========================================
- Coverage   83.59%   53.93%   -29.66%
===========================================
  Files         524      476       -48
  Lines       87615    80191     -7424
===========================================
- Hits        73242    43252    -29990
- Misses      14373    36939    +22566
```

| [Files with missing lines](https://app.codecov.io/gh/flux-framework/flux-core/pull/6393?dropdown=coverage&src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=flux-framework) | Coverage Δ | |
|---|---|---|
| [src/broker/boot\_config.c](https://app.codecov.io/gh/flux-framework/flux-core/pull/6393?src=pr&el=tree&filepath=src%2Fbroker%2Fboot_config.c&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=flux-framework#diff-c3JjL2Jyb2tlci9ib290X2NvbmZpZy5j) | `70.29% <100.00%> (-10.82%)` | :arrow_down: |
| [src/broker/boot\_pmi.c](https://app.codecov.io/gh/flux-framework/flux-core/pull/6393?src=pr&el=tree&filepath=src%2Fbroker%2Fboot_pmi.c&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=flux-framework#diff-c3JjL2Jyb2tlci9ib290X3BtaS5j) | `55.55% <100.00%> (-10.36%)` | :arrow_down: |
| [src/broker/overlay.c](https://app.codecov.io/gh/flux-framework/flux-core/pull/6393?src=pr&el=tree&filepath=src%2Fbroker%2Foverlay.c&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=flux-framework#diff-c3JjL2Jyb2tlci9vdmVybGF5LmM=) | `58.84% <88.88%> (-24.80%)` | :arrow_down: |
| [src/cmd/flux-start.c](https://app.codecov.io/gh/flux-framework/flux-core/pull/6393?src=pr&el=tree&filepath=src%2Fcmd%2Fflux-start.c&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=flux-framework#diff-c3JjL2NtZC9mbHV4LXN0YXJ0LmM=) | `52.15% <0.00%> (-32.16%)` | :arrow_down: |
| [src/broker/broker.c](https://app.codecov.io/gh/flux-framework/flux-core/pull/6393?src=pr&el=tree&filepath=src%2Fbroker%2Fbroker.c&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=flux-framework#diff-c3JjL2Jyb2tlci9icm9rZXIuYw==) | `56.56% <75.00%> (-20.68%)` | :arrow_down: |

... and [442 files with indirect coverage changes](https://app.codecov.io/gh/flux-framework/flux-core/pull/6393/indirect-changes?src=pr&el=tree-more&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=flux-framework)

garlick commented 3 weeks ago

Alright, setting MWP (merge-when-passing) here.