flux-framework / flux-pmix

flux shell plugin to bootstrap openmpi v5+
GNU Lesser General Public License v3.0

resolve ompi fence hang #29

Closed garlick closed 3 years ago

garlick commented 3 years ago

I'm not sure this is the right solution to #27 but here it is for now.

The main "fix" is just to set OMPI_MCA_btl_tcp_if_include=lo in the test environment, so that ompi ranks that are on different shells but on the same node don't get confused trying to find a working interface for tcp communication.
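For illustration, a minimal sketch of how that setting could be applied in a shell-based test environment. Where exactly the test suite sets it is an assumption here; the only parts taken from the description above are the variable name and its value. `lo` is the usual Linux loopback interface name.

```shell
# Sketch only: restrict Open MPI's TCP BTL to the loopback interface.
# Open MPI reads MCA parameters from OMPI_MCA_<name> environment variables.
OMPI_MCA_btl_tcp_if_include=lo
export OMPI_MCA_btl_tcp_if_include

# Anything launched from this environment inherits the setting;
# print it to confirm what the ranks would see.
env | grep '^OMPI_MCA_btl_tcp_if_include='
```

Limiting the interface list to loopback only makes sense when all ranks are on one node, as in these tests; a real multi-node job would need a routable interface instead.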

There's also a fix for a memory leak I happened across in the fence code, and some improvement to the fence trace output.

Finally, there is some test expansion for the n2p3 case that is of dubious value (but I figured it didn't hurt to leave in).

codecov[bot] commented 3 years ago

Codecov Report

Merging #29 (ff6adbc) into main (cc276d6) will increase coverage by 0.12%. The diff coverage is 100.00%.


@@            Coverage Diff             @@
##             main      #29      +/-   ##
==========================================
+ Coverage   78.27%   78.40%   +0.12%     
==========================================
  Files          10       10              
  Lines        1183     1190       +7     
==========================================
+ Hits          926      933       +7     
  Misses        257      257              
Impacted Files               Coverage Δ
src/shell/plugins/fence.c    84.11% <100.00%> (+0.45%) :arrow_up:
src/shell/plugins/main.c     78.57% <100.00%> (+0.57%) :arrow_up:


garlick commented 3 years ago

Thanks!