open-mpi / ompi

Open MPI main development repository
https://www.open-mpi.org

Unrecognized FT TYPE: LAM #9782

Closed tonideleo closed 2 years ago

tonideleo commented 2 years ago

Background information

I am trying to build Open MPI 5.0.0rc2 to test its fault tolerance capability. However, running ./configure --with-ft=LAM --prefix=... ends with the following error:

*** Fault tolerance
checking if want fault tolerance... Support for C/R FT has been removed in OMPI 5.0
Unrecognized FT TYPE: LAM
configure: error: Cannot continue

Am I missing something? I thought the options were either LAM or cr.

Thank you very much for your time!

System Information

The machine, using uname -a:

Linux n3049 4.18.0-305.19.1.el8_4.x86_64 #1 SMP Wed Sep 15 19:12:32 UTC 2021 x86_64 x86_64 x86_64 GNU/Linux

The CPU, using lscpu:

Architecture:        x86_64
CPU op-mode(s):      32-bit, 64-bit
Byte Order:          Little Endian
CPU(s):              40
On-line CPU(s) list: 0-39
Thread(s) per core:  1
Core(s) per socket:  20
Socket(s):           2
NUMA node(s):        2
Vendor ID:           GenuineIntel
CPU family:          6
Model:               85
Model name:          Intel(R) Xeon(R) Gold 6230 CPU @ 2.10GHz
Stepping:            7
CPU MHz:             800.287
BogoMIPS:            4200.00
Virtualization:      VT-x
L1d cache:           32K
L1i cache:           32K
L2 cache:            1024K
L3 cache:            28160K
NUMA node0 CPU(s):   0-19
NUMA node1 CPU(s):   20-39

Network Info using lspci | egrep -i --color 'network|ethernet':

31:00.0 Ethernet controller: Intel Corporation Ethernet Connection X722 for 10GBASE-T (rev 09)
31:00.1 Ethernet controller: Intel Corporation Ethernet Connection X722 for 10GBASE-T (rev 09)

bosilca commented 2 years ago

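Open MPI no longer supports the cr or LAM types of fault tolerance; as the configure output notes, C/R FT support was removed in 5.0. The only supported type is based on ULFM (User-Level Failure Mitigation), enabled with --with-ft=mpi.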
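
For reference, a minimal sketch of the corrected configure step might look like the following. The install prefix is a hypothetical placeholder (the original command elided it), and the script assumes it is run from the top of an unpacked Open MPI 5.0.0rc2 source tree; only the --with-ft=mpi value comes from the reply above.

```shell
#!/bin/sh
# Assumption: PREFIX is a hypothetical install location; substitute your own.
PREFIX="${PREFIX:-$HOME/opt/ompi-5.0.0rc2}"

# "mpi" selects ULFM-based fault tolerance; "LAM" and "cr" are rejected in 5.0.
FT_TYPE="mpi"

if [ -x ./configure ]; then
    # Run from the top of the Open MPI source tree.
    ./configure --with-ft="$FT_TYPE" --prefix="$PREFIX"
else
    echo "No ./configure here; change into the Open MPI source directory first."
fi
```

If configure finishes without the "Unrecognized FT TYPE" error, the usual make and make install steps should follow as for any other Open MPI build.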