open-mpi / ompi

Open MPI main development repository
https://www.open-mpi.org

Using Open MPI for OpenFOAM cluster parallel computing, the solver executables on other nodes cannot be found #9054

Closed qingfengfenga closed 3 years ago

qingfengfenga commented 3 years ago

I am using two nodes to run an OpenFOAM cluster parallel computing test.

Passwordless SSH and a shared folder have already been configured; a quick sanity check is sketched after the host list below.

192.168.90.41   node2
192.168.90.28   node1
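
Since mpirun launches its remote daemons over ssh in a non-interactive shell, a minimal sanity check (a sketch, assuming the hostnames above and that OpenFOAM is installed under /opt/openfoam8 on both nodes) is to confirm that such a shell on node2 can see the solver and its library path:

$ ssh node2 hostname                  # passwordless SSH works
$ ssh node2 'which icoFoam'           # solver visible to a non-interactive shell
$ ssh node2 'echo $LD_LIBRARY_PATH'   # OpenFOAM library path set for that shell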

I have seen this issue before, but it does not help in my case: #6293

Background information

OpenMPI version

node1

$ mpirun --version
mpirun (Open MPI) 2.1.1

Report bugs to http://www.open-mpi.org/community/help/

node2

$ mpirun --version
mpirun (Open MPI) 2.1.1

Report bugs to http://www.open-mpi.org/community/help/

How to install it

node1/node2

Open MPI installed from the distribution packages.

On what system

node1

* Operating system/version:
$ cat /proc/version
Linux version 5.4.0-74-generic (buildd@lcy01-amd64-023) (gcc version 7.5.0 (Ubuntu 7.5.0-3ubuntu1~18.04)) #83~18.04.1-Ubuntu SMP Tue May 11 16:01:00 UTC 2021

* Computer hardware: 
$ lscpu
Architecture:        x86_64
CPU op-mode(s):      32-bit, 64-bit
Byte Order:          Little Endian
CPU(s):              16
On-line CPU(s) list: 0-15
Thread(s) per core:  2
Core(s) per socket:  8
Socket(s):           1
NUMA node(s):        1
Vendor ID:           AuthenticAMD
CPU family:          23
Model:               113
Model name:          AMD Ryzen 7 3700X 8-Core Processor
Stepping:            0
CPU MHz:             2195.712
CPU max MHz:         3600.0000
CPU min MHz:         2200.0000
BogoMIPS:            7186.50
Virtualization:      AMD-V
L1d cache:           32K
L1i cache:           32K
L2 cache:            512K
L3 cache:            16384K
NUMA node0 CPU(s):   0-15
Flags:           fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt pdpe1gb rdtscp lm constant_tsc rep_good nopl nonstop_tsc cpuid extd_apicid aperfmperf pni pclmulqdq monitor ssse3 fma cx16 sse4_1 sse4_2 movbe popcnt aes xsave avx f16c rdrand lahf_lm cmp_legacy svm extapic cr8_legacy abm sse4a misalignsse 3dnowprefetch osvw ibs skinit wdt tce topoext perfctr_core perfctr_nb bpext perfctr_llc mwaitx cpb cat_l3 cdp_l3 hw_pstate sme ssbd mba sev ibpb stibp vmmcall fsgsbase bmi1 avx2 smep bmi2 cqm rdt_a rdseed adx smap clflushopt clwb sha_ni xsaveopt xsavec xgetbv1 xsaves cqm_llc cqm_occup_llc cqm_mbm_total cqm_mbm_local clzero irperf xsaveerptr wbnoinvd arat npt lbrv svm_lock nrip_save tsc_scale vmcb_clean flushbyasid decodeassists pausefilter pfthreshold avic v_vmsave_vmload vgif umip rdpid overflow_recov succor smca

* Network type: 
$ ifconfig
docker0: flags=4099<UP,BROADCAST,MULTICAST>  mtu 1500
        inet 172.17.0.1  netmask 255.255.0.0  broadcast 172.17.255.255
        ether 02:42:cb:e8:60:de  txqueuelen 0  (Ethernet)
        RX packets 0  bytes 0 (0.0 B)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 0  bytes 0 (0.0 B)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0

enp7s0: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
        inet 192.168.90.28  netmask 255.255.255.0  broadcast 192.168.90.255
        inet6 fe80::1d06:279e:ae6b:d388  prefixlen 64  scopeid 0x20<link>
        ether 04:d9:f5:84:d3:18  txqueuelen 1000  (Ethernet)
        RX packets 845713  bytes 99197634 (99.1 MB)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 730271  bytes 45604796 (45.6 MB)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0

lo: flags=73<UP,LOOPBACK,RUNNING>  mtu 65536
        inet 127.0.0.1  netmask 255.0.0.0
        inet6 ::1  prefixlen 128  scopeid 0x10<host>
        loop  txqueuelen 1000  (Local Loopback)
        RX packets 11262  bytes 1293924 (1.2 MB)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 11262  bytes 1293924 (1.2 MB)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0

node2

* Operating system/version: 
$ cat /proc/version
Linux version 5.4.0-42-generic (buildd@lgw01-amd64-023) (gcc version 7.5.0 (Ubuntu 7.5.0-3ubuntu1~18.04)) #46~18.04.1-Ubuntu SMP Fri Jul 10 07:21:24 UTC 2020

* Computer hardware:
$ lscpu
Architecture:        x86_64
CPU op-mode(s):      32-bit, 64-bit
Byte Order:          Little Endian
CPU(s):              4
On-line CPU(s) list: 0-3
Thread(s) per core:  1
Core(s) per socket:  4
Socket(s):           1
NUMA node(s):        1
Vendor ID:           GenuineIntel
CPU family:          6
Model:               158
Model name:          Intel(R) Xeon(R) CPU E3-1220 v6 @ 3.00GHz
Stepping:            9
CPU MHz:             1000.089
CPU max MHz:         3500.0000
CPU min MHz:         800.0000
BogoMIPS:            6000.00
Virtualization:      VT-x
L1d cache:           32K
L1i cache:           32K
L2 cache:            256K
L3 cache:            8192K
NUMA node0 CPU(s):   0-3
Flags:               fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx pdpe1gb rdtscp lm constant_tsc art arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc cpuid aperfmperf pni pclmulqdq dtes64 monitor ds_cpl vmx smx est tm2 ssse3 sdbg fma cx16 xtpr pdcm pcid sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave avx f16c rdrand lahf_lm abm 3dnowprefetch cpuid_fault epb invpcid_single pti ssbd ibrs ibpb stibp tpr_shadow vnmi flexpriority ept vpid ept_ad fsgsbase tsc_adjust bmi1 hle avx2 smep bmi2 erms invpcid rtm mpx rdseed adx smap clflushopt intel_pt xsaveopt xsavec xgetbv1 xsaves dtherm ida arat pln pts hwp hwp_notify hwp_act_window hwp_epp md_clear flush_l1d

* Network type: 
$ ifconfig
eno1: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
        inet 192.168.90.41  netmask 255.255.255.0  broadcast 192.168.90.255
        inet6 fe80::dd3e:bfcb:7140:3162  prefixlen 64  scopeid 0x20<link>
        ether 50:9a:4c:92:0b:09  txqueuelen 1000  (Ethernet)
        RX packets 6738364  bytes 2054896923 (2.0 GB)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 4029719  bytes 316496074 (316.4 MB)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0
        device interrupt 16  

eno2: flags=4099<UP,BROADCAST,MULTICAST>  mtu 1500
        ether 50:9a:4c:92:0b:0a  txqueuelen 1000  (Ethernet)
        RX packets 0  bytes 0 (0.0 B)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 0  bytes 0 (0.0 B)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0
        device interrupt 17  

lo: flags=73<UP,LOOPBACK,RUNNING>  mtu 65536
        inet 127.0.0.1  netmask 255.0.0.0
        inet6 ::1  prefixlen 128  scopeid 0x10<host>
        loop  txqueuelen 1000  (Local Loopback)
        RX packets 11190  bytes 1067853 (1.0 MB)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 11190  bytes 1067853 (1.0 MB)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0

OpenMPI Configuration

node1

$ ompi_info
                 Package: Open MPI buildd@lcy01-amd64-009 Distribution
                Open MPI: 2.1.1
  Open MPI repo revision: v2.1.0-100-ga2fdb5b
   Open MPI release date: May 10, 2017
                Open RTE: 2.1.1
  Open RTE repo revision: v2.1.0-100-ga2fdb5b
   Open RTE release date: May 10, 2017
                    OPAL: 2.1.1
      OPAL repo revision: v2.1.0-100-ga2fdb5b
       OPAL release date: May 10, 2017
                 MPI API: 3.1.0
            Ident string: 2.1.1
                  Prefix: /usr
 Configured architecture: x86_64-pc-linux-gnu
          Configure host: lcy01-amd64-009
           Configured by: buildd
           Configured on: Mon Feb  5 19:59:59 UTC 2018
          Configure host: lcy01-amd64-009
                Built by: buildd
                Built on: Mon Feb  5 20:05:56 UTC 2018
              Built host: lcy01-amd64-009
              C bindings: yes
            C++ bindings: yes
             Fort mpif.h: yes (all)
            Fort use mpi: yes (full: ignore TKR)
       Fort use mpi size: deprecated-ompi-info-value
        Fort use mpi_f08: yes
 Fort mpi_f08 compliance: The mpi_f08 module is available, but due to
                          limitations in the gfortran compiler, does not
                          support the following: array subsections, direct
                          passthru (where possible) to underlying Open MPI's
                          C functionality
  Fort mpi_f08 subarrays: no
           Java bindings: yes
  Wrapper compiler rpath: disabled
              C compiler: gcc
     C compiler absolute: /usr/bin/gcc
  C compiler family name: GNU
      C compiler version: 7.3.0
            C++ compiler: g++
   C++ compiler absolute: /usr/bin/g++
           Fort compiler: gfortran
       Fort compiler abs: /usr/bin/gfortran
         Fort ignore TKR: yes (!GCC$ ATTRIBUTES NO_ARG_CHECK ::)
   Fort 08 assumed shape: yes
      Fort optional args: yes
          Fort INTERFACE: yes
    Fort ISO_FORTRAN_ENV: yes
       Fort STORAGE_SIZE: yes
      Fort BIND(C) (all): yes
      Fort ISO_C_BINDING: yes
 Fort SUBROUTINE BIND(C): yes
       Fort TYPE,BIND(C): yes
 Fort T,BIND(C,name="a"): yes
            Fort PRIVATE: yes
          Fort PROTECTED: yes
           Fort ABSTRACT: yes
       Fort ASYNCHRONOUS: yes
          Fort PROCEDURE: yes
         Fort USE...ONLY: yes
           Fort C_FUNLOC: yes
 Fort f08 using wrappers: yes
         Fort MPI_SIZEOF: yes
             C profiling: yes
           C++ profiling: yes
   Fort mpif.h profiling: yes
  Fort use mpi profiling: yes
   Fort use mpi_f08 prof: yes
          C++ exceptions: no
          Thread support: posix (MPI_THREAD_MULTIPLE: yes, OPAL support: yes,
                          OMPI progress: no, ORTE progress: yes, Event lib:
                          yes)
           Sparse Groups: no
  Internal debug support: no
  MPI interface warnings: yes
     MPI parameter check: runtime
Memory profiling support: no
Memory debugging support: no
              dl support: yes
   Heterogeneous support: yes
 mpirun default --prefix: no
         MPI I/O support: yes
       MPI_WTIME support: native
     Symbol vis. support: yes
   Host topology support: yes
          MPI extensions: affinity, cuda
  MPI_MAX_PROCESSOR_NAME: 256
    MPI_MAX_ERROR_STRING: 256
     MPI_MAX_OBJECT_NAME: 64
        MPI_MAX_INFO_KEY: 36
        MPI_MAX_INFO_VAL: 256
       MPI_MAX_PORT_NAME: 1024
  MPI_MAX_DATAREP_STRING: 128
           MCA allocator: basic (MCA v2.1.0, API v2.0.0, Component v2.1.1)
           MCA allocator: bucket (MCA v2.1.0, API v2.0.0, Component v2.1.1)
           MCA backtrace: execinfo (MCA v2.1.0, API v2.0.0, Component v2.1.1)
                 MCA btl: openib (MCA v2.1.0, API v3.0.0, Component v2.1.1)
                 MCA btl: tcp (MCA v2.1.0, API v3.0.0, Component v2.1.1)
                 MCA btl: vader (MCA v2.1.0, API v3.0.0, Component v2.1.1)
                 MCA btl: sm (MCA v2.1.0, API v3.0.0, Component v2.1.1)
                 MCA btl: self (MCA v2.1.0, API v3.0.0, Component v2.1.1)
                  MCA dl: dlopen (MCA v2.1.0, API v1.0.0, Component v2.1.1)
               MCA event: libevent2022 (MCA v2.1.0, API v2.0.0, Component
                          v2.1.1)
               MCA hwloc: external (MCA v2.1.0, API v2.0.0, Component v2.1.1)
                  MCA if: linux_ipv6 (MCA v2.1.0, API v2.0.0, Component
                          v2.1.1)
                  MCA if: posix_ipv4 (MCA v2.1.0, API v2.0.0, Component
                          v2.1.1)
         MCA installdirs: env (MCA v2.1.0, API v2.0.0, Component v2.1.1)
         MCA installdirs: config (MCA v2.1.0, API v2.0.0, Component v2.1.1)
              MCA memory: patcher (MCA v2.1.0, API v2.0.0, Component v2.1.1)
               MCA mpool: hugepage (MCA v2.1.0, API v3.0.0, Component v2.1.1)
             MCA patcher: overwrite (MCA v2.1.0, API v1.0.0, Component
                          v2.1.1)
                MCA pmix: pmix112 (MCA v2.1.0, API v2.0.0, Component v2.1.1)
               MCA pstat: linux (MCA v2.1.0, API v2.0.0, Component v2.1.1)
              MCA rcache: grdma (MCA v2.1.0, API v3.3.0, Component v2.1.1)
                 MCA sec: basic (MCA v2.1.0, API v1.0.0, Component v2.1.1)
               MCA shmem: mmap (MCA v2.1.0, API v2.0.0, Component v2.1.1)
               MCA shmem: sysv (MCA v2.1.0, API v2.0.0, Component v2.1.1)
               MCA shmem: posix (MCA v2.1.0, API v2.0.0, Component v2.1.1)
               MCA timer: linux (MCA v2.1.0, API v2.0.0, Component v2.1.1)
                 MCA dfs: test (MCA v2.1.0, API v1.0.0, Component v2.1.1)
                 MCA dfs: app (MCA v2.1.0, API v1.0.0, Component v2.1.1)
                 MCA dfs: orted (MCA v2.1.0, API v1.0.0, Component v2.1.1)
              MCA errmgr: default_app (MCA v2.1.0, API v3.0.0, Component
                          v2.1.1)
              MCA errmgr: default_tool (MCA v2.1.0, API v3.0.0, Component
                          v2.1.1)
              MCA errmgr: default_orted (MCA v2.1.0, API v3.0.0, Component
                          v2.1.1)
              MCA errmgr: default_hnp (MCA v2.1.0, API v3.0.0, Component
                          v2.1.1)
                 MCA ess: pmi (MCA v2.1.0, API v3.0.0, Component v2.1.1)
                 MCA ess: env (MCA v2.1.0, API v3.0.0, Component v2.1.1)
                 MCA ess: singleton (MCA v2.1.0, API v3.0.0, Component
                          v2.1.1)
                 MCA ess: hnp (MCA v2.1.0, API v3.0.0, Component v2.1.1)
                 MCA ess: slurm (MCA v2.1.0, API v3.0.0, Component v2.1.1)
                 MCA ess: tool (MCA v2.1.0, API v3.0.0, Component v2.1.1)
               MCA filem: raw (MCA v2.1.0, API v2.0.0, Component v2.1.1)
             MCA grpcomm: direct (MCA v2.1.0, API v3.0.0, Component v2.1.1)
                 MCA iof: mr_orted (MCA v2.1.0, API v2.0.0, Component v2.1.1)
                 MCA iof: orted (MCA v2.1.0, API v2.0.0, Component v2.1.1)
                 MCA iof: mr_hnp (MCA v2.1.0, API v2.0.0, Component v2.1.1)
                 MCA iof: tool (MCA v2.1.0, API v2.0.0, Component v2.1.1)
                 MCA iof: hnp (MCA v2.1.0, API v2.0.0, Component v2.1.1)
            MCA notifier: syslog (MCA v2.1.0, API v1.0.0, Component v2.1.1)
                MCA odls: default (MCA v2.1.0, API v2.0.0, Component v2.1.1)
                 MCA oob: tcp (MCA v2.1.0, API v2.0.0, Component v2.1.1)
                 MCA oob: usock (MCA v2.1.0, API v2.0.0, Component v2.1.1)
                 MCA oob: ud (MCA v2.1.0, API v2.0.0, Component v2.1.1)
                 MCA plm: slurm (MCA v2.1.0, API v2.0.0, Component v2.1.1)
                 MCA plm: isolated (MCA v2.1.0, API v2.0.0, Component v2.1.1)
                 MCA plm: rsh (MCA v2.1.0, API v2.0.0, Component v2.1.1)
                 MCA ras: loadleveler (MCA v2.1.0, API v2.0.0, Component
                          v2.1.1)
                 MCA ras: gridengine (MCA v2.1.0, API v2.0.0, Component
                          v2.1.1)
                 MCA ras: slurm (MCA v2.1.0, API v2.0.0, Component v2.1.1)
                 MCA ras: simulator (MCA v2.1.0, API v2.0.0, Component
                          v2.1.1)
               MCA rmaps: ppr (MCA v2.1.0, API v2.0.0, Component v2.1.1)
               MCA rmaps: resilient (MCA v2.1.0, API v2.0.0, Component
                          v2.1.1)
               MCA rmaps: rank_file (MCA v2.1.0, API v2.0.0, Component
                          v2.1.1)
               MCA rmaps: seq (MCA v2.1.0, API v2.0.0, Component v2.1.1)
               MCA rmaps: staged (MCA v2.1.0, API v2.0.0, Component v2.1.1)
               MCA rmaps: mindist (MCA v2.1.0, API v2.0.0, Component v2.1.1)
               MCA rmaps: round_robin (MCA v2.1.0, API v2.0.0, Component
                          v2.1.1)
                 MCA rml: oob (MCA v2.1.0, API v2.0.0, Component v2.1.1)
              MCA routed: debruijn (MCA v2.1.0, API v2.0.0, Component v2.1.1)
              MCA routed: binomial (MCA v2.1.0, API v2.0.0, Component v2.1.1)
              MCA routed: radix (MCA v2.1.0, API v2.0.0, Component v2.1.1)
              MCA routed: direct (MCA v2.1.0, API v2.0.0, Component v2.1.1)
                 MCA rtc: freq (MCA v2.1.0, API v1.0.0, Component v2.1.1)
                 MCA rtc: hwloc (MCA v2.1.0, API v1.0.0, Component v2.1.1)
              MCA schizo: ompi (MCA v2.1.0, API v1.0.0, Component v2.1.1)
               MCA state: staged_hnp (MCA v2.1.0, API v1.0.0, Component
                          v2.1.1)
               MCA state: dvm (MCA v2.1.0, API v1.0.0, Component v2.1.1)
               MCA state: novm (MCA v2.1.0, API v1.0.0, Component v2.1.1)
               MCA state: tool (MCA v2.1.0, API v1.0.0, Component v2.1.1)
               MCA state: staged_orted (MCA v2.1.0, API v1.0.0, Component
                          v2.1.1)
               MCA state: orted (MCA v2.1.0, API v1.0.0, Component v2.1.1)
               MCA state: app (MCA v2.1.0, API v1.0.0, Component v2.1.1)
               MCA state: hnp (MCA v2.1.0, API v1.0.0, Component v2.1.1)
                 MCA bml: r2 (MCA v2.1.0, API v2.0.0, Component v2.1.1)
                MCA coll: basic (MCA v2.1.0, API v2.0.0, Component v2.1.1)
                MCA coll: libnbc (MCA v2.1.0, API v2.0.0, Component v2.1.1)
                MCA coll: self (MCA v2.1.0, API v2.0.0, Component v2.1.1)
                MCA coll: inter (MCA v2.1.0, API v2.0.0, Component v2.1.1)
                MCA coll: tuned (MCA v2.1.0, API v2.0.0, Component v2.1.1)
                MCA coll: sync (MCA v2.1.0, API v2.0.0, Component v2.1.1)
                MCA coll: sm (MCA v2.1.0, API v2.0.0, Component v2.1.1)
                MCA fbtl: posix (MCA v2.1.0, API v2.0.0, Component v2.1.1)
               MCA fcoll: individual (MCA v2.1.0, API v2.0.0, Component
                          v2.1.1)
               MCA fcoll: two_phase (MCA v2.1.0, API v2.0.0, Component
                          v2.1.1)
               MCA fcoll: dynamic (MCA v2.1.0, API v2.0.0, Component v2.1.1)
               MCA fcoll: static (MCA v2.1.0, API v2.0.0, Component v2.1.1)
               MCA fcoll: dynamic_gen2 (MCA v2.1.0, API v2.0.0, Component
                          v2.1.1)
                  MCA fs: ufs (MCA v2.1.0, API v2.0.0, Component v2.1.1)
                  MCA io: ompio (MCA v2.1.0, API v2.0.0, Component v2.1.1)
                  MCA io: romio314 (MCA v2.1.0, API v2.0.0, Component v2.1.1)
                 MCA mtl: ofi (MCA v2.1.0, API v2.0.0, Component v2.1.1)
                 MCA mtl: psm (MCA v2.1.0, API v2.0.0, Component v2.1.1)
                 MCA osc: pt2pt (MCA v2.1.0, API v3.0.0, Component v2.1.1)
                 MCA osc: sm (MCA v2.1.0, API v3.0.0, Component v2.1.1)
                 MCA osc: rdma (MCA v2.1.0, API v3.0.0, Component v2.1.1)
                 MCA pml: v (MCA v2.1.0, API v2.0.0, Component v2.1.1)
                 MCA pml: cm (MCA v2.1.0, API v2.0.0, Component v2.1.1)
                 MCA pml: ob1 (MCA v2.1.0, API v2.0.0, Component v2.1.1)
                 MCA rte: orte (MCA v2.1.0, API v2.0.0, Component v2.1.1)
            MCA sharedfp: lockedfile (MCA v2.1.0, API v2.0.0, Component
                          v2.1.1)
            MCA sharedfp: sm (MCA v2.1.0, API v2.0.0, Component v2.1.1)
            MCA sharedfp: individual (MCA v2.1.0, API v2.0.0, Component
                          v2.1.1)
                MCA topo: basic (MCA v2.1.0, API v2.2.0, Component v2.1.1)
           MCA vprotocol: pessimist (MCA v2.1.0, API v2.0.0, Component
                          v2.1.1)

node2

$ ompi_info
                 Package: Open MPI buildd@lcy01-amd64-009 Distribution
                Open MPI: 2.1.1
  Open MPI repo revision: v2.1.0-100-ga2fdb5b
   Open MPI release date: May 10, 2017
                Open RTE: 2.1.1
  Open RTE repo revision: v2.1.0-100-ga2fdb5b
   Open RTE release date: May 10, 2017
                    OPAL: 2.1.1
      OPAL repo revision: v2.1.0-100-ga2fdb5b
       OPAL release date: May 10, 2017
                 MPI API: 3.1.0
            Ident string: 2.1.1
                  Prefix: /usr
 Configured architecture: x86_64-pc-linux-gnu
          Configure host: lcy01-amd64-009
           Configured by: buildd
           Configured on: Mon Feb  5 19:59:59 UTC 2018
          Configure host: lcy01-amd64-009
                Built by: buildd
                Built on: Mon Feb  5 20:05:56 UTC 2018
              Built host: lcy01-amd64-009
              C bindings: yes
            C++ bindings: yes
             Fort mpif.h: yes (all)
            Fort use mpi: yes (full: ignore TKR)
       Fort use mpi size: deprecated-ompi-info-value
        Fort use mpi_f08: yes
 Fort mpi_f08 compliance: The mpi_f08 module is available, but due to
                          limitations in the gfortran compiler, does not
                          support the following: array subsections, direct
                          passthru (where possible) to underlying Open MPI's
                          C functionality
  Fort mpi_f08 subarrays: no
           Java bindings: yes
  Wrapper compiler rpath: disabled
              C compiler: gcc
     C compiler absolute: /usr/bin/gcc
  C compiler family name: GNU
      C compiler version: 7.3.0
            C++ compiler: g++
   C++ compiler absolute: /usr/bin/g++
           Fort compiler: gfortran
       Fort compiler abs: /usr/bin/gfortran
         Fort ignore TKR: yes (!GCC$ ATTRIBUTES NO_ARG_CHECK ::)
   Fort 08 assumed shape: yes
      Fort optional args: yes
          Fort INTERFACE: yes
    Fort ISO_FORTRAN_ENV: yes
       Fort STORAGE_SIZE: yes
      Fort BIND(C) (all): yes
      Fort ISO_C_BINDING: yes
 Fort SUBROUTINE BIND(C): yes
       Fort TYPE,BIND(C): yes
 Fort T,BIND(C,name="a"): yes
            Fort PRIVATE: yes
          Fort PROTECTED: yes
           Fort ABSTRACT: yes
       Fort ASYNCHRONOUS: yes
          Fort PROCEDURE: yes
         Fort USE...ONLY: yes
           Fort C_FUNLOC: yes
 Fort f08 using wrappers: yes
         Fort MPI_SIZEOF: yes
             C profiling: yes
           C++ profiling: yes
   Fort mpif.h profiling: yes
  Fort use mpi profiling: yes
   Fort use mpi_f08 prof: yes
          C++ exceptions: no
          Thread support: posix (MPI_THREAD_MULTIPLE: yes, OPAL support: yes,
                          OMPI progress: no, ORTE progress: yes, Event lib:
                          yes)
           Sparse Groups: no
  Internal debug support: no
  MPI interface warnings: yes
     MPI parameter check: runtime
Memory profiling support: no
Memory debugging support: no
              dl support: yes
   Heterogeneous support: yes
 mpirun default --prefix: no
         MPI I/O support: yes
       MPI_WTIME support: native
     Symbol vis. support: yes
   Host topology support: yes
          MPI extensions: affinity, cuda
  MPI_MAX_PROCESSOR_NAME: 256
    MPI_MAX_ERROR_STRING: 256
     MPI_MAX_OBJECT_NAME: 64
        MPI_MAX_INFO_KEY: 36
        MPI_MAX_INFO_VAL: 256
       MPI_MAX_PORT_NAME: 1024
  MPI_MAX_DATAREP_STRING: 128
           MCA allocator: basic (MCA v2.1.0, API v2.0.0, Component v2.1.1)
           MCA allocator: bucket (MCA v2.1.0, API v2.0.0, Component v2.1.1)
           MCA backtrace: execinfo (MCA v2.1.0, API v2.0.0, Component v2.1.1)
                 MCA btl: sm (MCA v2.1.0, API v3.0.0, Component v2.1.1)
                 MCA btl: tcp (MCA v2.1.0, API v3.0.0, Component v2.1.1)
                 MCA btl: self (MCA v2.1.0, API v3.0.0, Component v2.1.1)
                 MCA btl: openib (MCA v2.1.0, API v3.0.0, Component v2.1.1)
                 MCA btl: vader (MCA v2.1.0, API v3.0.0, Component v2.1.1)
                  MCA dl: dlopen (MCA v2.1.0, API v1.0.0, Component v2.1.1)
               MCA event: libevent2022 (MCA v2.1.0, API v2.0.0, Component
                          v2.1.1)
               MCA hwloc: external (MCA v2.1.0, API v2.0.0, Component v2.1.1)
                  MCA if: linux_ipv6 (MCA v2.1.0, API v2.0.0, Component
                          v2.1.1)
                  MCA if: posix_ipv4 (MCA v2.1.0, API v2.0.0, Component
                          v2.1.1)
         MCA installdirs: env (MCA v2.1.0, API v2.0.0, Component v2.1.1)
         MCA installdirs: config (MCA v2.1.0, API v2.0.0, Component v2.1.1)
              MCA memory: patcher (MCA v2.1.0, API v2.0.0, Component v2.1.1)
               MCA mpool: hugepage (MCA v2.1.0, API v3.0.0, Component v2.1.1)
             MCA patcher: overwrite (MCA v2.1.0, API v1.0.0, Component
                          v2.1.1)
                MCA pmix: pmix112 (MCA v2.1.0, API v2.0.0, Component v2.1.1)
               MCA pstat: linux (MCA v2.1.0, API v2.0.0, Component v2.1.1)
              MCA rcache: grdma (MCA v2.1.0, API v3.3.0, Component v2.1.1)
                 MCA sec: basic (MCA v2.1.0, API v1.0.0, Component v2.1.1)
               MCA shmem: sysv (MCA v2.1.0, API v2.0.0, Component v2.1.1)
               MCA shmem: posix (MCA v2.1.0, API v2.0.0, Component v2.1.1)
               MCA shmem: mmap (MCA v2.1.0, API v2.0.0, Component v2.1.1)
               MCA timer: linux (MCA v2.1.0, API v2.0.0, Component v2.1.1)
                 MCA dfs: test (MCA v2.1.0, API v1.0.0, Component v2.1.1)
                 MCA dfs: app (MCA v2.1.0, API v1.0.0, Component v2.1.1)
                 MCA dfs: orted (MCA v2.1.0, API v1.0.0, Component v2.1.1)
              MCA errmgr: default_orted (MCA v2.1.0, API v3.0.0, Component
                          v2.1.1)
              MCA errmgr: default_app (MCA v2.1.0, API v3.0.0, Component
                          v2.1.1)
              MCA errmgr: default_tool (MCA v2.1.0, API v3.0.0, Component
                          v2.1.1)
              MCA errmgr: default_hnp (MCA v2.1.0, API v3.0.0, Component
                          v2.1.1)
                 MCA ess: tool (MCA v2.1.0, API v3.0.0, Component v2.1.1)
                 MCA ess: hnp (MCA v2.1.0, API v3.0.0, Component v2.1.1)
                 MCA ess: singleton (MCA v2.1.0, API v3.0.0, Component
                          v2.1.1)
                 MCA ess: pmi (MCA v2.1.0, API v3.0.0, Component v2.1.1)
                 MCA ess: slurm (MCA v2.1.0, API v3.0.0, Component v2.1.1)
                 MCA ess: env (MCA v2.1.0, API v3.0.0, Component v2.1.1)
               MCA filem: raw (MCA v2.1.0, API v2.0.0, Component v2.1.1)
             MCA grpcomm: direct (MCA v2.1.0, API v3.0.0, Component v2.1.1)
                 MCA iof: mr_hnp (MCA v2.1.0, API v2.0.0, Component v2.1.1)
                 MCA iof: orted (MCA v2.1.0, API v2.0.0, Component v2.1.1)
                 MCA iof: tool (MCA v2.1.0, API v2.0.0, Component v2.1.1)
                 MCA iof: mr_orted (MCA v2.1.0, API v2.0.0, Component v2.1.1)
                 MCA iof: hnp (MCA v2.1.0, API v2.0.0, Component v2.1.1)
            MCA notifier: syslog (MCA v2.1.0, API v1.0.0, Component v2.1.1)
                MCA odls: default (MCA v2.1.0, API v2.0.0, Component v2.1.1)
                 MCA oob: ud (MCA v2.1.0, API v2.0.0, Component v2.1.1)
                 MCA oob: usock (MCA v2.1.0, API v2.0.0, Component v2.1.1)
                 MCA oob: tcp (MCA v2.1.0, API v2.0.0, Component v2.1.1)
                 MCA plm: isolated (MCA v2.1.0, API v2.0.0, Component v2.1.1)
                 MCA plm: rsh (MCA v2.1.0, API v2.0.0, Component v2.1.1)
                 MCA plm: slurm (MCA v2.1.0, API v2.0.0, Component v2.1.1)
                 MCA ras: gridengine (MCA v2.1.0, API v2.0.0, Component
                          v2.1.1)
                 MCA ras: loadleveler (MCA v2.1.0, API v2.0.0, Component
                          v2.1.1)
                 MCA ras: simulator (MCA v2.1.0, API v2.0.0, Component
                          v2.1.1)
                 MCA ras: slurm (MCA v2.1.0, API v2.0.0, Component v2.1.1)
               MCA rmaps: rank_file (MCA v2.1.0, API v2.0.0, Component
                          v2.1.1)
               MCA rmaps: resilient (MCA v2.1.0, API v2.0.0, Component
                          v2.1.1)
               MCA rmaps: round_robin (MCA v2.1.0, API v2.0.0, Component
                          v2.1.1)
               MCA rmaps: mindist (MCA v2.1.0, API v2.0.0, Component v2.1.1)
               MCA rmaps: seq (MCA v2.1.0, API v2.0.0, Component v2.1.1)
               MCA rmaps: staged (MCA v2.1.0, API v2.0.0, Component v2.1.1)
               MCA rmaps: ppr (MCA v2.1.0, API v2.0.0, Component v2.1.1)
                 MCA rml: oob (MCA v2.1.0, API v2.0.0, Component v2.1.1)
              MCA routed: direct (MCA v2.1.0, API v2.0.0, Component v2.1.1)
              MCA routed: radix (MCA v2.1.0, API v2.0.0, Component v2.1.1)
              MCA routed: debruijn (MCA v2.1.0, API v2.0.0, Component v2.1.1)
              MCA routed: binomial (MCA v2.1.0, API v2.0.0, Component v2.1.1)
                 MCA rtc: hwloc (MCA v2.1.0, API v1.0.0, Component v2.1.1)
                 MCA rtc: freq (MCA v2.1.0, API v1.0.0, Component v2.1.1)
              MCA schizo: ompi (MCA v2.1.0, API v1.0.0, Component v2.1.1)
               MCA state: novm (MCA v2.1.0, API v1.0.0, Component v2.1.1)
               MCA state: staged_orted (MCA v2.1.0, API v1.0.0, Component
                          v2.1.1)
               MCA state: hnp (MCA v2.1.0, API v1.0.0, Component v2.1.1)
               MCA state: orted (MCA v2.1.0, API v1.0.0, Component v2.1.1)
               MCA state: tool (MCA v2.1.0, API v1.0.0, Component v2.1.1)
               MCA state: app (MCA v2.1.0, API v1.0.0, Component v2.1.1)
               MCA state: dvm (MCA v2.1.0, API v1.0.0, Component v2.1.1)
               MCA state: staged_hnp (MCA v2.1.0, API v1.0.0, Component
                          v2.1.1)
                 MCA bml: r2 (MCA v2.1.0, API v2.0.0, Component v2.1.1)
                MCA coll: sync (MCA v2.1.0, API v2.0.0, Component v2.1.1)
                MCA coll: tuned (MCA v2.1.0, API v2.0.0, Component v2.1.1)
                MCA coll: basic (MCA v2.1.0, API v2.0.0, Component v2.1.1)
                MCA coll: libnbc (MCA v2.1.0, API v2.0.0, Component v2.1.1)
                MCA coll: sm (MCA v2.1.0, API v2.0.0, Component v2.1.1)
                MCA coll: inter (MCA v2.1.0, API v2.0.0, Component v2.1.1)
                MCA coll: self (MCA v2.1.0, API v2.0.0, Component v2.1.1)
                MCA fbtl: posix (MCA v2.1.0, API v2.0.0, Component v2.1.1)
               MCA fcoll: individual (MCA v2.1.0, API v2.0.0, Component
                          v2.1.1)
               MCA fcoll: static (MCA v2.1.0, API v2.0.0, Component v2.1.1)
               MCA fcoll: dynamic (MCA v2.1.0, API v2.0.0, Component v2.1.1)
               MCA fcoll: two_phase (MCA v2.1.0, API v2.0.0, Component
                          v2.1.1)
               MCA fcoll: dynamic_gen2 (MCA v2.1.0, API v2.0.0, Component
                          v2.1.1)
                  MCA fs: ufs (MCA v2.1.0, API v2.0.0, Component v2.1.1)
                  MCA io: ompio (MCA v2.1.0, API v2.0.0, Component v2.1.1)
                  MCA io: romio314 (MCA v2.1.0, API v2.0.0, Component v2.1.1)
                 MCA mtl: ofi (MCA v2.1.0, API v2.0.0, Component v2.1.1)
                 MCA mtl: psm (MCA v2.1.0, API v2.0.0, Component v2.1.1)
                 MCA osc: pt2pt (MCA v2.1.0, API v3.0.0, Component v2.1.1)
                 MCA osc: sm (MCA v2.1.0, API v3.0.0, Component v2.1.1)
                 MCA osc: rdma (MCA v2.1.0, API v3.0.0, Component v2.1.1)
                 MCA pml: v (MCA v2.1.0, API v2.0.0, Component v2.1.1)
                 MCA pml: ob1 (MCA v2.1.0, API v2.0.0, Component v2.1.1)
                 MCA pml: cm (MCA v2.1.0, API v2.0.0, Component v2.1.1)
                 MCA rte: orte (MCA v2.1.0, API v2.0.0, Component v2.1.1)
            MCA sharedfp: sm (MCA v2.1.0, API v2.0.0, Component v2.1.1)
            MCA sharedfp: individual (MCA v2.1.0, API v2.0.0, Component
                          v2.1.1)
            MCA sharedfp: lockedfile (MCA v2.1.0, API v2.0.0, Component
                          v2.1.1)
                MCA topo: basic (MCA v2.1.0, API v2.2.0, Component v2.1.1)
           MCA vprotocol: pessimist (MCA v2.1.0, API v2.0.0, Component
                          v2.1.1)

Details of the problem

The official documentation is too brief. I emailed openfoam.org about this; they said OpenFOAM is compatible with Open MPI, that they have many clusters running normally, and asked me to check my setup again. They could not provide substantive help.

Single-node computing and single-node multi-core parallel computing both work without problems.

# mpirun --allow-run-as-root -np 8 icoFoam -parallel
/*---------------------------------------------------------------------------*\
  =========                 |
  \\      /  F ield         | OpenFOAM: The Open Source CFD Toolbox
   \\    /   O peration     | Website:  https://openfoam.org
    \\  /    A nd           | Version:  8
     \\/     M anipulation  |
\*---------------------------------------------------------------------------*/
Build  : 8-1c9b5879390b
Exec   : icoFoam -parallel
Date   : Jun 09 2021
Time   : 11:37:38
Host   : "digitwin-System-Product-Name"
PID    : 14001
I/O    : uncollated
Case   : /tmp/penn/cavity
nProcs : 8
Slaves : 
7
(
"digitwin-System-Product-Name.14002"
"digitwin-System-Product-Name.14003"
"digitwin-System-Product-Name.14004"
"digitwin-System-Product-Name.14005"
"digitwin-System-Product-Name.14006"
"digitwin-System-Product-Name.14007"
"digitwin-System-Product-Name.14008"
)

Pstream initialized with:
    floatTransfer      : 0
    nProcsSimpleSum    : 0
    commsType          : nonBlocking
    polling iterations : 0
sigFpe : Enabling floating point exception trapping (FOAM_SIGFPE).
fileModificationChecking : Monitoring run-time modified files using timeStampMaster (fileModificationSkew 10)
allowSystemOperations : Allowing user-supplied system call operations

// * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * //
Create time

Create mesh for time = 0

Reading transportProperties

Reading field p

Reading field U

Reading/calculating face flux field phi

Starting time loop

Time = 0.005

Courant Number mean: 0 max: 0
smoothSolver:  Solving for Ux, Initial residual = 1, Final residual = 9.65374e-06, No Iterations 25
smoothSolver:  Solving for Uy, Initial residual = 0, Final residual = 0, No Iterations 0
DICPCG:  Solving for p, Initial residual = 1, Final residual = 0.0414228, No Iterations 26
time step continuity errors : sum local = 0.000276791, global = 5.29396e-20, cumulative = 5.29396e-20
DICPCG:  Solving for p, Initial residual = 0.470734, Final residual = 6.39106e-07, No Iterations 46
time step continuity errors : sum local = 8.09622e-09, global = 5.45774e-19, cumulative = 5.98713e-19
ExecutionTime = 0.01 s  ClockTime = 0 s
......
......

Time = 0.5

Courant Number mean: 0.222158 max: 0.852134
smoothSolver:  Solving for Ux, Initial residual = 2.3496e-07, Final residual = 2.3496e-07, No Iterations 0
smoothSolver:  Solving for Uy, Initial residual = 5.12666e-07, Final residual = 5.12666e-07, No Iterations 0
DICPCG:  Solving for p, Initial residual = 9.22494e-07, Final residual = 9.22494e-07, No Iterations 0
time step continuity errors : sum local = 9.36271e-09, global = 6.57774e-19, cumulative = 6.87762e-17
DICPCG:  Solving for p, Initial residual = 9.70812e-07, Final residual = 9.70812e-07, No Iterations 0
time step continuity errors : sum local = 9.76463e-09, global = -7.94093e-21, cumulative = 6.87682e-17
ExecutionTime = 0.11 s  ClockTime = 1 s

End

Finalising parallel run

When I run OpenFOAM multi-node parallel computing, it reports that the solver executable cannot be found on the other node. When I log in to that machine to check, the file exists, it runs normally, and the path is correct.

$ mpirun --allow-run-as-root --hostfile machines -np 8 icoFoam -parallel
[digitwin-System-Product-Name:13938] [[16385,0],0] usock_peer_send_blocking: send() to socket 48 failed: Broken pipe (32)
[digitwin-System-Product-Name:13938] [[16385,0],0] ORTE_ERROR_LOG: Unreachable in file oob_usock_connection.c at line 316
[digitwin-System-Product-Name:13938] [[16385,0],0]-[[16385,1],4] usock_peer_accept: usock_peer_send_connect_ack failed
[digitwin-System-Product-Name:13938] [[16385,0],0] usock_peer_send_blocking: send() to socket 49 failed: Broken pipe (32)
[digitwin-System-Product-Name:13938] [[16385,0],0] ORTE_ERROR_LOG: Unreachable in file oob_usock_connection.c at line 316
[digitwin-System-Product-Name:13938] [[16385,0],0]-[[16385,1],6] usock_peer_accept: usock_peer_send_connect_ack failed
[digitwin-System-Product-Name:13938] [[16385,0],0] usock_peer_send_blocking: send() to socket 50 failed: Broken pipe (32)
[digitwin-System-Product-Name:13938] [[16385,0],0] ORTE_ERROR_LOG: Unreachable in file oob_usock_connection.c at line 316
[digitwin-System-Product-Name:13938] [[16385,0],0]-[[16385,1],7] usock_peer_accept: usock_peer_send_connect_ack failed
[digitwin-System-Product-Name:13938] [[16385,0],0] usock_peer_send_blocking: send() to socket 51 failed: Broken pipe (32)
[digitwin-System-Product-Name:13938] [[16385,0],0] ORTE_ERROR_LOG: Unreachable in file oob_usock_connection.c at line 316
[digitwin-System-Product-Name:13938] [[16385,0],0]-[[16385,1],5] usock_peer_accept: usock_peer_send_connect_ack failed
--------------------------------------------------------------------------
mpirun was unable to find the specified executable file, and therefore
did not launch the job.  This error was first reported for process
rank 0; it may have occurred for other processes as well.

NOTE: A common cause for this error is misspelling a mpirun command
      line parameter option (remember that mpirun interprets the first
      unrecognized command line token as the executable).

Node:       node2
Executable: /opt/openfoam8/platforms/linux64GccDPInt32Opt/bin/icoFoam
--------------------------------------------------------------------------
4 total processes failed to start

node1:$ ls /opt/openfoam8/platforms/linux64GccDPInt32Opt/bin/icoFoam
/opt/openfoam8/platforms/linux64GccDPInt32Opt/bin/icoFoam

node1:$ ssh node2
Welcome to Ubuntu 18.04.5 LTS (GNU/Linux 5.4.0-42-generic x86_64)

 * Documentation:  https://help.ubuntu.com
 * Management:     https://landscape.canonical.com
 * Support:        https://ubuntu.com/advantage

 * Canonical Livepatch is available for installation.
   - Reduce system reboots and improve kernel security. Activate at:
     https://ubuntu.com/livepatch

7 updates can be applied immediately.
1 of these updates is a standard security update.
To see these additional updates run: apt list --upgradable

New release '20.04.2 LTS' available.
Run 'do-release-upgrade' to upgrade to it.

Your Hardware Enablement Stack (HWE) is supported until April 2023.
*** System restart required ***
Last login: Wed Jun  9 10:56:52 2021 from 192.168.91.254

node2:$ ls /opt/openfoam8/platforms/linux64GccDPInt32Opt/bin/icoFoam
/opt/openfoam8/platforms/linux64GccDPInt32Opt/bin/icoFoam

When I run multi-node parallel computing with a core count that does not exceed what node1 has, it works normally. In other words, Open MPI does not appear to be using the cores of the remote node. However, a plain Open MPI test across the nodes works fine.

The Open MPI tests themselves run normally:

$ cat machines 
node2 cpu = 4
node1 cpu = 4
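
As a side note, the commonly documented Open MPI hostfile syntax uses the slots keyword; whether the "cpu =" form above is accepted may depend on the Open MPI version. An equivalent hostfile (assuming 4 usable cores per node, as in the file above) would be:

node2 slots=4
node1 slots=4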

# Test on physical computer
$ mpirun --allow-run-as-root --hostfile machines -np 8 sh -c 'echo $(hostname):hello'
digitwin-System-Product-Name:hello
digitwin-System-Product-Name:hello
digitwin-System-Product-Name:hello
digitwin-System-Product-Name:hello
dt-PowerEdge-T330:hello
dt-PowerEdge-T330:hello
dt-PowerEdge-T330:hello
dt-PowerEdge-T330:hello

# Testing on virtual machine
$ mpirun --allow-run-as-root --hostfile machines -n 8 ./mpi_hello_world
Hello world from processor dyfluid, rank 1 out of 8 processors
Hello world from processor dyfluid, rank 0 out of 8 processors
Hello world from processor dyfluid, rank 2 out of 8 processors
Hello world from processor dyfluid, rank 3 out of 8 processors
Hello world from processor dyfluid, rank 5 out of 8 processors
Hello world from processor dyfluid, rank 6 out of 8 processors
Hello world from processor dyfluid, rank 4 out of 8 processors
Hello world from processor dyfluid, rank 7 out of 8 processors

Reproduction document


### Quick reproduction of the problem

The behaviour is the same in a virtual machine and in a container.

In a VMware environment, you can download the virtual machine image from this page for testing: http://www.dyfluid.com/docs/install.html

#### Container environment

In a Docker environment, download the openfoam/openfoam8-paraview56 image.

Start the containers with the docker-compose script below.

- Both containers map the same folder.

- Please ignore the port mappings here; they are only used for external SSH access.
$ cat docker-compose.yml
version: "3.1"
services:
  node1:
    image: openfoam/openfoam8-paraview56:8
    container_name: openfoam1
    volumes:
      - ./data/test:/home/openfoam/test
    ports:
      - "2201:22"
    tty: true
  node2:
    image: openfoam/openfoam8-paraview56:8
    container_name: openfoam2
    volumes:
      - ./data/test:/home/openfoam/test
    ports:
      - "2202:2022"
    tty: true

View container IP

docker inspect node1
.....
.....
            "Networks": {
                "bridge": {
                    "IPAMConfig": null,
                    "Links": null,
                    "Aliases": null,
                    "NetworkID": "d36f5cfb62d2522e518e70609157701691b4b2f29c2231879d6952b8eb5d5695",
                    "EndpointID": "0124939b444476d9eb2a8a3f71056fc6e6384456ec713f67fa794ea7d1736091",
                    "Gateway": "172.17.0.1",
                    "IPAddress": "172.17.0.2",
                    "IPPrefixLen": 16,
                    "IPv6Gateway": "",
                    "GlobalIPv6Address": "",
                    "GlobalIPv6PrefixLen": 0,
                    "MacAddress": "02:42:ac:11:00:02",
                    "DriverOpts": null
                }
            }
        }
    }
]

Networks >> IPAddress

Enter the container

docker exec -it node1 /bin/bash

Modify the containers' hosts files

Add the IP addresses and hostnames of node1 and node2 to /etc/hosts in each container.
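
For example, the entries might look like this (the first address is the one reported by docker inspect above; the second is hypothetical and must be replaced with the real address of the other container):

# /etc/hosts in both containers
172.17.0.2   node1
172.17.0.3   node2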

Create a new test folder and copy '/opt/openfoam8/tutorials/incompressible/icoFoam/cavity/cavity' into it (the hosts entries above must already be in place).

Create machines file

$ cat machines 
node1
node2

Create a system/decomposeParDict file

root@164590017eae:~/test# cat system/decomposeParDict 
/*--------------------------------*- C++ -*----------------------------------*\
  =========                 |
  \\      /  F ield         | OpenFOAM: The Open Source CFD Toolbox
   \\    /   O peration     | Website:  https://openfoam.org
    \\  /    A nd           | Version:  8
     \\/     M anipulation  |
\*---------------------------------------------------------------------------*/
FoamFile
{
    version     2.0;
    format      ascii;
    class       dictionary;
    location    "system";
    object      decomposeParDict;
}
// * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * //

numberOfSubdomains 2;

method          scotch;

simpleCoeffs
{
    n               (2 1 1);
    delta           0.001;
}

hierarchicalCoeffs
{
    n               (1 1 1);
    delta           0.001;
    order           xyz;
}

manualCoeffs
{
    dataFile        "";
}

distributed     no;

roots           ( );

// ************************************************************************* //

Create the cleanup script Allclean

$ cat Allclean
#!/bin/sh
cd ${0%/*} || exit 1    # Run from this directory

# Source tutorial clean functions
. $WM_PROJECT_DIR/bin/tools/CleanFunctions

cleanCase
rm -rf constant/polyMesh 
#------------------------------------------------------------------------------

Test Open MPI

root@164590017eae:~/test# mpirun --allow-run-as-root \
>   --hostfile  machines \
>   --display-map -n 2 -npernode 1 \
>   sh -c 'echo $(hostname):hello'
 Data for JOB [62620,1] offset 0

 ========================   JOB MAP   ========================

 Data for node: 164590017eae    Num slots: 1    Max slots: 0    Num procs: 1
        Process OMPI jobid: [62620,1] App: 0 Process rank: 0 Bound: socket 0[core 0[hwt 0]]:[B][.]

 Data for node: 192.168.48.3    Num slots: 1    Max slots: 0    Num procs: 1
        Process OMPI jobid: [62620,1] App: 0 Process rank: 1 Bound: socket 0[core 0[hwt 0]]:[B][.]

 =============================================================
164590017eae:hello
00edfbbdc56e:hello
root@164590017eae:~/test# mpirun --allow-run-as-root -np 2 -hostfile machines uptime 
 03:38:00 up 1 day,  2:49,  0 users,  load average: 0.06, 0.04, 0.00
 03:38:00 up 1 day,  2:49,  0 users,  load average: 0.06, 0.04, 0.00

Check the icoFoam location

root@164590017eae:~/test# which icoFoam
/opt/openfoam8/platforms/linux64GccDPInt32Opt/bin/icoFoam

If everything above works, start the test.

Enter the test folder and execute the following commands.

Meshing

blockMesh

Domain decomposition

decomposePar

Run single-node parallel computing

mpirun -np 2 icoFoam -parallel > log

If single-node parallel computing works without problems, run the cleanup script

./Allclean

Run multi-node, multi-core computing (cluster parallel computing)

mpirun --hostfile machines -np 2 icoFoam -parallel > log

Here, -np 2 must match numberOfSubdomains 2 in the system/decomposeParDict file; the value specifies how many processes (CPU cores) are used for the calculation.

Because the runtime gives priority to the local CPUs, if this value does not exceed the number of local cores, the CPUs of the other node are not used and the calculation runs normally.

Therefore, to reproduce the error, this value must be greater than the number of CPU cores of the local machine.
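
To check where the ranks are actually placed before involving the solver, the --display-map option used in the earlier test can be combined with the same hostfile (a minimal sketch; add --allow-run-as-root when running as root, as above, and the hostname echo stands in for the real executable):

mpirun --hostfile machines -np 2 --display-map sh -c 'echo $(hostname)'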

Following the steps above, the problem can be reproduced quickly.

I just want to use OpenFOAM for cluster parallel computing, but I cannot solve this problem and I do not know where it comes from. I have spent a month on it, testing in container, virtual machine, and physical machine environments, as root and as a non-root user. The result is always the same: used separately, OpenFOAM and Open MPI both work normally, but used together for multi-node parallel computing, an error is reported saying the executable on the other node cannot be found.

I need help, thank you very much!

ggouaillardet commented 3 years ago

Can you ssh node2 and then /opt/openfoam8/platforms/linux64GccDPInt32Opt/bin/icoFoam ? Is this file on a parallel filesystem? You also need to double check the permissions (e.g. executable) and the dependencies (ldd /opt/openfoam8/platforms/linux64GccDPInt32Opt/bin/icoFoam)

qingfengfenga commented 3 years ago

> Can you ssh node2 and then /opt/openfoam8/platforms/linux64GccDPInt32Opt/bin/icoFoam ? Is this file on a parallel filesystem? You also need to double check the permissions (e.g. executable) and the dependencies (ldd /opt/openfoam8/platforms/linux64GccDPInt32Opt/bin/icoFoam)

I ran ssh node2 and then /opt/openfoam8/platforms/linux64GccDPInt32Opt/bin/icoFoam. It executed successfully and produced output; of course, it only used a single core.

Currently node1 and node2 only share the test files through NFS. OpenFOAM and Open MPI are not shared, but the users and path structures are completely identical on both nodes. Do I need to share the OpenFOAM and Open MPI environments as well?

node1$ ssh node2
Welcome to Ubuntu 18.04.5 LTS (GNU/Linux 5.4.0-42-generic x86_64)

 * Documentation:  https://help.ubuntu.com
 * Management:     https://landscape.canonical.com
 * Support:        https://ubuntu.com/advantage

 * Canonical Livepatch is available for installation.
   - Reduce system reboots and improve kernel security. Activate at:
     https://ubuntu.com/livepatch

7 updates can be applied immediately.
1 of these updates is a standard security update.
To see these additional updates run: apt list --upgradable

New release '20.04.2 LTS' available.
Run 'do-release-upgrade' to upgrade to it.

Your Hardware Enablement Stack (HWE) is supported until April 2023.
*** System restart required ***
Last login: Wed Jun  9 13:35:33 2021 from 192.168.91.254
$ cd /tmp/penn/cavity/
/tmp/penn/cavity$ ls -l /opt/openfoam8/platforms/linux64GccDPInt32Opt/bin/icoFoam
-rwxr-xr-x 1 root root 736536 Mar 16 20:55 /opt/openfoam8/platforms/linux64GccDPInt32Opt/bin/icoFoam

/tmp/penn/cavity# /opt/openfoam8/platforms/linux64GccDPInt32Opt/bin/icoFoam
/*---------------------------------------------------------------------------*\
  =========                 |
  \\      /  F ield         | OpenFOAM: The Open Source CFD Toolbox
   \\    /   O peration     | Website:  https://openfoam.org
    \\  /    A nd           | Version:  8
     \\/     M anipulation  |
\*---------------------------------------------------------------------------*/
Build  : 8-1c9b5879390b
Exec   : /opt/openfoam8/platforms/linux64GccDPInt32Opt/bin/icoFoam
Date   : Jun 09 2021
Time   : 13:37:39
Host   : "dt-PowerEdge-T330"
PID    : 1679
I/O    : uncollated
Case   : /tmp/penn/cavity
nProcs : 1
sigFpe : Enabling floating point exception trapping (FOAM_SIGFPE).
fileModificationChecking : Monitoring run-time modified files using timeStampMaster (fileModificationSkew 10)
allowSystemOperations : Allowing user-supplied system call operations

// * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * //
Create time

Create mesh for time = 0

Reading transportProperties

Reading field p

Reading field U

Reading/calculating face flux field phi

Starting time loop

Time = 0.005

Courant Number mean: 0 max: 0
smoothSolver:  Solving for Ux, Initial residual = 1, Final residual = 8.90511e-06, No Iterations 19
smoothSolver:  Solving for Uy, Initial residual = 0, Final residual = 0, No Iterations 0
DICPCG:  Solving for p, Initial residual = 1, Final residual = 0.0492854, No Iterations 12
time step continuity errors : sum local = 0.000466513, global = -1.79995e-19, cumulative = -1.79995e-19
DICPCG:  Solving for p, Initial residual = 0.590864, Final residual = 2.65225e-07, No Iterations 35
time step continuity errors : sum local = 2.74685e-09, global = -2.6445e-19, cumulative = -4.44444e-19
ExecutionTime = 0.01 s  ClockTime = 0 s
......
......

Time = 0.5

Courant Number mean: 0.222158 max: 0.852134
smoothSolver:  Solving for Ux, Initial residual = 2.3091e-07, Final residual = 2.3091e-07, No Iterations 0
smoothSolver:  Solving for Uy, Initial residual = 5.0684e-07, Final residual = 5.0684e-07, No Iterations 0
DICPCG:  Solving for p, Initial residual = 8.63844e-07, Final residual = 8.63844e-07, No Iterations 0
time step continuity errors : sum local = 8.8828e-09, global = 5.49744e-19, cumulative = 3.84189e-19
DICPCG:  Solving for p, Initial residual = 9.59103e-07, Final residual = 9.59103e-07, No Iterations 0
time step continuity errors : sum local = 9.66354e-09, global = -1.28048e-19, cumulative = 2.56141e-19
ExecutionTime = 0.16 s  ClockTime = 1 s

End

node2$ ls
0    0.2  0.4  Allclean  machines    processor1  processor3  processor5  processor7
0.1  0.3  0.5  constant  processor0  processor2  processor4  processor6  system

Dependencies

node1

# ldd /opt/openfoam8/platforms/linux64GccDPInt32Opt/bin/icoFoam
    linux-vdso.so.1 (0x00007ffec3d70000)
    /lib/$LIB/liblsp.so => /lib/lib/x86_64-linux-gnu/liblsp.so (0x00007faad5520000)
    libfiniteVolume.so => /opt/openfoam8/platforms/linux64GccDPInt32Opt/lib/libfiniteVolume.so (0x00007faad36c4000)
    libmeshTools.so => /opt/openfoam8/platforms/linux64GccDPInt32Opt/lib/libmeshTools.so (0x00007faad2fd6000)
    libOpenFOAM.so => /opt/openfoam8/platforms/linux64GccDPInt32Opt/lib/libOpenFOAM.so (0x00007faad2421000)
    libdl.so.2 => /lib/x86_64-linux-gnu/libdl.so.2 (0x00007faad221d000)
    libstdc++.so.6 => /usr/lib/x86_64-linux-gnu/libstdc++.so.6 (0x00007faad1e94000)
    libm.so.6 => /lib/x86_64-linux-gnu/libm.so.6 (0x00007faad1af6000)
    libgcc_s.so.1 => /lib/x86_64-linux-gnu/libgcc_s.so.1 (0x00007faad18de000)
    libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007faad14ed000)
    libPstream.so => /opt/openfoam8/platforms/linux64GccDPInt32Opt/lib/openmpi-system/libPstream.so (0x00007faad12dd000)
    libtriSurface.so => /opt/openfoam8/platforms/linux64GccDPInt32Opt/lib/libtriSurface.so (0x00007faad103a000)
    libsurfMesh.so => /opt/openfoam8/platforms/linux64GccDPInt32Opt/lib/libsurfMesh.so (0x00007faad0d31000)
    libfileFormats.so => /opt/openfoam8/platforms/linux64GccDPInt32Opt/lib/libfileFormats.so (0x00007faad0a8f000)
    libz.so.1 => /lib/x86_64-linux-gnu/libz.so.1 (0x00007faad0872000)
    libpthread.so.0 => /lib/x86_64-linux-gnu/libpthread.so.0 (0x00007faad0653000)
    /lib64/ld-linux-x86-64.so.2 (0x00007faad59b4000)
    libmpi.so.20 => /usr/lib/x86_64-linux-gnu/libmpi.so.20 (0x00007faad0361000)
    libopen-rte.so.20 => /usr/lib/x86_64-linux-gnu/libopen-rte.so.20 (0x00007faad00d9000)
    libopen-pal.so.20 => /usr/lib/x86_64-linux-gnu/libopen-pal.so.20 (0x00007faacfe27000)
    librt.so.1 => /lib/x86_64-linux-gnu/librt.so.1 (0x00007faacfc1f000)
    libhwloc.so.5 => /usr/lib/x86_64-linux-gnu/libhwloc.so.5 (0x00007faacf9e2000)
    libutil.so.1 => /lib/x86_64-linux-gnu/libutil.so.1 (0x00007faacf7df000)
    libnuma.so.1 => /usr/lib/x86_64-linux-gnu/libnuma.so.1 (0x00007faacf5d4000)
    libltdl.so.7 => /usr/lib/x86_64-linux-gnu/libltdl.so.7 (0x00007faacf3ca000)

node2

# ldd /opt/openfoam8/platforms/linux64GccDPInt32Opt/bin/icoFoam
    linux-vdso.so.1 (0x00007ffd54de1000)
    libfiniteVolume.so => /opt/openfoam8/platforms/linux64GccDPInt32Opt/lib/libfiniteVolume.so (0x00007fbf24aea000)
    libmeshTools.so => /opt/openfoam8/platforms/linux64GccDPInt32Opt/lib/libmeshTools.so (0x00007fbf243fc000)
    libOpenFOAM.so => /opt/openfoam8/platforms/linux64GccDPInt32Opt/lib/libOpenFOAM.so (0x00007fbf23847000)
    libdl.so.2 => /lib/x86_64-linux-gnu/libdl.so.2 (0x00007fbf23643000)
    libstdc++.so.6 => /usr/lib/x86_64-linux-gnu/libstdc++.so.6 (0x00007fbf232ba000)
    libm.so.6 => /lib/x86_64-linux-gnu/libm.so.6 (0x00007fbf22f1c000)
    libgcc_s.so.1 => /lib/x86_64-linux-gnu/libgcc_s.so.1 (0x00007fbf22d04000)
    libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007fbf22913000)
    libPstream.so => /opt/openfoam8/platforms/linux64GccDPInt32Opt/lib/openmpi-system/libPstream.so (0x00007fbf22703000)
    libtriSurface.so => /opt/openfoam8/platforms/linux64GccDPInt32Opt/lib/libtriSurface.so (0x00007fbf22460000)
    libsurfMesh.so => /opt/openfoam8/platforms/linux64GccDPInt32Opt/lib/libsurfMesh.so (0x00007fbf22157000)
    libfileFormats.so => /opt/openfoam8/platforms/linux64GccDPInt32Opt/lib/libfileFormats.so (0x00007fbf21eb5000)
    libz.so.1 => /lib/x86_64-linux-gnu/libz.so.1 (0x00007fbf21c98000)
    libpthread.so.0 => /lib/x86_64-linux-gnu/libpthread.so.0 (0x00007fbf21a79000)
    /lib64/ld-linux-x86-64.so.2 (0x00007fbf26bd5000)
    libmpi.so.20 => /usr/lib/x86_64-linux-gnu/libmpi.so.20 (0x00007fbf21787000)
    libopen-rte.so.20 => /usr/lib/x86_64-linux-gnu/libopen-rte.so.20 (0x00007fbf214ff000)
    libopen-pal.so.20 => /usr/lib/x86_64-linux-gnu/libopen-pal.so.20 (0x00007fbf2124d000)
    librt.so.1 => /lib/x86_64-linux-gnu/librt.so.1 (0x00007fbf21045000)
    libhwloc.so.5 => /usr/lib/x86_64-linux-gnu/libhwloc.so.5 (0x00007fbf20e08000)
    libutil.so.1 => /lib/x86_64-linux-gnu/libutil.so.1 (0x00007fbf20c05000)
    libnuma.so.1 => /usr/lib/x86_64-linux-gnu/libnuma.so.1 (0x00007fbf209fa000)
    libltdl.so.7 => /usr/lib/x86_64-linux-gnu/libltdl.so.7 (0x00007fbf207f0000)

There is also a URI test result (mpirun -report-uri) that should be added.

node1

$ mpirun --allow-run-as-root -report-uri - --hostfile machines -np 8 icoFoam -parallel
909639680.0;usock;tcp://192.168.90.28,172.17.0.1:46535

node2

$ mpirun --allow-run-as-root -report-uri - --hostfile machines -np 8 icoFoam -parallel
4222615552.0;usock;tcp://192.168.90.41:39927

Is Open MPI behaving normally so far? For Open MPI + OpenFOAM cluster parallel computing, I cannot find any detailed documentation to guide a correct configuration, apart from the official documents, which are very brief.

ggouaillardet commented 3 years ago

You do not have to put everything on NFS. But you will run into some issues if you try to use MPI-IO on a local filesystem. As long as your working directory and the input/output files are on a shared filesystem, you should be fine.

I noted that the liblsp.so dependency is only present on node1 (I do not think that should be an issue, though).

From the shared filesystem, have you tried to

mpirun --allow-run-as-root --hostfile machines -np 8  /opt/openfoam8/platforms/linux64GccDPInt32Opt/bin/icoFoam -parallel
qingfengfenga commented 3 years ago

> You do not have to put everything on NFS. But you will run into some issues if you try to use MPI-IO on a local filesystem. As long as your working directory and the input/output files are on a shared filesystem, you should be fine.
>
> I noted that the liblsp.so dependency is only present on node1 (I do not think that should be an issue, though).
>
> From the shared filesystem, have you tried to
>
> mpirun --allow-run-as-root --hostfile machines -np 8  /opt/openfoam8/platforms/linux64GccDPInt32Opt/bin/icoFoam -parallel

As you can see, four of the processes on the other node still fail to run icoFoam. Are there any Open MPI settings related to this that control how the executable is located on the other nodes?

$ mpirun --allow-run-as-root --hostfile machines -np 8  /opt/openfoam8/platforms/linux64GccDPInt32Opt/bin/icoFoam -parallel
/opt/openfoam8/platforms/linux64GccDPInt32Opt/bin/icoFoam: error while loading shared libraries: libfiniteVolume.so: cannot open shared object file: No such file or directory
/opt/openfoam8/platforms/linux64GccDPInt32Opt/bin/icoFoam: error while loading shared libraries: libfiniteVolume.so: cannot open shared object file: No such file or directory
/opt/openfoam8/platforms/linux64GccDPInt32Opt/bin/icoFoam: error while loading shared libraries: libfiniteVolume.so: cannot open shared object file: No such file or directory
/opt/openfoam8/platforms/linux64GccDPInt32Opt/bin/icoFoam: error while loading shared libraries: libfiniteVolume.so: cannot open shared object file: No such file or directory
-------------------------------------------------------
Primary job  terminated normally, but 1 process returned
a non-zero exit code.. Per user-direction, the job has been aborted.
-------------------------------------------------------
[digitwin-System-Product-Name:16888] [[14219,0],0] usock_peer_send_blocking: send() to socket 50 failed: Broken pipe (32)
[digitwin-System-Product-Name:16888] [[14219,0],0] ORTE_ERROR_LOG: Unreachable in file oob_usock_connection.c at line 316
[digitwin-System-Product-Name:16888] [[14219,0],0]-[[14219,1],5] usock_peer_accept: usock_peer_send_connect_ack failed
[digitwin-System-Product-Name:16888] [[14219,0],0] usock_peer_send_blocking: send() to socket 51 failed: Broken pipe (32)
[digitwin-System-Product-Name:16888] [[14219,0],0] ORTE_ERROR_LOG: Unreachable in file oob_usock_connection.c at line 316
[digitwin-System-Product-Name:16888] [[14219,0],0]-[[14219,1],7] usock_peer_accept: usock_peer_send_connect_ack failed
--------------------------------------------------------------------------
mpirun detected that one or more processes exited with non-zero status, thus causing
the job to be terminated. The first process to do so was:

  Process name: [[14219,1],0]
  Exit code:    127
--------------------------------------------------------------------------
$ ls /opt/openfoam8/platforms/linux64GccDPInt32Opt/bin/icoFoam 
/opt/openfoam8/platforms/linux64GccDPInt32Opt/bin/icoFoam
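
A quick way to confirm the same path also exists on node2, run from node1:

node1$ ssh node2 ls /opt/openfoam8/platforms/linux64GccDPInt32Opt/bin/icoFoam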
ggouaillardet commented 3 years ago

I could not get much out of these kanjis ...

Anyway, this is now a different issue: the binaries are found on all the nodes, but some of their dependencies are not. I guess you have /opt/openfoam8/platforms/linux64GccDPInt32Opt/lib in your LD_LIBRARY_PATH and it is not exported to node2.

What if you

mpirun --allow-run-as-root --hostfile machines -np 8 -x LD_LIBRARY_PATH /opt/openfoam8/platforms/linux64GccDPInt32Opt/bin/icoFoam -parallel
qingfengfenga commented 3 years ago

Sorry, I pasted the wrong text, I have made an update.

I could not get much out of these kanjis ...

Anyway, this is now a different issue: the binaries are found on all the nodes, but some of their dependencies are not. I guess you have /opt/openfoam8/platforms/linux64GccDPInt32Opt/lib in your LD_LIBRARY_PATH and it is not exported to node2.

What if you

mpirun --allow-run-as-root --hostfile machines -np 8 -x LD_LIBRARY_PATH /opt/openfoam8/platforms/linux64GccDPInt32Opt/bin/icoFoam -parallel

node1

$ echo $LD_LIBRARY_PATH
/opt/ThirdParty-8/platforms/linux64Gcc/gperftools-svn/lib:/opt/openfoam8/platforms/linux64GccDPInt32Opt/lib/paraview-5.6:/opt/paraviewopenfoam56/lib:/opt/openfoam8/platforms/linux64GccDPInt32Opt/lib/openmpi-system:/opt/ThirdParty-8/platforms/linux64GccDPInt32/lib/openmpi-system:/usr/lib/x86_64-linux-gnu/openmpi/lib:/root/OpenFOAM/root-8/platforms/linux64GccDPInt32Opt/lib:/opt/site/8/platforms/linux64GccDPInt32Opt/lib:/opt/openfoam8/platforms/linux64GccDPInt32Opt/lib:/opt/ThirdParty-8/platforms/linux64GccDPInt32/lib:/opt/openfoam8/platforms/linux64GccDPInt32Opt/lib/dummy

node2

$ echo $LD_LIBRARY_PATH
/opt/ThirdParty-8/platforms/linux64Gcc/gperftools-svn/lib:/opt/openfoam8/platforms/linux64GccDPInt32Opt/lib/paraview-5.6:/opt/paraviewopenfoam56/lib:/opt/openfoam8/platforms/linux64GccDPInt32Opt/lib/openmpi-system:/opt/ThirdParty-8/platforms/linux64GccDPInt32/lib/openmpi-system:/usr/lib/x86_64-linux-gnu/openmpi/lib:/root/OpenFOAM/root-8/platforms/linux64GccDPInt32Opt/lib:/opt/site/8/platforms/linux64GccDPInt32Opt/lib:/opt/openfoam8/platforms/linux64GccDPInt32Opt/lib:/opt/ThirdParty-8/platforms/linux64GccDPInt32/lib:/opt/openfoam8/platforms/linux64GccDPInt32Opt/lib/dummy

LD_LIBRARY_PATH seems to be configured correctly, but it still fails.

This is the result of running it:

$ mpirun --allow-run-as-root --hostfile machines -np 8 -x LD_LIBRARY_PATH /opt/openfoam8/platforms/linux64GccDPInt32Opt/bin/icoFoam -parallel
--> FOAM FATAL ERROR in Foam::findEtcFiles() : could not find mandatory file
    'controlDict'

--> FOAM FATAL ERROR in Foam::findEtcFiles() : could not find mandatory file
    'controlDict'

-------------------------------------------------------
Primary job  terminated normally, but 1 process returned
a non-zero exit code.. Per user-direction, the job has been aborted.
-------------------------------------------------------
--> FOAM FATAL ERROR in Foam::findEtcFiles() : could not find mandatory file
    'controlDict'

--> FOAM FATAL ERROR in Foam::findEtcFiles() : could not find mandatory file
    'controlDict'

--------------------------------------------------------------------------
mpirun detected that one or more processes exited with non-zero status, thus causing
the job to be terminated. The first process to do so was:

  Process name: [[13775,1],0]
  Exit code:    1
--------------------------------------------------------------------------
ggouaillardet commented 3 years ago

LD_LIBRARY_PATH is correctly set when you ssh into node2 interactively. If you want to put yourself in Open MPI's shoes, you can

ssh node2  ldd /opt/openfoam8/platforms/linux64GccDPInt32Opt/bin/icoFoam

and see how it goes.

Anyway, as far as Open MPI is concerned, the problem is now fixed.

The remaining issue is specific to OpenFOAM (controlDict not found) and it is up to you to fix it.

A few questions you must ask yourself (and answer ...): Is it on the shared filesystem? Did you forget to copy it? Do you need to propagate more environment variables?
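
For the last point, additional variables can be exported to the remote processes with mpirun's -x flag. This is a sketch only; which OpenFOAM variables are actually needed (WM_PROJECT_DIR is used here only as an example) depends on your installation:

mpirun --allow-run-as-root --hostfile machines -np 8 -x LD_LIBRARY_PATH -x PATH -x WM_PROJECT_DIR /opt/openfoam8/platforms/linux64GccDPInt32Opt/bin/icoFoam -parallel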

qingfengfenga commented 3 years ago

LD_LIBRARY_PATH is correctly set when you ssh into node2 interactively. If you want to put yourself in Open MPI's shoes, you can

ssh node2  ldd /opt/openfoam8/platforms/linux64GccDPInt32Opt/bin/icoFoam

and see how it goes.

Anyway, as far as Open MPI is concerned, the problem is now fixed.

The remaining issue is specific to OpenFOAM (controlDict not found) and it is up to you to fix it.

A few questions you must ask yourself (and answer ...): Is it on the shared filesystem? Did you forget to copy it? Do you need to propagate more environment variables?

As you can see, the file exists. I may need to check the OpenFOAM environment configuration to solve the problem of controlDict not being found.

node1$ ldd /opt/openfoam8/platforms/linux64GccDPInt32Opt/bin/icoFoam
    linux-vdso.so.1 (0x00007fff34b9c000)
    /lib/$LIB/liblsp.so => /lib/lib/x86_64-linux-gnu/liblsp.so (0x00007fc83b64d000)
    libfiniteVolume.so => /opt/openfoam8/platforms/linux64GccDPInt32Opt/lib/libfiniteVolume.so (0x00007fc8397f1000)
    libmeshTools.so => /opt/openfoam8/platforms/linux64GccDPInt32Opt/lib/libmeshTools.so (0x00007fc839103000)
    libOpenFOAM.so => /opt/openfoam8/platforms/linux64GccDPInt32Opt/lib/libOpenFOAM.so (0x00007fc83854e000)
    libdl.so.2 => /lib/x86_64-linux-gnu/libdl.so.2 (0x00007fc83834a000)
    libstdc++.so.6 => /usr/lib/x86_64-linux-gnu/libstdc++.so.6 (0x00007fc837fc1000)
    libm.so.6 => /lib/x86_64-linux-gnu/libm.so.6 (0x00007fc837c23000)
    libgcc_s.so.1 => /lib/x86_64-linux-gnu/libgcc_s.so.1 (0x00007fc837a0b000)
    libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007fc83761a000)
    libPstream.so => /opt/openfoam8/platforms/linux64GccDPInt32Opt/lib/openmpi-system/libPstream.so (0x00007fc83740a000)
    libtriSurface.so => /opt/openfoam8/platforms/linux64GccDPInt32Opt/lib/libtriSurface.so (0x00007fc837167000)
    libsurfMesh.so => /opt/openfoam8/platforms/linux64GccDPInt32Opt/lib/libsurfMesh.so (0x00007fc836e5e000)
    libfileFormats.so => /opt/openfoam8/platforms/linux64GccDPInt32Opt/lib/libfileFormats.so (0x00007fc836bbc000)
    libz.so.1 => /lib/x86_64-linux-gnu/libz.so.1 (0x00007fc83699f000)
    libpthread.so.0 => /lib/x86_64-linux-gnu/libpthread.so.0 (0x00007fc836780000)
    /lib64/ld-linux-x86-64.so.2 (0x00007fc83bae1000)
    libmpi.so.20 => /usr/lib/x86_64-linux-gnu/libmpi.so.20 (0x00007fc83648e000)
    libopen-rte.so.20 => /usr/lib/x86_64-linux-gnu/libopen-rte.so.20 (0x00007fc836206000)
    libopen-pal.so.20 => /usr/lib/x86_64-linux-gnu/libopen-pal.so.20 (0x00007fc835f54000)
    librt.so.1 => /lib/x86_64-linux-gnu/librt.so.1 (0x00007fc835d4c000)
    libhwloc.so.5 => /usr/lib/x86_64-linux-gnu/libhwloc.so.5 (0x00007fc835b0f000)
    libutil.so.1 => /lib/x86_64-linux-gnu/libutil.so.1 (0x00007fc83590c000)
    libnuma.so.1 => /usr/lib/x86_64-linux-gnu/libnuma.so.1 (0x00007fc835701000)
    libltdl.so.7 => /usr/lib/x86_64-linux-gnu/libltdl.so.7 (0x00007fc8354f7000)

node1$  ssh node2  ldd /opt/openfoam8/platforms/linux64GccDPInt32Opt/bin/icoFoam
    linux-vdso.so.1 (0x00007ffc45efc000)
    libfiniteVolume.so => not found
    libmeshTools.so => not found
    libOpenFOAM.so => not found
    libdl.so.2 => /lib/x86_64-linux-gnu/libdl.so.2 (0x00007f098a107000)
    libstdc++.so.6 => /usr/lib/x86_64-linux-gnu/libstdc++.so.6 (0x00007f0989d7e000)
    libm.so.6 => /lib/x86_64-linux-gnu/libm.so.6 (0x00007f09899e0000)
    libgcc_s.so.1 => /lib/x86_64-linux-gnu/libgcc_s.so.1 (0x00007f09897c8000)
    libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007f09893d7000)
    /lib64/ld-linux-x86-64.so.2 (0x00007f098a59a000)
    libPstream.so => not found

$ ls system/controlDict 
system/controlDict
$ ls -l system/controlDict 
-rw-r--r-- 1 root root 1045 6月   8 18:14 system/controlDict

Thank you very much. I will keep troubleshooting, and once the problem is solved I will publish a complete document and troubleshooting guide for anyone who needs it. Thanks again for your answer!

jsquyres commented 3 years ago

FWIW: @ggouaillardet hit the nail on the head. You want to check LD_LIBRARY_PATH for non-interactive logins:

ssh node2 ldd /opt/openfoam8/platforms/linux64GccDPInt32Opt/bin/icoFoam

Sometimes .bashrc (or other shell startup files) do different things between interactive logins and non-interactive logins. For example:

# This command:
node1$ ssh node2 ldd /opt/openfoam8/platforms/linux64GccDPInt32Opt/bin/icoFoam

# May give different results than this:
node1$ ssh node2
node2$ ldd /opt/openfoam8/platforms/linux64GccDPInt32Opt/bin/icoFoam
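
One common cause is that the OpenFOAM environment is sourced after the early return for non-interactive shells in ~/.bashrc. A sketch, assuming a stock Ubuntu .bashrc and the standard /opt/openfoam8/etc/bashrc location; moving the source line above the guard makes the environment visible to non-interactive ssh sessions (and therefore to mpirun) as well:

# ~/.bashrc on node2 (sketch)
# Source the OpenFOAM environment before the interactive-shell guard,
# so non-interactive ssh sessions also pick it up.
source /opt/openfoam8/etc/bashrc

# Stock Ubuntu guard: stop here for non-interactive shells.
case $- in
    *i*) ;;
      *) return;;
esac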
qingfengfenga commented 3 years ago

The problem has been solved. When running the job with Open MPI, a network-interface parameter needs to be added to the mpirun command:

mpirun --mca btl_tcp_if_include 192.168.x.x/24
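
For reference, a full command then looks something like the sketch below. The 192.168.90.0/24 subnet is the one used by the two nodes in this issue; adding oob_tcp_if_include as well is an assumption, intended to keep the runtime's out-of-band traffic off the docker0 interface:

mpirun --allow-run-as-root --hostfile machines -np 8 -x LD_LIBRARY_PATH --mca btl_tcp_if_include 192.168.90.0/24 --mca oob_tcp_if_include 192.168.90.0/24 /opt/openfoam8/platforms/linux64GccDPInt32Opt/bin/icoFoam -parallel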
javierpvm commented 2 years ago

I want to model a hydraulic channel on Ubuntu 20.04 LTS through OpenFOAM, and from what I can see I have the same problem. I would like to know exactly which network parameters you mean, qingfengfenga, and exactly where to configure them. I have 1 master and 2 nodes configured. I use a private IP 10.17.38.30, subnet mask 255.255.255.0, default gateway 10.17.38.254 and DNS 10.16.30.2, with the same settings for the two physical nodes at IPs 10.17.38.31 and 10.17.38.32. In my exports file I have the following: /home/cluster/OpenFoam 10.17.38.0/24(rw, no_sbtree_chech,async,no_root_squash). If you could help me confirm whether these parameters are correct I would be infinitely grateful, or if you could point me to a guide for building a cluster on Ubuntu 20.04 LTS or higher. I have tried many guides and I don't know what to do anymore, but I don't want to give up either. Thanks.

qingfengfenga commented 2 years ago

@javierpvm

Don't give up; you are very close. This video may help you:

https://youtu.be/tDZ4gAKSG6c
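
One other thing worth double-checking in the exports line you posted: NFS export options are comma-separated with no spaces, and the option is spelled no_subtree_check. A sketch of the usual form, using the path and subnet from your comment:

/home/cluster/OpenFoam 10.17.38.0/24(rw,no_subtree_check,async,no_root_squash)

followed by exportfs -ra on the NFS server to re-export.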