Closed. qingfengfenga closed this issue 3 years ago.
Can you ssh node2 and then run /opt/openfoam8/platforms/linux64GccDPInt32Opt/bin/icoFoam? Is this file on a parallel filesystem? You also need to double-check the permissions (e.g. executable) and the dependencies:
ldd /opt/openfoam8/platforms/linux64GccDPInt32Opt/bin/icoFoam
I executed ssh node2 and then /opt/openfoam8/platforms/linux64GccDPInt32Opt/bin/icoFoam. It ran successfully and produced results, though of course only on a single core.
Currently, node1 and node2 only share the test files through NFS. OpenFOAM and Open MPI are not shared, but the users and path structure are completely consistent. Do I need to share the OpenFOAM and Open MPI environments?
node1$ ssh node2
Welcome to Ubuntu 18.04.5 LTS (GNU/Linux 5.4.0-42-generic x86_64)
* Documentation: https://help.ubuntu.com
* Management: https://landscape.canonical.com
* Support: https://ubuntu.com/advantage
* Canonical Livepatch is available for installation.
- Reduce system reboots and improve kernel security. Activate at:
https://ubuntu.com/livepatch
7 updates can be applied immediately.
1 of these updates is a standard security update.
To see these additional updates run: apt list --upgradable
New release '20.04.2 LTS' available.
Run 'do-release-upgrade' to upgrade to it.
Your Hardware Enablement Stack (HWE) is supported until April 2023.
*** System restart required ***
Last login: Wed Jun 9 13:35:33 2021 from 192.168.91.254
$ cd /tmp/penn/cavity/
/tmp/penn/cavity$ ls -l /opt/openfoam8/platforms/linux64GccDPInt32Opt/bin/icoFoam
-rwxr-xr-x 1 root root 736536 Mar 16 20:55 /opt/openfoam8/platforms/linux64GccDPInt32Opt/bin/icoFoam
/tmp/penn/cavity# /opt/openfoam8/platforms/linux64GccDPInt32Opt/bin/icoFoam
/*---------------------------------------------------------------------------*\
========= |
\\ / F ield | OpenFOAM: The Open Source CFD Toolbox
\\ / O peration | Website: https://openfoam.org
\\ / A nd | Version: 8
\\/ M anipulation |
\*---------------------------------------------------------------------------*/
Build : 8-1c9b5879390b
Exec : /opt/openfoam8/platforms/linux64GccDPInt32Opt/bin/icoFoam
Date : Jun 09 2021
Time : 13:37:39
Host : "dt-PowerEdge-T330"
PID : 1679
I/O : uncollated
Case : /tmp/penn/cavity
nProcs : 1
sigFpe : Enabling floating point exception trapping (FOAM_SIGFPE).
fileModificationChecking : Monitoring run-time modified files using timeStampMaster (fileModificationSkew 10)
allowSystemOperations : Allowing user-supplied system call operations
// * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * //
Create time
Create mesh for time = 0
Reading transportProperties
Reading field p
Reading field U
Reading/calculating face flux field phi
Starting time loop
Time = 0.005
Courant Number mean: 0 max: 0
smoothSolver: Solving for Ux, Initial residual = 1, Final residual = 8.90511e-06, No Iterations 19
smoothSolver: Solving for Uy, Initial residual = 0, Final residual = 0, No Iterations 0
DICPCG: Solving for p, Initial residual = 1, Final residual = 0.0492854, No Iterations 12
time step continuity errors : sum local = 0.000466513, global = -1.79995e-19, cumulative = -1.79995e-19
DICPCG: Solving for p, Initial residual = 0.590864, Final residual = 2.65225e-07, No Iterations 35
time step continuity errors : sum local = 2.74685e-09, global = -2.6445e-19, cumulative = -4.44444e-19
ExecutionTime = 0.01 s ClockTime = 0 s
......
......
Time = 0.5
Courant Number mean: 0.222158 max: 0.852134
smoothSolver: Solving for Ux, Initial residual = 2.3091e-07, Final residual = 2.3091e-07, No Iterations 0
smoothSolver: Solving for Uy, Initial residual = 5.0684e-07, Final residual = 5.0684e-07, No Iterations 0
DICPCG: Solving for p, Initial residual = 8.63844e-07, Final residual = 8.63844e-07, No Iterations 0
time step continuity errors : sum local = 8.8828e-09, global = 5.49744e-19, cumulative = 3.84189e-19
DICPCG: Solving for p, Initial residual = 9.59103e-07, Final residual = 9.59103e-07, No Iterations 0
time step continuity errors : sum local = 9.66354e-09, global = -1.28048e-19, cumulative = 2.56141e-19
ExecutionTime = 0.16 s ClockTime = 1 s
End
node2$ ls
0 0.2 0.4 Allclean machines processor1 processor3 processor5 processor7
0.1 0.3 0.5 constant processor0 processor2 processor4 processor6 system
node1
# ldd /opt/openfoam8/platforms/linux64GccDPInt32Opt/bin/icoFoam
linux-vdso.so.1 (0x00007ffec3d70000)
/lib/$LIB/liblsp.so => /lib/lib/x86_64-linux-gnu/liblsp.so (0x00007faad5520000)
libfiniteVolume.so => /opt/openfoam8/platforms/linux64GccDPInt32Opt/lib/libfiniteVolume.so (0x00007faad36c4000)
libmeshTools.so => /opt/openfoam8/platforms/linux64GccDPInt32Opt/lib/libmeshTools.so (0x00007faad2fd6000)
libOpenFOAM.so => /opt/openfoam8/platforms/linux64GccDPInt32Opt/lib/libOpenFOAM.so (0x00007faad2421000)
libdl.so.2 => /lib/x86_64-linux-gnu/libdl.so.2 (0x00007faad221d000)
libstdc++.so.6 => /usr/lib/x86_64-linux-gnu/libstdc++.so.6 (0x00007faad1e94000)
libm.so.6 => /lib/x86_64-linux-gnu/libm.so.6 (0x00007faad1af6000)
libgcc_s.so.1 => /lib/x86_64-linux-gnu/libgcc_s.so.1 (0x00007faad18de000)
libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007faad14ed000)
libPstream.so => /opt/openfoam8/platforms/linux64GccDPInt32Opt/lib/openmpi-system/libPstream.so (0x00007faad12dd000)
libtriSurface.so => /opt/openfoam8/platforms/linux64GccDPInt32Opt/lib/libtriSurface.so (0x00007faad103a000)
libsurfMesh.so => /opt/openfoam8/platforms/linux64GccDPInt32Opt/lib/libsurfMesh.so (0x00007faad0d31000)
libfileFormats.so => /opt/openfoam8/platforms/linux64GccDPInt32Opt/lib/libfileFormats.so (0x00007faad0a8f000)
libz.so.1 => /lib/x86_64-linux-gnu/libz.so.1 (0x00007faad0872000)
libpthread.so.0 => /lib/x86_64-linux-gnu/libpthread.so.0 (0x00007faad0653000)
/lib64/ld-linux-x86-64.so.2 (0x00007faad59b4000)
libmpi.so.20 => /usr/lib/x86_64-linux-gnu/libmpi.so.20 (0x00007faad0361000)
libopen-rte.so.20 => /usr/lib/x86_64-linux-gnu/libopen-rte.so.20 (0x00007faad00d9000)
libopen-pal.so.20 => /usr/lib/x86_64-linux-gnu/libopen-pal.so.20 (0x00007faacfe27000)
librt.so.1 => /lib/x86_64-linux-gnu/librt.so.1 (0x00007faacfc1f000)
libhwloc.so.5 => /usr/lib/x86_64-linux-gnu/libhwloc.so.5 (0x00007faacf9e2000)
libutil.so.1 => /lib/x86_64-linux-gnu/libutil.so.1 (0x00007faacf7df000)
libnuma.so.1 => /usr/lib/x86_64-linux-gnu/libnuma.so.1 (0x00007faacf5d4000)
libltdl.so.7 => /usr/lib/x86_64-linux-gnu/libltdl.so.7 (0x00007faacf3ca000)
node2
# ldd /opt/openfoam8/platforms/linux64GccDPInt32Opt/bin/icoFoam
linux-vdso.so.1 (0x00007ffd54de1000)
libfiniteVolume.so => /opt/openfoam8/platforms/linux64GccDPInt32Opt/lib/libfiniteVolume.so (0x00007fbf24aea000)
libmeshTools.so => /opt/openfoam8/platforms/linux64GccDPInt32Opt/lib/libmeshTools.so (0x00007fbf243fc000)
libOpenFOAM.so => /opt/openfoam8/platforms/linux64GccDPInt32Opt/lib/libOpenFOAM.so (0x00007fbf23847000)
libdl.so.2 => /lib/x86_64-linux-gnu/libdl.so.2 (0x00007fbf23643000)
libstdc++.so.6 => /usr/lib/x86_64-linux-gnu/libstdc++.so.6 (0x00007fbf232ba000)
libm.so.6 => /lib/x86_64-linux-gnu/libm.so.6 (0x00007fbf22f1c000)
libgcc_s.so.1 => /lib/x86_64-linux-gnu/libgcc_s.so.1 (0x00007fbf22d04000)
libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007fbf22913000)
libPstream.so => /opt/openfoam8/platforms/linux64GccDPInt32Opt/lib/openmpi-system/libPstream.so (0x00007fbf22703000)
libtriSurface.so => /opt/openfoam8/platforms/linux64GccDPInt32Opt/lib/libtriSurface.so (0x00007fbf22460000)
libsurfMesh.so => /opt/openfoam8/platforms/linux64GccDPInt32Opt/lib/libsurfMesh.so (0x00007fbf22157000)
libfileFormats.so => /opt/openfoam8/platforms/linux64GccDPInt32Opt/lib/libfileFormats.so (0x00007fbf21eb5000)
libz.so.1 => /lib/x86_64-linux-gnu/libz.so.1 (0x00007fbf21c98000)
libpthread.so.0 => /lib/x86_64-linux-gnu/libpthread.so.0 (0x00007fbf21a79000)
/lib64/ld-linux-x86-64.so.2 (0x00007fbf26bd5000)
libmpi.so.20 => /usr/lib/x86_64-linux-gnu/libmpi.so.20 (0x00007fbf21787000)
libopen-rte.so.20 => /usr/lib/x86_64-linux-gnu/libopen-rte.so.20 (0x00007fbf214ff000)
libopen-pal.so.20 => /usr/lib/x86_64-linux-gnu/libopen-pal.so.20 (0x00007fbf2124d000)
librt.so.1 => /lib/x86_64-linux-gnu/librt.so.1 (0x00007fbf21045000)
libhwloc.so.5 => /usr/lib/x86_64-linux-gnu/libhwloc.so.5 (0x00007fbf20e08000)
libutil.so.1 => /lib/x86_64-linux-gnu/libutil.so.1 (0x00007fbf20c05000)
libnuma.so.1 => /usr/lib/x86_64-linux-gnu/libnuma.so.1 (0x00007fbf209fa000)
libltdl.so.7 => /usr/lib/x86_64-linux-gnu/libltdl.so.7 (0x00007fbf207f0000)
node1
$ mpirun --allow-run-as-root -report-uri - --hostfile machines -np 8 icoFoam -parallel
909639680.0;usock;tcp://192.168.90.28,172.17.0.1:46535
node2
$ mpirun --allow-run-as-root -report-uri - --hostfile machines -np 8 icoFoam -parallel
4222615552.0;usock;tcp://192.168.90.41:39927
Is Open MPI working normally so far? For Open MPI + OpenFOAM cluster parallel computing, I can't find any detailed documentation to guide me to a correct configuration, apart from the official docs, which can be read at a glance.
You do not have to put everything on NFS. But you will find some issues if you try MPI-IO on a local filesystem. As long as your working directory and the input/output files are on a shared filesystem, you should be fine.
I noted the dependency on liblsp.so is only on node1 (I do not think that should be an issue though).
From the shared filesystem, have you tried to
mpirun --allow-run-as-root --hostfile machines -np 8 /opt/openfoam8/platforms/linux64GccDPInt32Opt/bin/icoFoam -parallel
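For reference, a minimal NFS sketch for sharing the case directory (the path matches the /tmp/penn/cavity case above, and the subnet is inferred from the addresses seen later in this thread; both are assumptions):
# On node1 (NFS server): export the directory holding the case
echo '/tmp/penn 192.168.90.0/24(rw,sync,no_subtree_check)' >> /etc/exports
exportfs -ra
# On node2 (client): mount it at the identical path
mount -t nfs node1:/tmp/penn /tmp/penn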
As you can see, there are still four processes that cannot find the icoFoam files on the other node. Is there any variable in Open MPI related to this, for specifying the executable on other nodes?
$ mpirun --allow-run-as-root --hostfile machines -np 8 /opt/openfoam8/platforms/linux64GccDPInt32Opt/bin/icoFoam -parallel
/opt/openfoam8/platforms/linux64GccDPInt32Opt/bin/icoFoam: error while loading shared libraries: libfiniteVolume.so: cannot open shared object file: No such file or directory
/opt/openfoam8/platforms/linux64GccDPInt32Opt/bin/icoFoam: error while loading shared libraries: libfiniteVolume.so: cannot open shared object file: No such file or directory
/opt/openfoam8/platforms/linux64GccDPInt32Opt/bin/icoFoam: error while loading shared libraries: libfiniteVolume.so: cannot open shared object file: No such file or directory
/opt/openfoam8/platforms/linux64GccDPInt32Opt/bin/icoFoam: error while loading shared libraries: libfiniteVolume.so: cannot open shared object file: No such file or directory
-------------------------------------------------------
Primary job terminated normally, but 1 process returned
a non-zero exit code.. Per user-direction, the job has been aborted.
-------------------------------------------------------
[digitwin-System-Product-Name:16888] [[14219,0],0] usock_peer_send_blocking: send() to socket 50 failed: Broken pipe (32)
[digitwin-System-Product-Name:16888] [[14219,0],0] ORTE_ERROR_LOG: Unreachable in file oob_usock_connection.c at line 316
[digitwin-System-Product-Name:16888] [[14219,0],0]-[[14219,1],5] usock_peer_accept: usock_peer_send_connect_ack failed
[digitwin-System-Product-Name:16888] [[14219,0],0] usock_peer_send_blocking: send() to socket 51 failed: Broken pipe (32)
[digitwin-System-Product-Name:16888] [[14219,0],0] ORTE_ERROR_LOG: Unreachable in file oob_usock_connection.c at line 316
[digitwin-System-Product-Name:16888] [[14219,0],0]-[[14219,1],7] usock_peer_accept: usock_peer_send_connect_ack failed
--------------------------------------------------------------------------
mpirun detected that one or more processes exited with non-zero status, thus causing
the job to be terminated. The first process to do so was:
Process name: [[14219,1],0]
Exit code: 127
--------------------------------------------------------------------------
$ ls /opt/openfoam8/platforms/linux64GccDPInt32Opt/bin/icoFoam
/opt/openfoam8/platforms/linux64GccDPInt32Opt/bin/icoFoam
I did not see much in those kanji ...
Anyway, this is now a different issue: the binaries are found on all the nodes, but some dependencies are not.
I guess you have /opt/openfoam8/platforms/linux64GccDPInt32Opt/lib in your LD_LIBRARY_PATH and it is not exported to node2.
What if you
mpirun --allow-run-as-root --hostfile machines -np 8 -x LD_LIBRARY_PATH /opt/openfoam8/platforms/linux64GccDPInt32Opt/bin/icoFoam -parallel
Sorry, I pasted the wrong text, I have made an update.
node1
$ echo $LD_LIBRARY_PATH
/opt/ThirdParty-8/platforms/linux64Gcc/gperftools-svn/lib:/opt/openfoam8/platforms/linux64GccDPInt32Opt/lib/paraview-5.6:/opt/paraviewopenfoam56/lib:/opt/openfoam8/platforms/linux64GccDPInt32Opt/lib/openmpi-system:/opt/ThirdParty-8/platforms/linux64GccDPInt32/lib/openmpi-system:/usr/lib/x86_64-linux-gnu/openmpi/lib:/root/OpenFOAM/root-8/platforms/linux64GccDPInt32Opt/lib:/opt/site/8/platforms/linux64GccDPInt32Opt/lib:/opt/openfoam8/platforms/linux64GccDPInt32Opt/lib:/opt/ThirdParty-8/platforms/linux64GccDPInt32/lib:/opt/openfoam8/platforms/linux64GccDPInt32Opt/lib/dummy
node2
$ echo $LD_LIBRARY_PATH
/opt/ThirdParty-8/platforms/linux64Gcc/gperftools-svn/lib:/opt/openfoam8/platforms/linux64GccDPInt32Opt/lib/paraview-5.6:/opt/paraviewopenfoam56/lib:/opt/openfoam8/platforms/linux64GccDPInt32Opt/lib/openmpi-system:/opt/ThirdParty-8/platforms/linux64GccDPInt32/lib/openmpi-system:/usr/lib/x86_64-linux-gnu/openmpi/lib:/root/OpenFOAM/root-8/platforms/linux64GccDPInt32Opt/lib:/opt/site/8/platforms/linux64GccDPInt32Opt/lib:/opt/openfoam8/platforms/linux64GccDPInt32Opt/lib:/opt/ThirdParty-8/platforms/linux64GccDPInt32/lib:/opt/openfoam8/platforms/linux64GccDPInt32Opt/lib/dummy
LD_LIBRARY_PATH seems to be configured correctly, but it still fails. This is the result of running it:
$ mpirun --allow-run-as-root --hostfile machines -np 8 -x LD_LIBRARY_PATH /opt/openfoam8/platforms/linux64GccDPInt32Opt/bin/icoFoam -parallel
--> FOAM FATAL ERROR in Foam::findEtcFiles() : could not find mandatory file
'controlDict'
--> FOAM FATAL ERROR in Foam::findEtcFiles() : could not find mandatory file
'controlDict'
-------------------------------------------------------
Primary job terminated normally, but 1 process returned
a non-zero exit code.. Per user-direction, the job has been aborted.
-------------------------------------------------------
--> FOAM FATAL ERROR in Foam::findEtcFiles() : could not find mandatory file
'controlDict'
--> FOAM FATAL ERROR in Foam::findEtcFiles() : could not find mandatory file
'controlDict'
--------------------------------------------------------------------------
mpirun detected that one or more processes exited with non-zero status, thus causing
the job to be terminated. The first process to do so was:
Process name: [[13775,1],0]
Exit code: 1
--------------------------------------------------------------------------
LD_LIBRARY_PATH is correctly set when you ssh node2.
If you want to put yourself in Open MPI's shoes, you can
ssh node2 ldd /opt/openfoam8/platforms/linux64GccDPInt32Opt/bin/icoFoam
and see how it goes.
Anyway, as far as Open MPI is concerned, the problem is now fixed. The remaining issue is specific to OpenFOAM (controlDict not found) and it is up to you to fix it. A few questions you must ask yourself (and answer ...): Is it on the shared filesystem? Did you forget to copy it? Do you need to propagate more environment variables?
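For example, a sketch of forwarding more of the environment (WM_PROJECT_DIR is the standard OpenFOAM variable pointing at the installation whose etc/controlDict findEtcFiles searches for; that it needs forwarding here is an assumption, not confirmed in this thread):
mpirun --allow-run-as-root --hostfile machines -np 8 -x LD_LIBRARY_PATH -x PATH -x WM_PROJECT_DIR /opt/openfoam8/platforms/linux64GccDPInt32Opt/bin/icoFoam -parallel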
As you can see, the file exists. I may need to check OpenFOAM's variable configuration to solve the problem of controlDict not being found.
node1$ ldd /opt/openfoam8/platforms/linux64GccDPInt32Opt/bin/icoFoam
linux-vdso.so.1 (0x00007fff34b9c000)
/lib/$LIB/liblsp.so => /lib/lib/x86_64-linux-gnu/liblsp.so (0x00007fc83b64d000)
libfiniteVolume.so => /opt/openfoam8/platforms/linux64GccDPInt32Opt/lib/libfiniteVolume.so (0x00007fc8397f1000)
libmeshTools.so => /opt/openfoam8/platforms/linux64GccDPInt32Opt/lib/libmeshTools.so (0x00007fc839103000)
libOpenFOAM.so => /opt/openfoam8/platforms/linux64GccDPInt32Opt/lib/libOpenFOAM.so (0x00007fc83854e000)
libdl.so.2 => /lib/x86_64-linux-gnu/libdl.so.2 (0x00007fc83834a000)
libstdc++.so.6 => /usr/lib/x86_64-linux-gnu/libstdc++.so.6 (0x00007fc837fc1000)
libm.so.6 => /lib/x86_64-linux-gnu/libm.so.6 (0x00007fc837c23000)
libgcc_s.so.1 => /lib/x86_64-linux-gnu/libgcc_s.so.1 (0x00007fc837a0b000)
libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007fc83761a000)
libPstream.so => /opt/openfoam8/platforms/linux64GccDPInt32Opt/lib/openmpi-system/libPstream.so (0x00007fc83740a000)
libtriSurface.so => /opt/openfoam8/platforms/linux64GccDPInt32Opt/lib/libtriSurface.so (0x00007fc837167000)
libsurfMesh.so => /opt/openfoam8/platforms/linux64GccDPInt32Opt/lib/libsurfMesh.so (0x00007fc836e5e000)
libfileFormats.so => /opt/openfoam8/platforms/linux64GccDPInt32Opt/lib/libfileFormats.so (0x00007fc836bbc000)
libz.so.1 => /lib/x86_64-linux-gnu/libz.so.1 (0x00007fc83699f000)
libpthread.so.0 => /lib/x86_64-linux-gnu/libpthread.so.0 (0x00007fc836780000)
/lib64/ld-linux-x86-64.so.2 (0x00007fc83bae1000)
libmpi.so.20 => /usr/lib/x86_64-linux-gnu/libmpi.so.20 (0x00007fc83648e000)
libopen-rte.so.20 => /usr/lib/x86_64-linux-gnu/libopen-rte.so.20 (0x00007fc836206000)
libopen-pal.so.20 => /usr/lib/x86_64-linux-gnu/libopen-pal.so.20 (0x00007fc835f54000)
librt.so.1 => /lib/x86_64-linux-gnu/librt.so.1 (0x00007fc835d4c000)
libhwloc.so.5 => /usr/lib/x86_64-linux-gnu/libhwloc.so.5 (0x00007fc835b0f000)
libutil.so.1 => /lib/x86_64-linux-gnu/libutil.so.1 (0x00007fc83590c000)
libnuma.so.1 => /usr/lib/x86_64-linux-gnu/libnuma.so.1 (0x00007fc835701000)
libltdl.so.7 => /usr/lib/x86_64-linux-gnu/libltdl.so.7 (0x00007fc8354f7000)
node1$ ssh node2 ldd /opt/openfoam8/platforms/linux64GccDPInt32Opt/bin/icoFoam
linux-vdso.so.1 (0x00007ffc45efc000)
libfiniteVolume.so => not found
libmeshTools.so => not found
libOpenFOAM.so => not found
libdl.so.2 => /lib/x86_64-linux-gnu/libdl.so.2 (0x00007f098a107000)
libstdc++.so.6 => /usr/lib/x86_64-linux-gnu/libstdc++.so.6 (0x00007f0989d7e000)
libm.so.6 => /lib/x86_64-linux-gnu/libm.so.6 (0x00007f09899e0000)
libgcc_s.so.1 => /lib/x86_64-linux-gnu/libgcc_s.so.1 (0x00007f09897c8000)
libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007f09893d7000)
/lib64/ld-linux-x86-64.so.2 (0x00007f098a59a000)
libPstream.so => not found
$ ls system/controlDict
system/controlDict
$ ls -l system/controlDict
-rw-r--r-- 1 root root 1045 Jun 8 18:14 system/controlDict
Thank you very much. I will keep troubleshooting, and once the problem is solved I will release a complete document and error-checking method for anyone who needs it. Thank you again for your answer!
FWIW: @ggouaillardet hit the nail on the head. You want to check LD_LIBRARY_PATH for non-interactive logins:
ssh node2 ldd /opt/openfoam8/platforms/linux64GccDPInt32Opt/bin/icoFoam
Sometimes .bashrc (or other shell startup files) does different things for interactive and non-interactive logins. For example:
# This command:
node1$ ssh node2 ldd /opt/openfoam8/platforms/linux64GccDPInt32Opt/bin/icoFoam
# May give different results than this:
node1$ ssh node2
node2$ ldd /opt/openfoam8/platforms/linux64GccDPInt32Opt/bin/icoFoam
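A hypothetical .bashrc illustrating why (a sketch, not taken from the reporter's machine): many distributions ship a guard like this near the top, so everything below it is skipped for non-interactive shells:
# ~/.bashrc (sketch)
case $- in
    *i*) ;;          # interactive shell: keep going
    *) return ;;     # non-interactive (ssh node2 <cmd>): stop here
esac
# anything sourced below this line is invisible to non-interactive logins,
# e.g. the OpenFOAM environment:
. /opt/openfoam8/etc/bashrc
Sourcing the OpenFOAM environment above the guard (or from a file that non-interactive shells also read) makes it visible to ssh node2 <command> as well.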
The problem has been solved. When running the command, Open MPI needs additional network parameters:
mpirun --mca btl_tcp_if_include 192.168.x.x/24
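Presumably the full invocation looks something like this (the subnet is inferred from node1's two interfaces, 192.168.90.28 and Docker's 172.17.0.1, reported earlier in the thread; restricting MPI's TCP traffic to the shared subnet avoids the unreachable one):
mpirun --mca btl_tcp_if_include 192.168.90.0/24 --hostfile machines -np 8 -x LD_LIBRARY_PATH icoFoam -parallel > log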
I want to model a hydraulic channel in Ubuntu 20.04 LTS through OpenFOAM, and from what I see I have the same problem. I would like to know exactly what network parameters you mean, @qingfengfenga, and the exact place to configure them. I have 1 master and 2 nodes configured. I use a private IP 10.17.38.30, subnet mask 255.255.255.0, default gateway 10.17.38.254, DNS 10.16.30.2, and the same setup for the two physical nodes with IPs 10.17.38.31 and 10.17.38.32. In my exports file I have the following:
/home/cluster/OpenFoam 10.17.38.0/24(rw, no_sbtree_chech,async,no_root_squash)
If you could help me check whether these parameters are correct I would be infinitely grateful, or if you could provide me with a guide for a cluster on Ubuntu 20.04 LTS or higher. I have tried many guides and I don't know what to do anymore, but I don't want to give up either. Thanks.
@javierpvm Don't give up, you are so close to the truth. This video may help you.
I am using two nodes to run a parallel computing test of an OpenFOAM cluster.
Passwordless SSH and a shared folder have been configured.
I have seen this problem before, but it doesn't help me: #6293
Background information
OpenMPI version
node1
How to install it
node1/node2
Open MPI installed from source / distribution
On what system
node1
node2
OpenMPI Configuration
node1
node2
Details of the problem
The official documentation is too brief. I emailed openfoam.org with related questions; they said they are compatible with Open MPI, that they have many clusters running normally, and that I should check again. They could not provide substantive help.
Single-node computing and single-node multi-core parallel computing both work without problems.
When I run OpenFOAM multi-node parallel computing, it reports that it cannot find the solver executable on the other nodes. When I log in to those machines to check, the file exists, it runs normally, and the path is correct.
When I run multi-node parallel computing with a core count that does not exceed node1's, it works. In other words, Open MPI is not invoking the cores across the network. Yet when I run the Open MPI test itself, it is normal.
The Open MPI test shows that it is normal.
Reproduction document
View container IP
Networks >> IPAddress
Enter the container
Modify the container's hosts file: add the IP addresses of node1 and node2
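For example (using the addresses that appear earlier in this thread; substitute your own):
# append to /etc/hosts on every node
192.168.90.28   node1
192.168.90.41   node2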
Configure passwordless SSH
https://blog.csdn.net/qq_36274515/article/details/94589518
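In short, the usual sequence is:
ssh-keygen -t rsa          # on node1, accept the defaults, empty passphrase
ssh-copy-id node2          # install the public key on node2
ssh node2 hostname         # must succeed without a password prompt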
Check iptables and the firewall
https://jingyan.baidu.com/article/73c3ce283ee2c1e50343d9f6.html
https://blog.csdn.net/weixin_34080903/article/details/86053017
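A quick check on Ubuntu (Open MPI opens dynamic TCP ports between the nodes, so the firewall must not block the cluster subnet):
sudo ufw status            # should be inactive, or allow the cluster subnet
sudo iptables -L -n        # look for rules dropping traffic between nodes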
Create a new test folder and copy '/opt/openfoam8/tutorials/incompressible/icoFoam/cavity/cavity' into it
You need to add the hosts entries.
Create the machines file
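A minimal machines file sketch (the slot counts are an assumption; set them to each node's core count):
cat > machines <<'EOF'
node1 slots=4
node2 slots=4
EOF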
Create a system/decomposeParDict file
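A minimal decomposeParDict sketch matching -np 2 below (the 2 x 1 x 1 simple split is an assumed choice, not taken from this thread):
cat > system/decomposeParDict <<'EOF'
FoamFile
{
    version     2.0;
    format      ascii;
    class       dictionary;
    object      decomposeParDict;
}

numberOfSubdomains 2;

method          simple;

// split the cavity mesh 2 x 1 x 1 for two ranks
simpleCoeffs
{
    n           (2 1 1);
    delta       0.001;
}
EOF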
Create the cleanup script Allclean
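A minimal Allclean sketch (an assumption, not the author's actual script):
#!/bin/sh
# reset the case between runs: drop decomposed data, logs, and time directories
rm -rf processor* log
rm -rf 0.[0-9]* [1-9]*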
Test Open MPI
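For example, launch a trivial non-OpenFOAM program across both nodes first:
mpirun --hostfile machines -np 8 hostname
# expect hostnames from both node1 and node2 in the output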
Check icoFoam
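For example:
which icoFoam              # on node1
ssh node2 which icoFoam    # non-interactively, the way mpirun sees node2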
If there are no problems above, start the test.
Enter the test folder and execute the following commands.
Meshing
blockMesh
Domain decomposition
decomposePar
Run single-node parallel computing
mpirun -np 2 icoFoam -parallel > log
If single-node parallel computing works, run the cleanup script
./Allclean
Run multi-node multi-core computing (cluster parallel computing)
mpirun --hostfile machines -np 2 icoFoam -parallel > log
Here, -np 2 matches numberOfSubdomains 2 in the system/decomposeParDict file; the two values must be equal and indicate how many CPU cores the computation uses. Because the computing framework gives priority to local CPUs, if this value does not exceed the number of local cores, the CPUs of other nodes are never used and the computation proceeds normally. Therefore, this value must be greater than the number of CPU cores on the local machine in order to reproduce the error.
Following the document above, the problem can be reproduced quickly.
I just want to use OpenFOAM for cluster parallel computing, but I can't solve this problem and I don't know where it lies. I have spent a month on it, testing in container, virtual machine, and physical machine environments, with both root and non-root users. The result is always the same: OpenFOAM and Open MPI each work normally on their own, but used together for multi-node parallel computing they report an error that the executable on the other nodes cannot be found.
I need help, thank you very much!