thw1021 closed this issue 5 years ago.
@talbring Could you give me some suggestions, please? (I don't quite understand MPI and Docker, so please forgive me for troubling you.)
@thw1021 As I already said in my comment on the other issue https://github.com/su2code/SU2/issues/738#issuecomment-513870126: no one can give you support for running OpenMPI in a Docker container, since that is not officially supported. The only suggestion I have is to use Singularity. If you want to test it, install it and download the su2.sif I created here: https://drive.google.com/open?id=1SaZDloevjj8rFDG2x3Lh05nhTuKHakDK
OK. Thank you very much.
@talbring Really sorry for troubling you again.
I followed your suggestion, installed Singularity (3.3.0), and used the su2.sif you shared with me. I ran mpirun -n 24 su2.sif SU2_CFD inv_NACA0012.cfg, but it failed to work. See the log file: su2.sif.log
The reason seems to be the OpenMPI version. I have also installed OpenMPI 4.0.1 and added
export PATH=$PATH:$HOME/openmpi/bin
export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:$HOME/openmpi/lib
to my .bashrc file. But when I run mpirun --version, the output is
mpirun (Open MPI) 1.10.2
Report bugs to http://www.open-mpi.org/community/help/
The OS on my computer is Ubuntu 16.04.
Could you give me some suggestions to solve this problem? I googled it but could not find a good solution.
Best.
PATH is searched in order, so set it like this:
export PATH=$HOME/openmpi/bin:$PATH
export LD_LIBRARY_PATH=$HOME/openmpi/lib:$LD_LIBRARY_PATH
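To verify that the newly installed Open MPI is the one picked up first (a quick check, assuming the $HOME/openmpi prefix used above):

source ~/.bashrc
which mpirun        # should print $HOME/openmpi/bin/mpirun
mpirun --version    # should now report Open MPI 4.0.1 instead of 1.10.2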
@talbring Yes, it worked. Thank you.
However, there is no flow.dat file when it finishes, and it seems that I cannot run SU2_SOL to get the flow.dat file.
I have tried three ways, all of which failed. Could you give me some suggestions, please?
hongwei@hongwei-Workstation:~/SU2_RUN/QuickStart$ mpirun -n 24 su2.sif SU2_SOL inv_NACA0012.cfg
-------------------------------------------------------------------------
| ___ _ _ ___ |
| / __| | | |_ ) Release 6.2.0 "Falcon" |
| \__ \ |_| |/ / |
| |___/\___//___| Suite (Solution Exporting Code) |
| |
-------------------------------------------------------------------------
| The current SU2 release has been coordinated by the |
| SU2 International Developers Society <www.su2devsociety.org> |
| with selected contributions from the open-source community. |
-------------------------------------------------------------------------
| The main research teams contributing to the current release are: |
| - Prof. Juan J. Alonso's group at Stanford University. |
| - Prof. Piero Colonna's group at Delft University of Technology. |
| - Prof. Nicolas R. Gauger's group at Kaiserslautern U. of Technology. |
| - Prof. Alberto Guardone's group at Polytechnic University of Milan. |
| - Prof. Rafael Palacios' group at Imperial College London. |
| - Prof. Vincent Terrapon's group at the University of Liege. |
| - Prof. Edwin van der Weide's group at the University of Twente. |
| - Lab. of New Concepts in Aeronautics at Tech. Inst. of Aeronautics. |
-------------------------------------------------------------------------
| Copyright 2012-2019, Francisco D. Palacios, Thomas D. Economon, |
| Tim Albring, and the SU2 contributors. |
| |
| SU2 is free software; you can redistribute it and/or |
| modify it under the terms of the GNU Lesser General Public |
| License as published by the Free Software Foundation; either |
| version 2.1 of the License, or (at your option) any later version. |
| |
| SU2 is distributed in the hope that it will be useful, |
| but WITHOUT ANY WARRANTY; without even the implied warranty of |
| MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU |
| Lesser General Public License for more details. |
| |
| You should have received a copy of the GNU Lesser General Public |
| License along with SU2. If not, see <http://www.gnu.org/licenses/>. |
-------------------------------------------------------------------------
------------------------ Physical Case Definition -----------------------
Input mesh file name: mesh_NACA0012_inv.su2
-------------------------- Output Information ---------------------------
The output file format is Tecplot ASCII (.dat).
Flow variables file name: flow.
------------------- Config File Boundary Information --------------------
+-----------------------------------------+
| Marker Type| Marker Name|
+-----------------------------------------+
| Euler wall| airfoil|
+-----------------------------------------+
| Far-field| farfield|
+-----------------------------------------+
---------------------- Read Grid File Information -----------------------
Two dimensional problem.
5233 points before parallel partitioning.
Performing linear partitioning of the grid nodes.
10216 interior elements before parallel partitioning.
Executing the partitioning functions.
Building the graph adjacency structure.
Distributing elements across all ranks.
2 surface markers.
+------------------------------------+
| Index| Marker| Elements|
+------------------------------------+
| 0| airfoil| 200|
| 1| farfield| 50|
+------------------------------------+
Calling ParMETIS... graph partitioning complete (1114 edge cuts).
Distributing ParMETIS coloring.
Rebalancing vertices.
Rebalancing volume element connectivity.
Rebalancing markers and surface elements.
6403 vertices including ghost points.
11338 interior elements including halo cells.
11338 triangles.
Establishing MPI communication patterns.
Identify vertices.
Storing a mapping from global to local point index.
------------------------- Solution Postprocessing -----------------------
Error in "void CBaselineSolver::SetOutputVariables(CGeometry*, CConfig*)":
-------------------------------------------------------------------------
Unable to open SU2 restart file solution_flow.dat
------------------------------ Error Exit -------------------------------
--------------------------------------------------------------------------
MPI_ABORT was invoked on rank 17 in communicator MPI_COMM_WORLD
with errorcode 1.
NOTE: invoking MPI_ABORT causes Open MPI to kill all MPI processes.
You may or may not see output from other processes, depending on
exactly when Open MPI kills them.
--------------------------------------------------------------------------
[hongwei-Workstation:07803] PMIX ERROR: UNREACHABLE in file server/pmix_server.c at line 2079
[hongwei-Workstation:07803] PMIX ERROR: UNREACHABLE in file server/pmix_server.c at line 2079
[hongwei-Workstation:07803] PMIX ERROR: UNREACHABLE in file server/pmix_server.c at line 2079
[hongwei-Workstation:07803] PMIX ERROR: UNREACHABLE in file server/pmix_server.c at line 2079
[hongwei-Workstation:07803] PMIX ERROR: UNREACHABLE in file server/pmix_server.c at line 2079
[hongwei-Workstation:07803] PMIX ERROR: UNREACHABLE in file server/pmix_server.c at line 2079
[hongwei-Workstation:07803] 23 more processes have sent help message help-mpi-api.txt / mpi-abort
[hongwei-Workstation:07803] Set MCA parameter "orte_base_help_aggregate" to 0 to see all help / error messages
hongwei@hongwei-Workstation:~/SU2_RUN/QuickStart$ singularity exec su2.sif SU2_SOL inv_NACA0012.cfg
/.singularity.d/actions/exec: 9: exec: SU2_SOL: not found
hongwei@hongwei-Workstation:~/SU2_RUN/QuickStart$ singularity shell su2.sif
Singularity su2.sif:~/SU2_RUN/QuickStart> SU2_SOL inv_NACA0012.cfg
bash: SU2_SOL: command not found
Singularity su2.sif:~/SU2_RUN/QuickStart>
I found your previous comment:
%runscript
exec /SU2/bin/$1 $2
So I ran singularity exec su2.sif /SU2/bin/SU2_SOL inv_NACA0012.cfg, but it still failed, even though SU2_CFD runs successfully this way. Why?
hongwei@hongwei-Workstation:~/SU2_RUN/QuickStart$ singularity exec su2.sif /SU2/bin/SU2_SOL inv_NACA0012.cfg
-------------------------------------------------------------------------
| ___ _ _ ___ |
| / __| | | |_ ) Release 6.2.0 "Falcon" |
| \__ \ |_| |/ / |
| |___/\___//___| Suite (Solution Exporting Code) |
| |
-------------------------------------------------------------------------
| The current SU2 release has been coordinated by the |
| SU2 International Developers Society <www.su2devsociety.org> |
| with selected contributions from the open-source community. |
-------------------------------------------------------------------------
| The main research teams contributing to the current release are: |
| - Prof. Juan J. Alonso's group at Stanford University. |
| - Prof. Piero Colonna's group at Delft University of Technology. |
| - Prof. Nicolas R. Gauger's group at Kaiserslautern U. of Technology. |
| - Prof. Alberto Guardone's group at Polytechnic University of Milan. |
| - Prof. Rafael Palacios' group at Imperial College London. |
| - Prof. Vincent Terrapon's group at the University of Liege. |
| - Prof. Edwin van der Weide's group at the University of Twente. |
| - Lab. of New Concepts in Aeronautics at Tech. Inst. of Aeronautics. |
-------------------------------------------------------------------------
| Copyright 2012-2019, Francisco D. Palacios, Thomas D. Economon, |
| Tim Albring, and the SU2 contributors. |
| |
| SU2 is free software; you can redistribute it and/or |
| modify it under the terms of the GNU Lesser General Public |
| License as published by the Free Software Foundation; either |
| version 2.1 of the License, or (at your option) any later version. |
| |
| SU2 is distributed in the hope that it will be useful, |
| but WITHOUT ANY WARRANTY; without even the implied warranty of |
| MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU |
| Lesser General Public License for more details. |
| |
| You should have received a copy of the GNU Lesser General Public |
| License along with SU2. If not, see <http://www.gnu.org/licenses/>. |
-------------------------------------------------------------------------
------------------------ Physical Case Definition -----------------------
Input mesh file name: mesh_NACA0012_inv.su2
-------------------------- Output Information ---------------------------
The output file format is Tecplot ASCII (.dat).
Flow variables file name: flow.
------------------- Config File Boundary Information --------------------
+-----------------------------------------+
| Marker Type| Marker Name|
+-----------------------------------------+
| Euler wall| airfoil|
+-----------------------------------------+
| Far-field| farfield|
+-----------------------------------------+
---------------------- Read Grid File Information -----------------------
Two dimensional problem.
5233 points.
2 surface markers.
+------------------------------------+
| Index| Marker| Elements|
+------------------------------------+
| 0| airfoil| 200|
| 1| farfield| 50|
+------------------------------------+
10216 triangles.
Identify vertices.
Storing a mapping from global to local point index.
------------------------- Solution Postprocessing -----------------------
Error in "void CBaselineSolver::SetOutputVariables(CGeometry*, CConfig*)":
-------------------------------------------------------------------------
Unable to open SU2 restart file solution_flow.dat
------------------------------ Error Exit -------------------------------
--------------------------------------------------------------------------
MPI_ABORT was invoked on rank 0 in communicator MPI_COMM_WORLD
with errorcode 1.
NOTE: invoking MPI_ABORT causes Open MPI to kill all MPI processes.
You may or may not see output from other processes, depending on
exactly when Open MPI kills them.
--------------------------------------------------------------------------
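A note on the two failures above: singularity exec and singularity shell bypass the %runscript, and /SU2/bin is not on the container's PATH, so SU2_SOL is only found when called with its full path (or through the image's runscript). The remaining abort comes from SU2_SOL itself: it reads the restart file named by SOLUTION_FLOW_FILENAME, which defaults to solution_flow.dat, while a plain SU2_CFD run writes restart_flow.dat. A minimal sketch of the usual QuickStart workflow, assuming the default file names:

# run the flow solver in parallel through the image's runscript
mpirun -n 24 su2.sif SU2_CFD inv_NACA0012.cfg
# expose the restart under the name SU2_SOL expects
cp restart_flow.dat solution_flow.dat
# export the merged solution to flow.dat (Tecplot ASCII)
singularity exec su2.sif /SU2/bin/SU2_SOL inv_NACA0012.cfg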
I have to apologize for some mistakes I made when running the commands. It actually worked. Thank you @talbring.
@talbring Thanks for your help. I want to install SU2 with the Python wrapper build, so I wrote a definition file based on yours. However, some errors occurred; the cause seems to be the Python environment.
Sorry for troubling you. Could you give me some suggestions, please?
Best.
Here is my definition file.
Bootstrap: docker
From: ubuntu:18.04
%post
apt-get -y update
apt-get -y upgrade
apt-get -y install python3 python3-pip git build-essential autoconf openmpi-bin openmpi-common libopenmpi-dev m4 gfortran swig vim
pip3 install mpi4py numpy scipy matplotlib
git clone --depth=1 https://github.com/su2code/SU2
cd SU2
mkdir SU2_Install
autoreconf -i
./bootstrap
export CXXFLAGS="-O3 -Wall"
python3 preconfigure.py --enable-autodiff --enable-mpi --enable-PY_WRAPPER --with-cc=/usr/bin/mpicc --with-cxx=/usr/bin/mpicxx --prefix=$PWD/SU2_Install
make -j 4 install
make clean
cd ..
pip3 install tensorforce[tf]
git clone https://github.com/tensorforce/tensorforce.git
cd tensorforce/
git checkout major-revision-final
pip3 install -e .
%runscript
exec /SU2/bin/$1 $2
The error is:
make[3]: Entering directory '/SU2/SU2_BASE/SU2_PY/pySU2'
/bin/bash: python: command not found
swig -DHAVE_MPI -Wall -I/usr/include/python3.6m -I/usr/include/python3.6m -I/root/.local/lib/python2.7/site-packages/mpi4py/include -I/mpi4py/include -I/Library/Python/2.7/site-packages/mpi4py/include -outdir ./ -o SU2_APIPYTHON_wrap.cxx -c++ -python /SU2/SU2_BASE/../SU2_PY/pySU2/pySU2.i
/SU2/SU2_BASE/../SU2_PY/pySU2/pySU2.i:64: Error: Unable to find 'mpi4py/mpi4py.i'
Makefile:532: recipe for target 'SU2_APIPYTHON_wrap.cxx' failed
make[3]: *** [SU2_APIPYTHON_wrap.cxx] Error 1
make[3]: Leaving directory '/SU2/SU2_BASE/SU2_PY/pySU2'
Makefile:525: recipe for target 'all' failed
make[2]: *** [all] Error 2
make[2]: Leaving directory '/SU2/SU2_BASE/SU2_PY/pySU2'
Makefile:441: recipe for target 'install-recursive' failed
make[1]: *** [install-recursive] Error 1
make[1]: Leaving directory '/SU2/SU2_BASE'
Makefile:13: recipe for target 'install-SU2_BASE' failed
make: *** [install-SU2_BASE] Error 2
FATAL: failed to execute %post proc: exit status 2
FATAL: While performing build: while running engine: exit status 255
/SU2/SU2_BASE/../SU2_PY/pySU2/pySU2.i:64: Error: Unable to find 'mpi4py/mpi4py.i'
Use pip to install mpi4py.
PS: just saw you already did that, sorry.
Thank you. But if I use pip to install mpi4py, will it have any negative effects if I use Python 3 together with SU2 for further research?
I built the image using the following definition:
Bootstrap: docker
From: ubuntu:18.04
%post
apt-get -y update
apt-get -y upgrade
apt-get -y install python3 python3-pip python-dev python-pip git build-essential autoconf openmpi-bin openmpi-common libopenmpi-dev m4 gfortran swig vim
pip3 install mpi4py numpy scipy matplotlib
pip install mpi4py numpy scipy matplotlib
git clone --depth=1 https://github.com/su2code/SU2
cd SU2
mkdir SU2_Install
autoreconf -i
./bootstrap
export CXXFLAGS="-O3 -Wall"
python3 preconfigure.py --enable-autodiff --enable-mpi --enable-PY_WRAPPER --with-cc=/usr/bin/mpicc --with-cxx=/usr/bin/mpicxx --prefix=$PWD/SU2_Install
make -j 4 install
make clean
cd ..
pip3 install tensorforce[tf]
git clone https://github.com/tensorforce/tensorforce.git
cd tensorforce/
git checkout major-revision-final
pip3 install -e .
%runscript
exec /SU2/bin/$1 $2
But it cannot run:
ubuntu@main-3:~/main_shared_volume/build_singularity_image/QuickStart$ singularity exec su2_tensorforce.sif /SU2/bin/SU2_CFD inv_NACA0012.cfg
/.singularity.d/actions/exec: 9: exec: /SU2/bin/SU2_CFD: not found
You are installing it in the folder SU2_Install/ according to "--prefix=$PWD/SU2_Install", so I think your last line should be: exec /SU2_Install/bin/$1 $2
However, I have no experience with Singularity, so I could be wrong.
Oh, I see. :sweat_smile: You are probably right.
But based on my own experience, I had to use pip (Python 2) to install mpi4py so that I could build the image successfully. I want to know: if I use Python 3 for further development, will that be OK?
You will need to install it for python3 if you plan to use that. So use: pip3 install mpi4py or python3 -m pip install mpi4py
Yes, I did it as you said, but it failed. See https://github.com/su2code/SU2/issues/739#issuecomment-515298427
@clarkpede suggested a method some days ago, but I am not sure how to edit the Makefile: https://github.com/su2code/SU2/issues/722#issuecomment-506710295
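For what it is worth, the failed build log above shows swig being fed Python 2.7 mpi4py include paths while plain python is not even installed in the container, which is why mpi4py/mpi4py.i cannot be found for a Python 3 build. One way to check where the Python 3 mpi4py ships its SWIG interface file (a diagnostic sketch, not something from the thread):

# print the include directory of the mpi4py installed for python3;
# swig needs this directory on its -I path so that mpi4py/mpi4py.i resolves
python3 -c "import mpi4py; print(mpi4py.get_include())"
ls "$(python3 -c 'import mpi4py; print(mpi4py.get_include())')/mpi4py/mpi4py.i"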
Made it work with this:
Bootstrap: docker
From: ubuntu:19.04
%post
apt-get -y update
apt-get -y install python3 python3-pip git build-essential autoconf python3-dev libopenmpi3 openmpi-common swig
ln -s /usr/bin/python3 /usr/bin/python
python --version
pip3 install mpi4py numpy scipy
git clone --depth=1 https://github.com/su2code/SU2
cd SU2
autoreconf -i
export CXXFLAGS="-O3"
python preconfigure.py --enable-mpi --enable-PY_WRAPPER --prefix=$PWD
make install -j20
make clean
%runscript
exec /SU2/bin/$1 $2
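As a usage sketch (the image and definition file names below are assumed, not taken from the thread): once the definition above is saved, the image can be built and the bundled binaries invoked through the %runscript:

# build the image (needs root, or --fakeroot depending on the Singularity setup)
sudo singularity build su2.sif su2.def
# the %runscript maps the first argument to /SU2/bin/<tool>
./su2.sif SU2_CFD inv_NACA0012.cfg
# or in parallel, using the host MPI launcher
mpirun -n 4 ./su2.sif SU2_CFD inv_NACA0012.cfg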
OK. Thank you. I will try it now.
It's running.
A small problem is that I had to change ubuntu:19.04 to ubuntu:18.04 and libopenmpi3 to libopenmpi-dev openmpi-bin in the definition, or it fails:
ubuntu@main-3:~/main_shared_volume/build_singularity_image/builid_image$ sudo singularity build su2_tensorforce.sif su2_tensorforce.def
INFO: Starting build...
Getting image source signatures
Skipping fetch of repeat blob sha256:1eecd0e4c2cd8c1f628b81c53a487aae6c8d4140248a8617313cd73079be09c4
Skipping fetch of repeat blob sha256:fac13afdf65bf403945c8d6bee654a26940c5527a9913bdf8daa54b69a49f550
Skipping fetch of repeat blob sha256:0c6dd534ddf18642a5af19c09c2d9744d0d1aa93680995d430b5257b6eed079d
Skipping fetch of repeat blob sha256:854703cff8700dce5b5ff70e54f5d612ab701405bc200a5b10a0213ca9025e50
Copying config sha256:993d5f573a24af19dd6006bc3e6e113bd0c709797dc48676f4f0b5ed456470cc
2.42 KiB / 2.42 KiB [======================================================] 0s
Writing manifest to image destination
Storing signatures
singularity image-build: relocation error: /lib/x86_64-linux-gnu/libnss_files.so.2: symbol __libc_readline_unlocked version GLIBC_PRIVATE not defined in file libc.so.6 with link time reference
FATAL: While performing build: while running engine: exit status 127
My OS is Ubuntu 18.04 and my OpenMPI version is 2.1.1. I will run a test to find the reason.
Once it finishes, I will let you know. Thank you.
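For reference, the adjustment described above boils down to the following header and package line; the rest of the definition stays as posted (a sketch, not a verified build):

Bootstrap: docker
From: ubuntu:18.04
%post
apt-get -y update
apt-get -y install python3 python3-pip git build-essential autoconf python3-dev libopenmpi-dev openmpi-bin openmpi-common swig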
Yes. Now I can build and run the image. Thank you @talbring @stephansmit.
Dear developers, I met a strange problem when running SU2 in parallel in a Docker container. Could you give me some suggestions, please?
When I run
mpirun --allow-run-as-root -n 24 SU2_CFD inv_NACA0012.cfg
in the container, there is no flow.dat file. I found a solution at #268, but the output information seems very strange; see SU2_docker_container.log. I have also run SU2 in parallel on the host machine with
mpirun -n 24 SU2_CFD inv_NACA0012.cfg
Everything seems fine there, and I get the flow.dat file without any extra steps; the output is SU2_host_machine.log. The outputs of the two cases are very different. Maybe #738 can help a little.
Best.
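If a parallel run leaves no flow.dat, one commonly used route is to drive the case through the parallel_computation.py wrapper from SU2_PY, which launches SU2_CFD under MPI and then exports the solution afterwards; a sketch, assuming the SU2 Python scripts and mpi4py are available in the container:

# run SU2_CFD in parallel and then generate the merged output files
parallel_computation.py -n 24 -f inv_NACA0012.cfg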