precice / tutorials

Various tutorial cases for the coupling library preCICE with real solvers. These files are meant to be rendered on precice.org, so don't look at the README files here.
https://www.precice.org/
GNU Lesser General Public License v3.0

Error running perpendicular-flap/fluid-openfoam in parallel #218

Open efirvida opened 3 years ago

efirvida commented 3 years ago

Hi, I'm trying to run this tutorial in parallel using `run.sh -parallel`, but it always fails. I tried several decomposeParDict configurations and found that it only runs in parallel with this setting:

numberOfSubdomains 2;

method          simple;

simpleCoeffs
{
    n               (2 1 1);
    delta           0.001;
}
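
As a quick sanity check on a dictionary like the one above, here is a minimal Python sketch. It assumes that, for the `simple` method, the product of the entries in `n` should equal `numberOfSubdomains` (to my knowledge decomposePar rejects a mismatch, but treat that as an assumption). The function name `check_simple_decomposition` is hypothetical, and the regular expressions only cover the plain layout shown here, not general OpenFOAM dictionary syntax.

```python
import re
from math import prod


def check_simple_decomposition(dict_text):
    """Return (numberOfSubdomains, n, consistent) for a 'simple' decomposeParDict.

    Only handles the plain layout used in this thread; comments, macros and
    includes in a real OpenFOAM dictionary are not supported.
    """
    subdomains = int(
        re.search(r"numberOfSubdomains\s+(\d+)\s*;", dict_text).group(1)
    )
    n_match = re.search(r"\bn\s+\(\s*(\d+)\s+(\d+)\s+(\d+)\s*\)\s*;", dict_text)
    n = tuple(int(v) for v in n_match.groups())
    return subdomains, n, prod(n) == subdomains


# The setting from this post that runs in parallel.
working = """
numberOfSubdomains 2;
method          simple;
simpleCoeffs
{
    n               (2 1 1);
    delta           0.001;
}
"""

print(check_simple_decomposition(working))
```

This only checks that the dictionary is internally consistent. It says nothing about why other consistent decompositions crash.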

I'm running the preCICE adapters built with EasyBuild, using easyconfigs that I wrote myself (see https://github.com/efirvida/easybuild-easyconfigs/commit/62611dc79313063019bce90ba83f42081c1fd998), so I really don't know whether the mistake is in my easyconfigs or in the tutorial.

I plan to submit the easyconfigs to the main EasyBuild repo, but first I have to be sure that they work. After that, I can continue with my research on FSI.

Another thing that may be important: I'm using Fedora 34, and I had problems building the foss-2020a toolchain due to a Binutils 2.34 bug (https://bugzilla.redhat.com/show_bug.cgi?id=1916925). I changed Binutils to version 2.36.1 and Bison to 3.7.6 for the whole toolchain, which is the main reason for my branch at https://github.com/efirvida/easybuild-easyconfigs/tree/fsi. I don't know whether this introduces bugs into the library.

/*---------------------------------------------------------------------------*\
| =========                 |                                                 |
| \\      /  F ield         | OpenFOAM: The Open Source CFD Toolbox           |
|  \\    /   O peration     | Version:  v2012                                 |
|   \\  /    A nd           | Website:  www.openfoam.com                      |
|    \\/     M anipulation  |                                                 |
\*---------------------------------------------------------------------------*/
Build  : _7bdb509494-20201222 OPENFOAM=2012
Arch   : "LSB;label=32;scalar=64"
Exec   : blockMesh
Date   : Jun 14 2021
Time   : 19:12:20
Host   : Naboo
PID    : 2943179
I/O    : uncollated
Case   : /home/efirvida/Desktop/dev/PHD/tutorials/perpendicular-flap/fluid-openfoam
nProcs : 1
trapFpe: Floating point exception trapping enabled (FOAM_SIGFPE).
fileModificationChecking : Monitoring run-time modified files using timeStampMaster (fileModificationSkew 5, maxFileModificationPolls 20)
allowSystemOperations : Allowing user-supplied system call operations

// * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * //
Create time

Creating block mesh from "system/blockMeshDict"
Creating block edges
No non-planar block faces defined
Creating topology blocks
Creating topology patches

Creating block mesh topology

Check topology

    Basic statistics
        Number of internal faces : 4
        Number of boundary faces : 22
        Number of defined boundary faces : 22
        Number of undefined boundary faces : 0
    Checking patch -> block consistency

Creating block offsets
Creating merge list (topological search)...
Deleting polyMesh directory "constant/polyMesh"

Creating polyMesh from blockMesh
Creating patches
Creating cells
Creating points with scale 1
    Block 0 cell size :
        i : 0.136132 .. 0.0680662
        j : 0.0666667 .. 0.0666667
        k : 1 .. 1

    Block 1 cell size :
        i : 0.0680662 .. 0.136132
        j : 0.0666667 .. 0.0666667
        k : 1 .. 1

    Block 2 cell size :
        i : 0.136132 .. 0.0680662
        j : 0.0692199 .. 0.13844
        k : 1 .. 1

    Block 3 cell size :
        i : 0.0333333 .. 0.0333333
        j : 0.0692199 .. 0.13844
        k : 1 .. 1

    Block 4 cell size :
        i : 0.0680662 .. 0.136132
        j : 0.0692199 .. 0.13844
        k : 1 .. 1

There are no merge patch pairs

Writing polyMesh with 0 cellZones
----------------
Mesh Information
----------------
  boundingBox: (-3 0 0) (3 4 1)
  nPoints: 5828
  nCells: 2790
  nFaces: 11283
  nInternalFaces: 5457
----------------
Patches
----------------
  patch 0 (start: 5457 size: 45) name: inlet
  patch 1 (start: 5502 size: 45) name: outlet
  patch 2 (start: 5547 size: 33) name: flap
  patch 3 (start: 5580 size: 63) name: upperWall
  patch 4 (start: 5643 size: 60) name: lowerWall
  patch 5 (start: 5703 size: 5580) name: frontAndBack

End

/*---------------------------------------------------------------------------*\
| =========                 |                                                 |
| \\      /  F ield         | OpenFOAM: The Open Source CFD Toolbox           |
|  \\    /   O peration     | Version:  v2012                                 |
|   \\  /    A nd           | Website:  www.openfoam.com                      |
|    \\/     M anipulation  |                                                 |
\*---------------------------------------------------------------------------*/
Build  : _7bdb509494-20201222 OPENFOAM=2012
Arch   : "LSB;label=32;scalar=64"
Exec   : decomposePar -force
Date   : Jun 14 2021
Time   : 19:12:20
Host   : Naboo
PID    : 2943189
I/O    : uncollated
Case   : /home/efirvida/Desktop/dev/PHD/tutorials/perpendicular-flap/fluid-openfoam
nProcs : 1
trapFpe: Floating point exception trapping enabled (FOAM_SIGFPE).
fileModificationChecking : Monitoring run-time modified files using timeStampMaster (fileModificationSkew 5, maxFileModificationPolls 20)
allowSystemOperations : Allowing user-supplied system call operations

// * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * //
Create time

Decomposing mesh region0

Removing 2 existing processor directories
Create mesh

Calculating distribution of cells
Selecting decompositionMethod simple [4]

Finished decomposition in 0.01 s

Calculating original mesh data

Distributing cells to processors

Distributing faces to processors

Distributing points to processors

Constructing processor meshes

Processor 0
    Number of cells = 698
    Number of faces shared with processor 1 = 8
    Number of faces shared with processor 2 = 31
    Number of processor patches = 2
    Number of processor faces = 39
    Number of boundary faces = 1465

Processor 1
    Number of cells = 697
    Number of faces shared with processor 0 = 8
    Number of faces shared with processor 3 = 33
    Number of processor patches = 2
    Number of processor faces = 41
    Number of boundary faces = 1463

Processor 2
    Number of cells = 697
    Number of faces shared with processor 0 = 31
    Number of faces shared with processor 3 = 23
    Number of processor patches = 2
    Number of processor faces = 54
    Number of boundary faces = 1448

Processor 3
    Number of cells = 698
    Number of faces shared with processor 1 = 33
    Number of faces shared with processor 2 = 23
    Number of processor patches = 2
    Number of processor faces = 56
    Number of boundary faces = 1450

Number of processor faces = 95
Max number of cells = 698 (0.0716846% above average 697.5)
Max number of processor patches = 2 (0% above average 2)
Max number of faces between processors = 56 (17.8947% above average 47.5)

Time = 0

Processor 0: field transfer
Processor 1: field transfer
Processor 2: field transfer
Processor 3: field transfer

End

/*---------------------------------------------------------------------------*\
| =========                 |                                                 |
| \\      /  F ield         | OpenFOAM: The Open Source CFD Toolbox           |
|  \\    /   O peration     | Version:  v2012                                 |
|   \\  /    A nd           | Website:  www.openfoam.com                      |
|    \\/     M anipulation  |                                                 |
\*---------------------------------------------------------------------------*/
Build  : _7bdb509494-20201222 OPENFOAM=2012
Arch   : "LSB;label=32;scalar=64"
Exec   : pimpleFoam -parallel
Date   : Jun 14 2021
Time   : 19:12:23
Host   : Naboo
PID    : 2943207
I/O    : uncollated
Case   : /home/efirvida/Desktop/dev/PHD/tutorials/perpendicular-flap/fluid-openfoam
nProcs : 4
Hosts  :
(
    (Naboo 4)
)
Pstream initialized with:
    floatTransfer      : 0
    nProcsSimpleSum    : 0
    commsType          : nonBlocking
    polling iterations : 0
trapFpe: Floating point exception trapping enabled (FOAM_SIGFPE).
fileModificationChecking : Monitoring run-time modified files using timeStampMaster (fileModificationSkew 5, maxFileModificationPolls 20)
allowSystemOperations : Allowing user-supplied system call operations

// * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * //
Create time

Create mesh for time = 0

Selecting dynamicFvMesh dynamicMotionSolverFvMesh
Selecting motion solver: displacementLaplacian
Applying solid body motion to entire mesh
Selecting motion diffusion: quadratic
Selecting motion diffusion: inverseDistance
Selecting patchDistMethod meshWave

PIMPLE: Operating solver in PISO mode

Reading field p

Reading field U

Reading/calculating face flux field phi

Selecting incompressible transport model Newtonian
Selecting turbulence model type laminar
Selecting laminar stress model Stokes
No MRF models present

No finite volume options present
Constructing face velocity Uf

Courant Number mean: 0 max: 0

Starting time loop

---[preciceAdapter] Loaded the OpenFOAM-preCICE adapter v1.0.0.
---[preciceAdapter] Reading preciceDict...
---[precice]  This is preCICE version 2.2.1
---[precice]  Revision info: no-info [Git failed/Not a repository]
---[precice]  Configuration: Release (Debug and Trace log unavailable)
---[precice]  Configuring preCICE with configuration "../precice-config.xml"
---[precice]  I am participant "Fluid"
---[precice]  Connecting Master to 3 Slaves
[2]PETSC ERROR: ------------------------------------------------------------------------
[2]PETSC ERROR: Caught signal number 11 SEGV: Segmentation Violation, probably memory access out of range
[2]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger
[2]PETSC ERROR: or see https://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind
[2]PETSC ERROR: or try http://valgrind.org on GNU/linux and Apple Mac OS X to find memory corruption errors
[2]PETSC ERROR: configure using --with-debugging=yes, recompile, link, and run 
[2]PETSC ERROR: to get more information on the crash.
[2]PETSC ERROR: User provided function() line 0 in  unknown file  
[3]PETSC ERROR: ------------------------------------------------------------------------
[3]PETSC ERROR: Caught signal number 11 SEGV: Segmentation Violation, probably memory access out of range
[3]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger
[3]PETSC ERROR: or see https://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind
[3]PETSC ERROR: or try http://valgrind.org on GNU/linux and Apple Mac OS X to find memory corruption errors
[3]PETSC ERROR: configure using --with-debugging=yes, recompile, link, and run 
[3]PETSC ERROR: to get more information on the crash.
[3]PETSC ERROR: User provided function() line 0 in  unknown file  
--------------------------------------------------------------------------
MPI_ABORT was invoked on rank 2 in communicator MPI_COMM_WORLD
with errorcode 59.

NOTE: invoking MPI_ABORT causes Open MPI to kill all MPI processes.
You may or may not see output from other processes, depending on
exactly when Open MPI kills them.
--------------------------------------------------------------------------
---[precice]  Setting up master communication to coupling partner/s
[0]PETSC ERROR: ------------------------------------------------------------------------
[0]PETSC ERROR: Caught signal number 15 Terminate: Some process (or the batch system) has told this process to end
[0]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger
[0]PETSC ERROR: or see https://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind
[0]PETSC ERROR: or try http://valgrind.org on GNU/linux and Apple Mac OS X to find memory corruption errors
[0]PETSC ERROR: configure using --with-debugging=yes, recompile, link, and run 
[0]PETSC ERROR: to get more information on the crash.
[0]PETSC ERROR: User provided function() line 0 in  unknown file  
[1]PETSC ERROR: ------------------------------------------------------------------------
[1]PETSC ERROR: Caught signal number 15 Terminate: Some process (or the batch system) has told this process to end
[1]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger
[1]PETSC ERROR: or see https://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind
[1]PETSC ERROR: or try http://valgrind.org on GNU/linux and Apple Mac OS X to find memory corruption errors
[1]PETSC ERROR: configure using --with-debugging=yes, recompile, link, and run 
[1]PETSC ERROR: to get more information on the crash.
[1]PETSC ERROR: User provided function() line 0 in  unknown file  
[Naboo:2943192] 3 more processes have sent help message help-mpi-api.txt / mpi-abort
[Naboo:2943192] Set MCA parameter "orte_base_help_aggregate" to 0 to see all help / error messages
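
The decomposePar output above already reports the load balance of the 4-rank decomposition. As a cross-check, here is a small Python sketch that recomputes those figures from the per-processor lines. The function name `imbalance` is hypothetical, and the `sample` text is a shortened excerpt of the log above, not a complete decomposePar output.

```python
import re


def imbalance(log_text, key):
    """Return (max, average, percent above average) for a per-processor count.

    Only indented lines are matched, so the unindented summary line
    "Number of processor faces = 95" is deliberately ignored.
    """
    pattern = rf"^[ \t]+Number of {re.escape(key)} = (\d+)[ \t]*$"
    values = [int(v) for v in re.findall(pattern, log_text, flags=re.MULTILINE)]
    average = sum(values) / len(values)
    return max(values), average, 100.0 * (max(values) - average) / average


# Shortened excerpt of the decomposePar log from this post.
sample = """\
Processor 0
    Number of cells = 698
    Number of processor faces = 39
Processor 1
    Number of cells = 697
    Number of processor faces = 41
Processor 2
    Number of cells = 697
    Number of processor faces = 54
Processor 3
    Number of cells = 698
    Number of processor faces = 56

Number of processor faces = 95
"""

print(imbalance(sample, "cells"))
print(imbalance(sample, "processor faces"))
```

The results should agree with the summary lines in the log (698 cells against an average of 697.5, and 56 processor faces against an average of 47.5), so the decomposition itself is well balanced and unlikely to be the problem.
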

MakisH commented 3 years ago

I can reproduce this with OpenFOAM v2012 (installed from .deb) on Ubuntu 21.04, with preCICE v2.2.1 (built from source). My system has only two physical cores, and I use `export OMPI_MCA_rmaps_base_oversubscribe=1` in my `~/.bashrc`.

It does not seem to matter if the interface is "cut" by the parallel boundary:

Since people have used the OpenFOAM adapter with more ranks, and since we have also run e.g. the turek-hron-fsi3 case with 25 ranks, this should be specific to the tutorial or the system.

@efirvida how many physical & logical cores do you have on your system?

efirvida commented 3 years ago

@MakisH I'm running on a laptop with an i7-8650U, so I have 4 cores with 2 threads each. I tested the old version of the tutorial by rolling the repository back to commit 5f4031fc7e45807dca787a525569b39a1909d2a3, and it works fine. I used -oversubscribe too and tested it with up to 12 partitions. I haven't had time to compare the tutorials to see what is different, and I don't have much experience with preCICE yet, but the old version didn't fail in any of my tests.

davidscn commented 3 years ago

I think the crucial factor here is whether the master rank of OpenFOAM owns interface nodes or not. IIRC, I already had a similar issue in the past. I'm still a bit puzzled whether the issue is triggered from the OpenFOAM side or from the preCICE side. I have some cases to test. A workaround should still be given by this approach.
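
To make the "master rank owns no interface nodes" situation concrete, here is a toy sketch in Python. It is not OpenFOAM's decomposition algorithm: it splits the bounding box from the blockMesh log into equal-width slabs, assumes the rank index runs fastest in x, and uses made-up sample points for the flap surface. The names `toy_rank`, `ranks_owning_interface`, and `FLAP` are hypothetical.

```python
def toy_rank(point, bbox_min, bbox_max, n):
    """Rank of a point under an equal-width split of the bounding box.

    Assumption of this toy: the rank index runs fastest in x, then y, then z.
    """
    index = []
    for p, lo, hi, parts in zip(point, bbox_min, bbox_max, n):
        i = int((p - lo) / (hi - lo) * parts)
        index.append(min(max(i, 0), parts - 1))  # clamp points on the far edge
    ix, iy, iz = index
    return ix + n[0] * (iy + n[1] * iz)


def ranks_owning_interface(points, bbox_min, bbox_max, n):
    """Sorted list of ranks that contain at least one interface point."""
    return sorted({toy_rank(p, bbox_min, bbox_max, n) for p in points})


# Bounding box taken from the blockMesh log in this thread.
BBOX_MIN, BBOX_MAX = (-3.0, 0.0, 0.0), (3.0, 4.0, 1.0)

# Hypothetical interface sample points: a thin flap around x = 0, 0 < y < 1.
FLAP = [(x, 0.1 * k, 0.5) for x in (-0.05, 0.05) for k in range(1, 10)]

for n in [(2, 1, 1), (2, 2, 1), (4, 1, 1)]:
    owners = ranks_owning_interface(FLAP, BBOX_MIN, BBOX_MAX, n)
    print(n, "interface owners:", owners, "master owns interface:", 0 in owners)
```

In this toy, a (4 1 1) split leaves rank 0 without any interface points, while (2 1 1) and (2 2 1) do not. Whether that matches the real decomposition of the failing run is not established here.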

davidscn commented 3 years ago

I think it is an issue in the adapter rather than in preCICE. Some corner cases with empty master ranks were fixed in the preCICE bugfix release v2.2.1, and IIRC I have already run empty-master cases with other solvers. I need to build the adapter in debug mode (CXX_FLAG='-g') to get more information here:

[2] #3  ? at Interface.C:?
[0] #4  preciceAdapter::Interface::Interface(precice::SolverInterface&, Foam::fvMesh const&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::vector<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::allocator<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > >, bool) at Interface.C:?
[2] #4  preciceAdapter::Interface::Interface(precice::SolverInterface&, Foam::fvMesh const&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::vector<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::allocator<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > >, bool) at ??:?
[0] #5  preciceAdapter::Adapter::configure() at ??:?
[2] #5  preciceAdapter::Adapter::configure() at ??:?
[0] #6  Foam::functionObjects::preciceAdapterFunctionObject::read(Foam::dictionary const&) at ??:?
[2] #6  Foam::functionObjects::preciceAdapterFunctionObject::read(Foam::dictionary const&) at ??:?
[0] #7  Foam::functionObjects::preciceAdapterFunctionObject::preciceAdapterFunctionObject(Foam::word const&, Foam::Time const&, Foam::dictionary const&) at ??:?
[2] #7  Foam::functionObjects::preciceAdapterFunctionObject::preciceAdapterFunctionObject(Foam::word const&, Foam::Time const&, Foam::dictionary const&) at ??:?
[0] #8  Foam::functionObject::adddictionaryConstructorToTable<Foam::functionObjects::preciceAdapterFunctionObject>::New(Foam::word const&, Foam::Time const&, Foam::dictionary const&) at ??:?
[2] #8  Foam::functionObject::adddictionaryConstructorToTable<Foam::functionObjects::preciceAdapterFunctionObject>::New(Foam::word const&, Foam::Time const&, Foam::dictionary const&) at ??:?
[0] #9  Foam::functionObject::New(Foam::word const&, Foam::Time const&, Foam::dictionary const&) at ??:?
[2] #9  Foam::functionObject::New(Foam::word const&, Foam::Time const&, Foam::dictionary const&) at ??:?
[0] #10  Foam::functionObjectList::read() at ??:?
[2] #10  Foam::functionObjectList::read() at ??:?
[0] #11  Foam::Time::run() const at ??:?
[2] #11  Foam::Time::run() const at ??:?

davidscn commented 3 years ago

I can confirm that I can successfully run test cases (no OpenFOAM) where the master rank is not located at the interface.