smenon / dynamicTopoFvMesh

Parallel Adaptive Simplical Remeshing for OpenFOAM
http://smenon.github.com/dynamicTopoFvMesh/

crashed for dynamicTopoFvMesh-port 2.3 and mesquite smoother in parallel #9

Closed JunweiSu closed 10 years ago

JunweiSu commented 10 years ago

Hi Sandeep, can dynamicTopoFvMesh-port-2.3 run in parallel? I encountered the following problem when running a 3D case in parallel.

The error message is:

[35] Could not find correct patch info:
[35] sEdge: (1757 1759)
[35] seIndex: 6643
[35] sePatch: 8
[35] neiProcPatch: -1
[35] neiProcNo: 39
[35] proc: 38
[35] cMs: -1 cMe: -1
[35] Patch Name: procBoundary38to39
[35]
[35]
[35] --> FOAM FATAL ERROR:
[35]
[35] FOAM parallel run aborting
[35]
[35] #0 Foam::error::printStack(Foam::Ostream&)
--------------------------------------------------------------------------
An MPI process has executed an operation involving a call to the "fork()" system call to create a child process. Open MPI is currently operating in a condition that could result in memory corruption or other system errors; your MPI job may hang, crash, or produce silent data corruption. The use of fork() (or system() or other calls that create child processes) is strongly discouraged.

The process that invoked fork was:

Local host: node3 (PID 16651) MPI_COMM_WORLD rank: 35

If you are absolutely sure that your application will successfully and correctly survive a call to fork(), you may disable this warning by setting the mpi_warn_on_fork MCA parameter to 0.

in "/public/software/OpenFOAM/OpenFOAM-2.3.0/platforms/linux64IccDPOpt/lib/libOpenFOAM.so"
[35] #1 Foam::error::abort() in "/public/software/OpenFOAM/OpenFOAM-2.3.0/platforms/linux64IccDPOpt/lib/libOpenFOAM.so"
[35] #2 Foam::Ostream& Foam::operator<< Foam::error(Foam::Ostream&, Foam::errorManipFoam::error) in "/public/software/OpenFOAM/OpenFOAM-2.3.0/platforms/linux64IccDPOpt/bin/moveDynamicMesh"
[35] #3 Foam::dynamicTopoFvMesh::insertCells(int) in "/public/home/sjw/OpenFOAM/sjw-2.3.0/platforms/linux64IccDPOpt/lib/libdynamicTopoFvMesh.so"
[35] #4 Foam::dynamicTopoFvMesh::removeEdgeFlips(int, double, Foam::List const&, Foam::PtrListFoam::List<Foam::List >&, Foam::PtrListFoam::List<Foam::List >&, Foam::PtrListFoam::List<Foam::List >&, int) in "/public/home/sjw/OpenFOAM/sjw-2.3.0/platforms/linux64IccDPOpt/lib/libdynamicTopoFvMesh.so"
[35] #5 Foam::dynamicTopoFvMesh::swap3DEdges(void*) in "/public/home/sjw/OpenFOAM/sjw-2.3.0/platforms/linux64IccDPOpt/lib/libdynamicTopoFvMesh.so"
[35] #6 Foam::dynamicTopoFvMesh::handleCoupledPatches(Foam::HashSet<int, Foam::Hash >&) in "/public/home/sjw/OpenFOAM/sjw-2.3.0/platforms/linux64IccDPOpt/lib/libdynamicTopoFvMesh.so"
[35] #7 Foam::dynamicTopoFvMesh::threadedTopoModifier() in "/public/home/sjw/OpenFOAM/sjw-2.3.0/platforms/linux64IccDPOpt/lib/libdynamicTopoFvMesh.so"
[35] #8 Foam::dynamicTopoFvMesh::update() in "/public/home/sjw/OpenFOAM/sjw-2.3.0/platforms/linux64IccDPOpt/lib/libdynamicTopoFvMesh.so"
[35] #9
[35] in "/public/software/OpenFOAM/OpenFOAM-2.3.0/platforms/linux64IccDPOpt/bin/moveDynamicMesh"
[35] #10 __libc_start_main in "/lib64/libc.so.6"
[35] #11
[34] Could not find correct patch info:
[34] sEdge: (480 601)
[34] seIndex: 7416
[34] sePatch: 8
[34] neiProcPatch: -1
[34] neiProcNo: 38
[34] proc: 37
[34] cMs: 3443 cMe: 8515
[34] Patch Name: procBoundary37to38
[34]
[34]
[34] --> FOAM FATAL ERROR:
[34]
[34] FOAM parallel run aborting
[34]
[34] #0 Foam::error::printStack(Foam::Ostream&)
[35] in "/public/software/OpenFOAM/OpenFOAM-2.3.0/platforms/linux64IccDPOpt/bin/moveDynamicMesh"

MPI_ABORT was invoked on rank 35 in communicator MPI_COMM_WORLD with errorcode 1.

NOTE: invoking MPI_ABORT causes Open MPI to kill all MPI processes. You may or may not see output from other processes, depending on exactly when Open MPI kills them.

[33]
[33]
[33] --> FOAM FATAL ERROR:
[33] hanging pointer of type N4Foam9polyPatchE at index -39 (size 11), cannot dereference
[33]
[33] From function PtrList::operator[] const
[33] in file /public/software/OpenFOAM/OpenFOAM-2.3.0/src/OpenFOAM/lnInclude/PtrListI.H at line 160.
[33] FOAM parallel run aborting
[33]
[33] #0 Foam::error::printStack(Foam::Ostream&) in "/public/software/OpenFOAM/OpenFOAM-2.3.0/platforms/linux64IccDPOpt/lib/libOpenFOAM.so"
[34] #1 Foam::error::abort() in "/public/software/OpenFOAM/OpenFOAM-2.3.0/platforms/linux64IccDPOpt/lib/libOpenFOAM.so"
[34] #2 in "/public/software/OpenFOAM/OpenFOAM-2.3.0/platforms/linux64IccDPOpt/lib/libOpenFOAM.so"
[33] #1 Foam::error::abort()
[32] #0 Foam::error::printStack(Foam::Ostream&)Foam::Ostream& Foam::operator<< Foam::error(Foam::Ostream&, Foam::errorManipFoam::error)
--------------------------------------------------------------------------
mpirun has exited due to process rank 35 with PID 16651 on node node3 exiting improperly. There are two reasons this could occur:

  1. this process did not call "init" before exiting, but others in the job did. This can cause a job to hang indefinitely while it waits for all processes to call "init". By rule, if one process calls "init", then ALL processes must call "init" prior to termination.
  2. this process called "init", but exited without calling "finalize". By rule, all processes that call "init" MUST call "finalize" prior to exiting or it will be considered an "abnormal termination"

This may have caused other processes in the application to be terminated by signals sent by mpirun (as reported here).

in "/public/software/OpenFOAM/OpenFOAM-2.3.0/platforms/linux64IccDPOpt/bin/moveDynamicMesh"
[34] #3 Foam::dynamicTopoFvMesh::insertCells(int) in "/public/home/sjw/OpenFOAM/sjw-2.3.0/platforms/linux64IccDPOpt/lib/libdynamicTopoFvMesh.so"
[34] #4 Foam::dynamicTopoFvMesh::removeEdgeFlips(int, double, Foam::List const&, Foam::PtrListFoam::List<Foam::List >&, Foam::PtrListFoam::List<Foam::List >&, Foam::PtrListFoam::List<Foam::List >&, int) in "/public/home/sjw/OpenFOAM/sjw-2.3.0/platforms/linux64IccDPOpt/lib/libdynamicTopoFvMesh.so"
[34] #5 Foam::dynamicTopoFvMesh::swap3DEdges(void*) in "/public/home/sjw/OpenFOAM/sjw-2.3.0/platforms/linux64IccDPOpt/lib/libdynamicTopoFvMesh.so"
[34] #6 Foam::dynamicTopoFvMesh::handleCoupledPatches(Foam::HashSet<int, Foam::Hash >&) in "/public/home/sjw/OpenFOAM/sjw-2.3.0/platforms/linux64IccDPOpt/lib/libdynamicTopoFvMesh.so"
[34] #7 Foam::dynamicTopoFvMesh::threadedTopoModifier() in "/public/home/sjw/OpenFOAM/sjw-2.3.0/platforms/linux64IccDPOpt/lib/libdynamicTopoFvMesh.so"
[34] #8 Foam::dynamicTopoFvMesh::update() in "/public/home/sjw/OpenFOAM/sjw-2.3.0/platforms/linux64IccDPOpt/lib/libdynamicTopoFvMesh.so"
[34] #9 in "/public/software/OpenFOAM/OpenFOAM-2.3.0/platforms/linux64IccDPOpt/lib/libOpenFOAM.so"
[33] #2 Foam::dynamicTopoFvMesh::getNeighbourProcessor(int) const in "/public/home/sjw/OpenFOAM/sjw-2.3.0/platforms/linux64IccDPOpt/lib/libdynamicTopoFvMesh.so"
[33] #3 Foam::dynamicTopoFvMesh::setFaceMapping(int, Foam::List const&) in "/public/home/sjw/OpenFOAM/sjw-2.3.0/platforms/linux64IccDPOpt/lib/libdynamicTopoFvMesh.so"
[33] #4 Foam::dynamicTopoFvMesh::insertCells(int) in "/public/home/sjw/OpenFOAM/sjw-2.3.0/platforms/linux64IccDPOpt/lib/libdynamicTopoFvMesh.so"
[33] #5 Foam::dynamicTopoFvMesh::removeEdgeFlips(int, double, Foam::List const&, Foam::PtrListFoam::List<Foam::List >&, Foam::PtrListFoam::List<Foam::List >&, Foam::PtrListFoam::List<Foam::List >&, int)
[node253:25588] 3 more processes have sent help message help-mpi-runtime.txt / mpi_init:warn-fork
[node253:25588] Set MCA parameter "orte_base_help_aggregate" to 0 to see all help / error messages

Meanwhile, I found that the refinement process is rather slow. Could you give me some hints on which options would speed it up?

Best regards, Junwei

smenon commented 10 years ago

If you can provide a test case that can reproduce the issue, I can take a look.

JunweiSu commented 10 years ago

Hi smenon, thank you very much. You can download the case from the following link: http://pan.baidu.com/s/1dD3WbeT The mesh contains 1252541 elements, and I ran it (moveDynamicMesh) on 60 cores in parallel.

smenon commented 10 years ago

There are a few problems with this case:

  1. You specify the fixedLengthScalePatches for MOVEB as 1, while the length scale on that patch is closer to 0.1, which explains the large number of collapses.
  2. You have all boundaries defined as a single patch called WALL (and likewise for MOVEB), which is not exactly optimal, since collapses can cause "pinching" effects, where the sharp edges lose definition. An option would be to play around with the swapDeviation so that this is avoided, or split the boundary into multiple patches.
  3. Running 60-way parallel is a bit hard to debug. I would suggest trying to reproduce the issue on something smaller. Perhaps you could try and get a small case working the way you want in serial, and then try parallel next.
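
To make point 1 concrete, here is a rough sketch of why a length scale of 1 on a patch whose edges are about 0.1 long would trigger so many collapses. It assumes a simple rule that is not read from the dynamicTopoFvMesh source: an edge is collapsed when it is shorter than a collapse ratio times the target length scale, and bisected when it is longer than a bisection ratio times that scale. The function name, the ratio values (0.5 and 1.5), and the edge lengths are all hypothetical and chosen only for illustration.

```python
def classify_edges(edge_lengths, length_scale,
                   collapse_ratio=0.5, bisection_ratio=1.5):
    """Count edges that a length-scale-driven remesher would collapse,
    bisect, or keep.

    Assumed rule (illustrative only, not taken from dynamicTopoFvMesh):
      collapse if length < collapse_ratio  * length_scale
      bisect   if length > bisection_ratio * length_scale
    """
    counts = {"collapse": 0, "bisect": 0, "keep": 0}
    for length in edge_lengths:
        if length < collapse_ratio * length_scale:
            counts["collapse"] += 1
        elif length > bisection_ratio * length_scale:
            counts["bisect"] += 1
        else:
            counts["keep"] += 1
    return counts


# Hypothetical patch whose edges are all close to 0.1 long.
patch_edges = [0.08, 0.09, 0.10, 0.11, 0.12]

# Target scale far larger than the existing mesh: every edge looks "too short".
print(classify_edges(patch_edges, length_scale=1.0))

# Target scale matched to the existing mesh: nothing needs to change.
print(classify_edges(patch_edges, length_scale=0.1))
```

Under these assumptions, the mismatched scale marks every edge on the patch for collapse, while a scale close to the actual edge length leaves the patch alone. That is the reasoning behind setting the fixed length scale for MOVEB near 0.1 rather than 1.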

I'll close this issue for now. Let me know when you have something I can work with.