underworldcode / underworld2

underworld2: A parallel, particle-in-cell, finite element code for Geodynamics.
http://www.underworldcode.org/

Model.mesh.deform_mesh() error with mpirun. Possibly two nodes are in an identical location. #686

Open phamngockien opened 5 months ago

phamngockien commented 5 months ago

Hi all,

I am trying to test the mesh deformation function, Model.mesh.deform_mesh(). It works well in a serial run. However, it returns an error with mpirun when the number of elements in x is greater than the number of elements in y. The traceback is:

    uw.libUnderworld.StgDomain.Mesh_DeformationUpdate( self._cself )
    RuntimeError: Error encountered. Full restart recommended as exception safety not guaranteed. Error message:
    An error occurred when checking mesh metrics. Possibly two nodes are in an identical location.

It may be due to how StgDomain distributes the mesh among ranks, which seems to depend on the direction with the larger number of elements. For example, with 2 processors (mpirun -np 2 ...), Underworld reports:

    Global element size: 40x30
    Local offset of rank 0: 0x0
    Local range of rank 0: 20x30

Thus, if I change the value of Model.mesh.data[index][1] inside deform_mesh(), each process may not know how the y coordinates have been changed on the other process, and the error is then raised.
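To make the suspected cause concrete, here is a minimal pure-Python sketch of the decomposition reported above. It does not call Underworld and is not StgDomain's actual partitioning code: the helpers `split_elements_along_x`, `local_node_columns`, `new_y_from_global`, and `new_y_from_local` are hypothetical names, and the unit-square domain is an assumption. It only shows that two x-slabs share a column of nodes, and that a deformation rule based on rank-local indices gives that shared column two different y values.

```python
# Hypothetical illustration only; this mimics a split along x and does not use Underworld.

def split_elements_along_x(global_res, n_ranks):
    """Split an (nx, ny) element grid into contiguous x-slabs, one per rank."""
    nx, ny = global_res
    base, extra = divmod(nx, n_ranks)
    slabs, offset = [], 0
    for rank in range(n_ranks):
        width = base + (1 if rank < extra else 0)
        slabs.append({"rank": rank, "offset": (offset, 0), "range": (width, ny)})
        offset += width
    return slabs

def local_node_columns(slab):
    """Global x-indices of the node columns touched by a slab's elements."""
    x0 = slab["offset"][0]
    return list(range(x0, x0 + slab["range"][0] + 1))

def new_y_from_global(x, y):
    """Rule that depends only on global coordinates (same result on every rank)."""
    return y * (1.0 + 0.1 * x)

def new_y_from_local(local_col, n_local_cols, y):
    """Rule that depends on a rank-local column index (rank-dependent result)."""
    return y * (1.0 + 0.1 * local_col / n_local_cols)

slabs = split_elements_along_x((40, 30), 2)
cols = [local_node_columns(s) for s in slabs]
shared_columns = sorted(set(cols[0]) & set(cols[1]))  # node columns seen by both ranks

# Evaluate both rules at one shared node (assumed unit-square domain).
i, j = shared_columns[0], 15
x, y = i / 40, j / 30
global_values = [new_y_from_global(x, y) for _ in slabs]
local_values = [
    new_y_from_local(c.index(i), s["range"][0], y) for c, s in zip(cols, slabs)
]

print("shared node columns:", shared_columns)
print("global rule agrees across ranks:", global_values[0] == global_values[1])
print("local rule agrees across ranks:", local_values[0] == local_values[1])
```

If this picture matches what happens in the parallel run, computing the new y coordinates from global node positions rather than from rank-local indices or array sizes would be one thing to check in the test script.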

I have also attached the test file: mesh_deform_mpi_test.txt

Is there a way to overcome this issue?

Thank you in advance.

Best regards,

Pham Ngoc Kien
Ph.D. Student
School of Earth and Environmental Sciences
Seoul National University