underworldcode / underworld2

underworld2: A parallel, particle-in-cell, finite element code for Geodynamics.
http://www.underworldcode.org/

Mesh loading issue during restarting (free top surface condition) #657

Closed Peigen-L closed 1 year ago

Peigen-L commented 1 year ago

Hi all,

I am creating a 2D subduction model with a free top surface. When I tried to restart the model, the restart got stuck at the mesh-loading step and reported the following error:

Traceback (most recent call last):
  File "/scratch/jq14/lp5029_model/2Dmodel_air/RM/./Free_surface_subduction.py", line 661, in <module>
    Model.run_for(nstep=6000, checkpoint_interval=40, restartStep=5, restartDir="/scratch/jq14/lp5029_model/2Dmodel_air/RM/output-1e-06-0-512x192_np48/")
  File "/g/data/m18/software/underworld/2.13/lib/python3.9/site-packages/underworld/UWGeodynamics/_model.py", line 1637, in run_for
    self.restart(restartStep, restartDir)
  File "/g/data/m18/software/underworld/2.13/lib/python3.9/site-packages/underworld/UWGeodynamics/_model.py", line 436, in restart
    _RestartFunction(self, restartDir).restart(step)
  File "/g/data/m18/software/underworld/2.13/lib/python3.9/site-packages/underworld/UWGeodynamics/_model.py", line 2720, in restart
    self.reload_mesh(step)
  File "/g/data/m18/software/underworld/2.13/lib/python3.9/site-packages/underworld/UWGeodynamics/_model.py", line 2815, in reload_mesh
    Model.mesh.load(os.path.join(self.restartDir, "mesh.h5"))
  File "/g/data/m18/software/underworld/2.13/lib/python3.9/site-packages/underworld/mesh/_mesh.py", line 639, in load
    with h5File(name=filename, mode="r") as h5f:
  File "/g/data/m18/software/underworld/2.13/lib/python3.9/site-packages/underworld/utils/_io.py", line 98, in __enter__
    self.h5f = h5py.File(*self.args, **self.kwargs)
  File "/g/data/m18/software/underworld/2.13/lib/python3.9/site-packages/h5py/_hl/files.py", line 533, in __init__
    fid = make_fid(name, mode, userblock_size, fapl, fcpl, swmr=swmr)
  File "/g/data/m18/software/underworld/2.13/lib/python3.9/site-packages/h5py/_hl/files.py", line 226, in make_fid
    fid = h5f.open(name, flags, fapl=fapl)
  File "h5py/_objects.pyx", line 54, in h5py._objects.with_phil.wrapper
  File "h5py/_objects.pyx", line 55, in h5py._objects.with_phil.wrapper
  File "h5py/h5f.pyx", line 106, in h5py.h5f.open
OSError: Unable to open file (MPI_ERR_NO_SUCH_FILE: no such file or directory)

In the job log, the restart stops at the mesh-loading step:

Assigning material properties...
    Global element size: 512x192
    Local offset of rank 0: 0x0
    Local range of rank 0: 43x48
Calling init_model()...
================================================================================

Restarting Model from Step 5 at Time = 6574116.840417398 year

(2023-03-24 23:22:22)
================================================================================

I noticed that the auto-generated file name for the deformed mesh in the free surface model is different. For the free-slip model, the mesh doesn't deform and the file is named mesh.h5. For the free surface model, the mesh deforms, so a separate mesh file is saved for each timestep, e.g. mesh-5.h5.

Is the restart error due to this change in mesh output for the free surface model?
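
One way to check this before restarting is to see which mesh file actually exists in the output directory. This is a minimal sketch using only the standard library. The helper name find_mesh_file is hypothetical and not part of Underworld, and the mesh-<step>.h5 / mesh.h5 naming is assumed from the files observed above.

```python
import os


def find_mesh_file(restart_dir, step):
    """Return the mesh checkpoint path to use for `step`, or None.

    Hypothetical helper (not part of Underworld). It assumes a deforming
    mesh is saved as mesh-<step>.h5 and a static mesh as mesh.h5.
    """
    candidates = [
        os.path.join(restart_dir, "mesh-%s.h5" % step),  # per-step file
        os.path.join(restart_dir, "mesh.h5"),            # single static file
    ]
    for path in candidates:
        if os.path.isfile(path):
            return path
    # Neither file exists, so a restart from this directory cannot load a mesh.
    return None
```

If this returns the per-step path while the restart code asks for mesh.h5, that would be consistent with the "no such file or directory" error in the traceback.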

julesghub commented 1 year ago

Hi Peigen,

I suspect the error can be fixed by changing this line: https://github.com/underworldcode/underworld2/blob/b72fc39337db6c36a68b585540262c14cc9fc59e/underworld/UWGeodynamics/_model.py#L2812

so that it uses this condition: if Model._advector or Model._freeSurface:

Are you able to test this code locally with the change and your model? I can make a branch/PR if this would help the above.
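
For illustration, the selection logic implied by that condition can be sketched as a standalone function. This is not the UWGeodynamics source: the name mesh_checkpoint_name and its boolean parameters are hypothetical stand-ins for Model._advector and Model._freeSurface, and the file names are assumed from the report above.

```python
def mesh_checkpoint_name(step, has_advector, has_free_surface):
    """Pick the mesh file name to load on restart (illustrative sketch only)."""
    # A mesh that can deform (advected mesh or free surface) is assumed to be
    # written once per checkpoint step; a static mesh is assumed to be
    # written once as mesh.h5.
    if has_advector or has_free_surface:
        return "mesh-%s.h5" % step
    return "mesh.h5"
```

Under these assumptions, a free surface model restarting from step 5 would resolve to mesh-5.h5, and a free-slip model with no advector would still resolve to mesh.h5.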

Peigen-L commented 1 year ago

Hi Julian,

Here is the link to my free surface boundary condition test:

https://github.com/underworld-community/Peigen-HPC/blob/30a31e09a41311e182e425a319d0633f020271c0/Free_surface.py#L568

I have another question with this line: https://github.com/underworldcode/underworld2/blob/b72fc39337db6c36a68b585540262c14cc9fc59e/underworld/UWGeodynamics/_model.py#L2812-L2815

With this setting, if Model.freeSurface = True, the restart would still load mesh.h5, which is wrong for the free surface condition: it should load mesh-step.h5 (e.g. mesh-5.h5).

julesghub commented 1 year ago

Thanks for the test model. I have made a fix for the code. I also made some light changes to the test code. Will push it now.

The changes I made are on the 2.14.x branch. I'll create a 2.14.2b release soon to include the change. In the meantime, if you can try the code with the change and let me know how it goes, that would be great.

Peigen-L commented 1 year ago

Hi Julian,

I think the change is working just fine. Closing this issue.