precice / fenics-adapter

preCICE-adapter for the open source computing platform FEniCS
GNU Lesser General Public License v3.0

Use `copy(deepcopy=true)` for checkpointing #172

Open BenjaminRodenberg opened 3 weeks ago

BenjaminRodenberg commented 3 weeks ago

There seems to be a bug when using checkpointing. I stumbled across this with subcycling:

Looking at the value of u_n in any of these cases shows that we do not correctly load the checkpoint we originally stored. I assume there is some unintended pointer-magic happening. I used the following debugging statements in solid.py to sample the solution at the tip of the flap:

```python
if precice.requires_writing_checkpoint():  # write checkpoint
    print(f"Store checkpoint: {(u_n(0,H), v_n(0,H), a_n(0,H)),t,n}")
...
if precice.requires_reading_checkpoint():  # roll back to checkpoint
    print(f"Read checkpoint: {(u_cp(0,H), v_cp(0,H), a_cp(0,H)),t_cp,n_cp}")
    print(f"Overwrites: {(u_n(0,H), v_n(0,H), a_n(0,H)),t,n}")
    print(f"Ignores: {u_np1(0,H)}")
    ...
else:
    print(f"u_n: {(u_n(0,H), v_n(0,H), a_n(0,H)),t,n}")
...
```

The fixes applied here seem to avoid the error.
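To illustrate the kind of failure described above, here is a toy sketch in plain Python. It does not use preCICE or FEniCS: `run_window` and the list-based `state` are hypothetical stand-ins, and the in-place `+= 1.0` update only mimics a solver overwriting `u_n` during subcycling. Under those assumptions, it shows how a checkpoint that merely aliases the state is overwritten before it can be restored, while a deep copy survives.

```python
import copy


def run_window(state, n_substeps, deep):
    """Advance `state` over one coupling window, then roll back to the checkpoint."""
    # Store the checkpoint at the start of the window.
    # With deep=False the "checkpoint" is just another name for `state`.
    checkpoint = copy.deepcopy(state) if deep else state

    # Subcycling: several solver steps inside one coupling window.
    for _ in range(n_substeps):
        for i in range(len(state)):
            state[i] += 1.0  # in-place update, stand-in for the solver step

    # Roll back, as if the coupling iteration had not converged.
    return list(checkpoint)


u0 = [0.0, 0.0]
aliased = run_window(list(u0), n_substeps=5, deep=False)
independent = run_window(list(u0), n_substeps=5, deep=True)

print(aliased)      # [5.0, 5.0]  -> the checkpoint was silently overwritten
print(independent)  # [0.0, 0.0]  -> the checkpoint was preserved
```

With the aliased checkpoint, the "rolled back" state equals the state at the end of the window, so the roll-back has no effect.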

There are still some todos:

NiklasVin commented 1 week ago

Before I start working on the TODOs of this PR, I wanted to understand what was going on here since I didn't experience any issues in the partitioned heat tutorial. To this end, I ran some tutorials with subcycling and did some research about the copy function.

Tests (without the suggested fix)

  1. Partitioned Heat equation: I modified the heat.py file such that only one participant does subcycling. In this tutorial, checkpointing doesn't seem to affect the solution (or at least not as severely).
  2. Elastic Tube 3D: Similar to the perpendicular flap case, I set fenics_dt=precice_dt/5. The simulation crashed as well.
  3. Perpendicular Flap OpenFOAM-OpenFOAM: Again, the solid participant does subcycling, and this is what you get shortly before the program crashes: [image: flap]
  4. Perpendicular Flap Fluid Fake-FEniCS: The picture speaks for itself: [image: flap_fake]
  5. Perpendicular Flap Fluid Fake-OpenFOAM: After some time, the simulation explodes as well, but before that the flap does not become as wiggly as the one from 4. It just bends to the right and grows.

What's wrong with copy()?

According to the current dolfin documentation of Function.copy(), the DoFs are deep-copied. However, we use legacy FEniCS, and this is how the copy function is implemented there. (When I started looking at the documentation, I hadn't considered that the dolfin version FEniCS uses is outdated; that's why I am pointing this out.) So, I think it is necessary to use copy(deepcopy=True) with the current state of the adapter.
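A minimal sketch of the aliasing in question, using a hypothetical stand-in class `FakeFunction` rather than dolfin itself. It assumes, as the comment above suggests, that `copy()` without `deepcopy=True` returns an object that shares its dof storage with the original; the class, its `dofs` attribute, and the values are made up for illustration.

```python
class FakeFunction:
    """Hypothetical stand-in for a FEniCS Function; NOT the dolfin API."""

    def __init__(self, dofs):
        self.dofs = dofs  # list holding the degrees of freedom

    def copy(self, deepcopy=False):
        # Assumed behaviour: a shallow copy shares the dof storage,
        # a deep copy duplicates it.
        return FakeFunction(list(self.dofs) if deepcopy else self.dofs)

    def assign(self, other):
        # Overwrite the dof values in place.
        self.dofs[:] = other.dofs


u_n = FakeFunction([1.0, 2.0])
u_cp_shallow = u_n.copy()            # checkpoint via default copy()
u_cp_deep = u_n.copy(deepcopy=True)  # checkpoint via deep copy

u_n.assign(FakeFunction([9.0, 9.0]))  # the solver advances u_n in place

print(u_cp_shallow.dofs)  # [9.0, 9.0]  -> follows u_n, checkpoint is lost
print(u_cp_deep.dofs)     # [1.0, 2.0]  -> independent of u_n
```

If the real `Function.copy()` behaves like the shallow branch here, any checkpoint stored without `deepcopy=True` would track later in-place updates of `u_n`.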

NiklasVin commented 1 week ago

Since the odd behavior isn't restricted to the OpenFOAM-FEniCS perpendicular flap tutorial but also occurs in cases where the FEniCS adapter isn't used at all, my guess would be that there is a bug elsewhere as well.

What do you think?

IshaanDesai commented 1 week ago

> Since the odd behavior isn't restricted to the OpenFOAM-FEniCS perpendicular flap tutorial but also occurs in cases where the FEniCS adapter isn't used at all, my guess would be that there is a bug elsewhere as well.

The OpenFOAM-OpenFOAM failure definitely makes this more confusing. Otherwise I could imagine that something goes wrong in the Python bindings, and hence all Python-based participants produce failures.

uekerman commented 1 week ago

I could imagine that you have two independent issues here. Maybe ignoring OpenFOAM for the moment is the more sane strategy.