Closed robert-30 closed 1 year ago
The newer versions rely on some logic for permuting dofmaps that was added to the dolfinx Python layer after the 0.6.1 release.
I see. Somewhat related to this: when I use v.0.1.0 with dolfinx 0.6, adios4dolfinx crashes if I read in many functions in a row, giving some MPI error. There were some opened file streams (which I see you have patched with the later versions of adios4dolfinx), but even after closing these the issue persists. Do you know what could be causing this?
It seems like ADIOS doesn't free its communicators properly (at initialization of ADIOS, the MPI communicator is duplicated). I haven't found a nice way to work around this, and I usually hit the issue after around 1000-2000 calls to the checkpoint functionality.
I would have to redesign the code to only initialize adios once to avoid this.
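To illustrate the "initialize once" idea, here is a minimal pure-Python sketch. `FakeAdios`, `get_adios` and `checkpoint` are hypothetical stand-ins invented for this example; they are not part of adios4dolfinx, ADIOS2 or mpi4py. The sketch only shows the caching pattern: the expensive handle (which, in the real case, would duplicate the communicator) is created on the first call and reused afterwards.

```python
from functools import lru_cache


class FakeAdios:
    """Hypothetical stand-in for a handle that duplicates a communicator when created."""

    instances = 0  # counts how many handles (and thus duplications) were made

    def __init__(self, comm_name):
        FakeAdios.instances += 1
        self.comm_name = comm_name


@lru_cache(maxsize=None)
def get_adios(comm_name):
    # Created once per communicator name, then served from the cache.
    return FakeAdios(comm_name)


def checkpoint(comm_name, step):
    # Hypothetical checkpoint routine: reuses the cached handle on every call.
    adios = get_adios(comm_name)
    return (adios.comm_name, step)


for step in range(2000):
    checkpoint("world", step)

n_created = FakeAdios.instances
```

With this pattern the number of handles stays at one per communicator regardless of how many checkpoints are written, at the cost of keeping the handle alive for the lifetime of the process.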
Okay, thank you very much!
I might have found one issue. I do not explicitly call MPI_Comm.Free(). I'll try to add that and see if it helps
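The effect of a missing free can be sketched with a small simulation. `FakeCommPool` is a hypothetical model of a library that has a finite number of communicator slots, and the limit of 1000 is an arbitrary assumption chosen for the example; it is not MPI or ADIOS code. In mpi4py, the calls being modelled are `Comm.Dup()` and `Comm.Free()`.

```python
from contextlib import contextmanager


class FakeCommPool:
    """Hypothetical model of a library with a finite number of communicator slots."""

    def __init__(self, limit):
        self.limit = limit
        self.in_use = 0

    def dup(self):
        if self.in_use >= self.limit:
            raise RuntimeError("out of communicator slots")
        self.in_use += 1

    def free(self):
        self.in_use -= 1


@contextmanager
def duplicated(pool):
    # Duplicate on entry and always free on exit, even if the body raises.
    pool.dup()
    try:
        yield pool
    finally:
        pool.free()


def run_leaky(pool, n):
    """Duplicate on every call without freeing; return the number of successful calls."""
    done = 0
    try:
        for _ in range(n):
            pool.dup()
            done += 1
    except RuntimeError:
        pass
    return done


def run_freed(pool, n):
    """Duplicate and free on every call; return the number of successful calls."""
    done = 0
    for _ in range(n):
        with duplicated(pool):
            done += 1
    return done


leaky_calls = run_leaky(FakeCommPool(limit=1000), 10_000)
freed_calls = run_freed(FakeCommPool(limit=1000), 10_000)
```

In this model the leaky loop stops once the slots are exhausted, while the loop that frees each duplicate completes all requested iterations.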
I've been able to improve the following:

import dolfinx
import adios4dolfinx
from mpi4py import MPI

for i in range(10000):
    print(i)
    mesh = dolfinx.mesh.create_unit_square(MPI.COMM_WORLD, 10, 10)
    V = dolfinx.fem.functionspace(mesh, ("Lagrange", 1))
    u = dolfinx.fem.Function(V)
    adios4dolfinx.write_mesh(mesh, "u.bp", engine="BP4")
    adios4dolfinx.write_function(u, "u.bp", engine="BP4")
    new_mesh = adios4dolfinx.read_mesh(
        MPI.COMM_WORLD, "u.bp", engine="BP4",
        ghost_mode=dolfinx.mesh.GhostMode.shared_facet,
    )
    V_new = dolfinx.fem.functionspace(new_mesh, ("Lagrange", 1))
    u_new = dolfinx.fem.Function(V_new)
    adios4dolfinx.read_function(u_new, "u.bp", engine="BP4")
    del u, u_new, mesh, new_mesh

from running 337 times to running 1012 times with: https://github.com/jorgensd/adios4dolfinx/pull/33
Got up to 2024 iterations with the latest commit.
With the latest fixes, I've been able to run the code for 10 000 iterations, so I believe the communicator duplication issue will be resolved once #33 is merged.
I've now fixed all MPI communicator duplication issues. The fixes are all included in v0.6.0: https://github.com/jorgensd/adios4dolfinx/releases/tag/v0.6.0
In the README it says that v.0.1.0 of adios4dolfinx is compatible with dolfinx 0.6.1. Are the newer versions not compatible?