tianjuxue / jax-am

Additive manufacturing simulation with JAX.
https://jax-am.readthedocs.io/en/latest/
GNU General Public License v3.0

Memory leakage during optimization #30

Open SNMS95 opened 1 year ago

SNMS95 commented 1 year ago

It seems there is a memory leak somewhere. It is clearly visible if you run the topopt example and monitor memory usage: it increases steadily. This results in OOM-kill events on HPC clusters.

  1. Part of the reason is that the PETSc objects are not destroyed. This is easily fixed.
  2. We need some tool to check for this every time we make changes (Scalene, perhaps?).
  3. There is another source that I have not been able to pinpoint yet.
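
As a rough illustration of the kind of check point 2 asks for, here is a minimal sketch that uses the standard-library `tracemalloc` module to record memory in use after each iteration. The helper `traced_memory_per_iteration` and the two `*_step` functions are hypothetical stand-ins, not part of jax-am. Note that `tracemalloc` only sees allocations made through Python's allocator, so native PETSc memory would need a process-level tool instead.

```python
import gc
import tracemalloc


def traced_memory_per_iteration(step, n_iter=5):
    """Call step() n_iter times; return Python-heap bytes in use after each call."""
    tracemalloc.start()
    try:
        usage = []
        for _ in range(n_iter):
            step()
            gc.collect()  # drop unreachable cycles so only live objects are counted
            current, _peak = tracemalloc.get_traced_memory()
            usage.append(current)
        return usage
    finally:
        tracemalloc.stop()


# Hypothetical stand-ins for one optimization iteration.
_leak = []


def leaky_step():
    _leak.append(bytearray(100_000))  # reference is kept, so memory accumulates


def clean_step():
    bytearray(100_000)  # released as soon as the call returns


leaky = traced_memory_per_iteration(leaky_step)
clean = traced_memory_per_iteration(clean_step)
print("leaky growth (bytes):", leaky[-1] - leaky[0])
print("clean growth (bytes):", clean[-1] - clean[0])
```

A steadily increasing series for the real optimization step would point to Python-side objects being retained between iterations.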

SNMS95 commented 1 year ago

Add to end:

    if use_petsc:
        A_fn.destroy()
    else:
        del A_fn
    del dofs
    del res_vec
    gc.collect()
    return sol

b. https://github.com/tianjuxue/jax-am/blob/1c321e58105e671d81444c0a8b97f37ea877567f/jax_am/fem/solver.py#L17

    # Delete PETSc objects
    ksp.destroy()
    rhs.destroy()
    y.destroy()
    A.destroy()
    result = x.getArray()
    x.destroy()
    gc.collect()
    return result
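
The explicit `destroy()` calls above are skipped if the solve raises an exception. One possible way to guard against that is a small context manager, sketched here under the assumption that every managed object exposes a `destroy()` method, as the PETSc objects in the snippet do. `destroying` and `FakeHandle` are hypothetical names used only for illustration; `FakeHandle` stands in for a real PETSc object so the example runs without petsc4py.

```python
from contextlib import contextmanager


@contextmanager
def destroying(*objs):
    """Yield objs, then call .destroy() on each in reverse order, even on error."""
    try:
        yield objs
    finally:
        for obj in reversed(objs):
            obj.destroy()


class FakeHandle:
    """Hypothetical stand-in for a PETSc object; only records destroy() calls."""

    def __init__(self, name, log):
        self.name = name
        self.log = log

    def destroy(self):
        self.log.append(self.name)


log = []
with destroying(FakeHandle("A", log), FakeHandle("ksp", log)) as (A, ksp):
    pass  # the solve would go here

print(log)  # ['ksp', 'A']
```

Any result that might share memory with the managed objects is probably best copied out before the `with` block exits.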

c. https://github.com/tianjuxue/jax-am/blob/1c321e58105e671d81444c0a8b97f37ea877567f/jax_am/fem/solver.py#L319

    problem.A_sp_scipy_diag = A_sp_scipy.diagonal()
    ...
    del A_sp_scipy
    del A_sp

SNMS95 commented 1 year ago

I think we should make explicit garbage collection a standard practice when collaborating.
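
To illustrate why an explicit `gc.collect()` can matter alongside `del`, here is a small self-contained sketch. `Node` is a hypothetical class, and the automatic collector is disabled only to make the demonstration deterministic. Objects caught in a reference cycle survive `del` and are reclaimed only when the cycle collector runs.

```python
import gc
import weakref


class Node:
    """Hypothetical object that can take part in a reference cycle."""

    def __init__(self):
        self.other = None


gc.disable()  # keep the automatic collector from running mid-demo (demo only)
try:
    a, b = Node(), Node()
    a.other, b.other = b, a  # a <-> b form a reference cycle
    probe = weakref.ref(a)   # lets us observe whether the object is still alive

    del a, b                 # the names are gone, but the cycle keeps both alive
    alive_after_del = probe() is not None

    gc.collect()             # the cycle collector reclaims the pair
    alive_after_collect = probe() is not None
finally:
    gc.enable()

print(alive_after_del, alive_after_collect)  # True False
```

This only covers Python-level objects; native resources such as PETSc handles still need their own `destroy()` calls.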