TDycores-Project / TDycore

BSD 2-Clause "Simplified" License

Memory leak related to primary DM in TDycore #223

Closed (jeff-cohere closed 2 years ago)

jeff-cohere commented 2 years ago

(From #222)

If you run valgrind on one of the TH demos, you can see that data within tdy->dm is leaked, even though DMDestroy is called on this DM during TDyDestroy:

From demo/th:

valgrind --leak-check=full --show-leak-kinds=all ./th_driver -malloc 0 -successful_exit_code 0 -dm_plex_simplex 0 -dm_plex_dim 3 -dm_plex_box_faces 2,2,2 -dm_plex_box_lower 0,0,0 -dm_plex_box_upper 1,1,1 -tdy_water_density exponential -tdy_regression_test -tdy_regression_test_num_cells_per_process 2 -tdy_regression_test_filename th-driver-ts-prob1 -tdy_final_time 3.1536 -tdy_dt_max 600. -tdy_dt_growth_factor 1.5 -tdy_init_with_random_field -tdy_time_integration_method TS

Valgrind says (among other things):

...
==4508== 591,281 (5,000 direct, 586,281 indirect) bytes in 1 blocks are definitely lost in loss record 1,521 of 1,521
==4508==    at 0x4848060: memalign (in /usr/libexec/valgrind/vgpreload_memcheck-amd64-linux.so)
==4508==    by 0x4AAD55D: PetscMallocAlign (mal.c:54)
==4508==    by 0x4AAFC05: PetscMallocA (mal.c:423)
==4508==    by 0x5BC91D7: DMCreate (dm.c:55)
==4508==    by 0x5DA4ECD: DMPlexCreate (plexcreate.c:3346)
==4508==    by 0x4915527: TDySetFromOptions (tdycore.c:630)
==4508==    by 0x4018B1: main (th_driver.c:31)
...

I have verified that this is not related to DM distribution: the leak also appears in a serial run, and the DM created with DMPlexCreate (within TDySetFromOptions) has the same address as the one passed to DMDestroy.

Finally, I don't know whether this is related, but in the version of PETSc we're currently using I see the following at the end of DMDestroy (in dm.c):

  /* We do not destroy (*dm)->data here so that we can reference count backend objects */
  ierr = PetscHeaderDestroy(dm);CHKERRQ(ierr);

This suggests that something could still be holding an outstanding reference to the DM somewhere.

jeff-cohere commented 2 years ago

From @knepley (in #222):

It is not the backend (it was my comment trying to understand the PETSc convention). It is very likely that we either a) have something else keeping the DM alive, like an unfreed Vec, or b) have a reference cycle. However, both of those would mean there is at least one more thing unfreed.
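
To make the two scenarios above concrete, here is a minimal sketch in plain C of how a reference-counted destroy call can leave an object alive. This is a toy model, not PETSc's implementation: the `Obj` type, `obj_create`, `obj_hold`, `obj_destroy`, and the `live_objects` counter are all hypothetical names made up for illustration. The only assumption borrowed from the discussion is that a destroy call drops one reference and frees the object only when the count reaches zero.

```c
#include <assert.h>
#include <stdlib.h>

/* Toy stand-in for a reference-counted object. NOT PETSc's actual code. */
typedef struct Obj {
  int         refct; /* number of outstanding references */
  struct Obj *held;  /* another object this one keeps alive, or NULL */
} Obj;

int live_objects = 0; /* allocations that have not been freed yet */

Obj *obj_create(void) {
  Obj *o = malloc(sizeof(*o));
  assert(o != NULL);
  o->refct = 1; /* the creator owns the first reference */
  o->held  = NULL;
  live_objects++;
  return o;
}

/* holder takes its own reference to target (like a Vec keeping its DM) */
void obj_hold(Obj *holder, Obj *target) {
  assert(holder->held == NULL);
  target->refct++;
  holder->held = target;
}

/* Drop one reference; free only when nobody else holds the object. */
void obj_destroy(Obj **o) {
  Obj *p = *o;
  if (!p) return;
  *o = NULL;
  assert(p->refct > 0);
  if (--p->refct > 0) return; /* still referenced elsewhere: stays alive */
  obj_destroy(&p->held);      /* release whatever this object kept alive */
  free(p);
  live_objects--;
}

/* (a) Something else keeps the "DM" alive and is never destroyed. */
int leaked_after_forgotten_holder(void) {
  live_objects = 0;
  Obj *dm  = obj_create();
  Obj *vec = obj_create();
  obj_hold(vec, dm);
  obj_destroy(&dm);    /* count drops from 2 to 1, nothing is freed */
  return live_objects; /* vec is deliberately never destroyed */
}

/* (b) Two objects reference each other. */
int leaked_after_cycle(void) {
  live_objects = 0;
  Obj *a = obj_create();
  Obj *b = obj_create();
  obj_hold(a, b);
  obj_hold(b, a);
  obj_destroy(&a); /* each count only drops from 2 to 1 */
  obj_destroy(&b);
  return live_objects;
}

/* Control case: every holder is destroyed, so everything is freed. */
int leaked_after_full_cleanup(void) {
  live_objects = 0;
  Obj *dm  = obj_create();
  Obj *vec = obj_create();
  obj_hold(vec, dm);
  obj_destroy(&vec);
  obj_destroy(&dm);
  return live_objects;
}
```

In this model, both leaking scenarios leave two objects alive rather than one, which matches the point that there should be at least one more unfreed thing besides the DM. The control case leaves none.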