@prj, to solve this leak, PETSc needs a call to hypre_HandleDestroy, and this function is usually called via HYPRE_Finalize. But I see that PETSc does not call the pair HYPRE_Init and HYPRE_Finalize in PetscInitialize and PetscFinalize, respectively. Is there a particular reason for that?
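For reference, here is a minimal sketch of the pairing being discussed (hedged: this is not the actual PETSc code, and placing the calls in an application's main() rather than inside PetscInitialize/PetscFinalize is only for illustration):

#include <petscsys.h>
#include <HYPRE_utilities.h>   /* declares HYPRE_Init/HYPRE_Finalize (hypre >= 2.20.0) */

int main(int argc, char **argv)
{
  PetscCall(PetscInitialize(&argc, &argv, NULL, NULL));
  HYPRE_Init();       /* creates hypre's global handle (GPU library handles, etc.) */

  /* ... KSP/PC solves that use hypre ... */

  HYPRE_Finalize();   /* ends up in hypre_HandleDestroy(), freeing the handle and closing the leak */
  PetscCall(PetscFinalize());
  return 0;
}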
That's an excellent question... Maybe @stefanozampini knows?
No particular reason. It was not actually needed to call those functions for quite a long time. What do they actually do? Someone has to code it properly when registering the classes, and we need to be sure we can safely call hypre's finalize. Are there global variables that tell us the status of the initialization?
We introduced the HYPRE_Init and HYPRE_Finalize functions in v2.20.0. The initializations are mainly for GPUs: creating GPU library handles, global variables, etc. We recommend calling these two functions in all cases; otherwise, memory leaks can happen, as we see here.
Just adding to Rui Peng's reply:
Are there global variables that tell us the status of the initialization?
There isn't a global variable for that in hypre.
Note that this kind of memory leak, which happens because we do not call HYPRE_Finalize(), should be fixed, but it is not a big deal. The problematic memory leaks occur because we are not properly freeing some resources along the way in the computations, so we start wasting more and more memory that is no longer accessible, and eventually the process runs out of memory. Are there any of those sorts of memory leaks that we can fix in PETSc?
Looks like you want the GPU device initialized before calling HYPRE_Init()?
And that all hypre solves in a run use the GPU, or none do? Is there a way to have some hypre solves on the CPU and some on the GPU, determined by the user?
cudaSetDevice(device_id); /* GPU binding */
...
HYPRE_Init(); /* must be the first HYPRE function call */
HYPRE_SetMemoryLocation(HYPRE_MEMORY_DEVICE);
/* setup AMG on GPUs */
HYPRE_SetExecutionPolicy(HYPRE_EXEC_DEVICE);
Hi, everyone. Is there any update on this? I reported this memory bug to the PETSc developers a while ago. Beyond the original Euclid smoother, I now have good motivation to add hypre's AIR preconditioner option to the interface as well. I would guess it will have the same memory issue if I just do it naively.
More specifically, to start with, I would like to add the following option (mfem::HypreBoomerAMG::SetAdvectiveOptions) to PETSc's hypre interface:
https://docs.mfem.org/html/classmfem_1_1HypreBoomerAMG.html#a7ce86f2bda044e0cedb9ef73fef8c5bc
However, I am worried that I will have the same issue in a time-dependent simulation, where the memory just grows linearly. Note that the AIR preconditioner needs an additional allocation of grid_relax_points in hypre, which is currently missing from PETSc.
Are there any suggestions on how to do both (fix the memory bug and add the AIR option to PETSc)? Thanks.
We already have some AIR options: https://gitlab.com/petsc/petsc/-/blob/main/src/ksp/pc/impls/hypre/hypre.c#L923. I'm no expert on this, so if you have improvements on that, please open a PR.
^ @stefanozampini currently PETSc is missing a very important part of AIR, namely the FC relaxation. This requires a small array to be passed to hypre, as @tangqi mentioned, and can be seen in the MFEM wrapper: https://github.com/mfem/mfem/blob/master/linalg/hypre.cpp#L5131.
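For concreteness, here is a hedged sketch, in the style of the MFEM wrapper linked above (not the eventual PETSc code), of how the grid_relax_points array for FC relaxation could be passed to BoomerAMG; the chosen pattern ("A" pre-relaxation, "FFC" post-relaxation) and the helper name are only illustrative:

#include <HYPRE_parcsr_ls.h>
#include <_hypre_utilities.h>   /* hypre_TAlloc */

static void set_air_fc_relaxation(HYPRE_Solver amg)   /* hypothetical helper */
{
  HYPRE_BoomerAMGSetRestriction(amg, 2);   /* distance-2 AIR restriction */

  /* grid_relax_points[1] = down (pre) relaxation, [2] = up (post) relaxation,
     [3] = coarsest level; entries: -1 = F-points, 1 = C-points, 0 = all points.
     hypre takes ownership of the array, hence hypre_TAlloc, mirroring MFEM. */
  HYPRE_Int **grid_relax_points = hypre_TAlloc(HYPRE_Int *, 4, HYPRE_MEMORY_HOST);
  grid_relax_points[0] = NULL;
  grid_relax_points[1] = hypre_TAlloc(HYPRE_Int, 1, HYPRE_MEMORY_HOST);
  grid_relax_points[2] = hypre_TAlloc(HYPRE_Int, 3, HYPRE_MEMORY_HOST);
  grid_relax_points[3] = hypre_TAlloc(HYPRE_Int, 1, HYPRE_MEMORY_HOST);
  grid_relax_points[1][0] = 0;    /* pre-relaxation: all points ("A") */
  grid_relax_points[2][0] = -1;   /* post-relaxation: F, F, C */
  grid_relax_points[2][1] = -1;
  grid_relax_points[2][2] = 1;
  grid_relax_points[3][0] = 0;    /* coarsest level: all points */

  HYPRE_BoomerAMGSetGridRelaxPoints(amg, grid_relax_points);
  HYPRE_BoomerAMGSetCycleNumSweeps(amg, 1, 1);   /* 1 down sweep */
  HYPRE_BoomerAMGSetCycleNumSweeps(amg, 3, 2);   /* 3 up sweeps  */
}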
@bensworth I'll try to find some time to implement it, thanks for pointing it out. Can you suggest a small numerical test to check everything is working as expected?
I’d be happy to try this out because the AIR performance in PETSc is terrible, worse than plain BoomerAMG for advection-dominated problems, which should not be the case.
@prj- Thanks!
Great! I will write up a simple example today where two-level AIR is (theoretically) exact in one iteration and should be a good unit test. One other comment: we don't have a block version of AIR in hypre (only in PyAMG), so the workaround for DG advective problems is to scale by the (element) block-diagonal inverse (this makes the matrix amenable to a scalar, as opposed to block, reduction method). This is also in hypre, and the MFEM wrapper is here: https://github.com/mfem/mfem/blob/master/linalg/hypre.cpp#L2762. It would be nice to have access to this function for DG advective discretizations.
@prj- once we get a complete PETSc interface, if you have interesting advection-dominated problems where you still aren't seeing good performance, feel free to reach out. Sometimes there are simple modifications to the discretization that make solver performance a lot better, or we have a number of development variations of AIR in PyAMG that we could test if you print your matrix to file or wrap it in petsc4py.
We got sidetracked discussing ℓAIR at the end of this issue, but the prior discussion about HYPRE_[Init|Finalize]() to solve the leak is actually more relevant than ever; see the discussion in https://github.com/hypre-space/hypre/pull/871#issuecomment-1503492038.
We introduced the HYPRE_Init and HYPRE_Finalize functions in v2.20.0. The initializations are mainly for GPUs: creating GPU library handles, global variables, etc. We recommend calling these two functions in all cases; otherwise, memory leaks can happen, as we see here.
@jedbrown is always concerned about initializing resources, even though they may never be used. For example, if I run a PETSc program that calls HYPRE_Init() in PetscInitialize(), my program grabs GPU resources every time it is run, even if the program itself never ends up using hypre (very common, since PETSc programs are options-database driven). This is painful on shared systems (where other people may be using the GPU) and during CI, where the program is run thousands of times, each time trying to grab the GPU. Thus PETSc has "lazy" initialization of the GPU, which allows delaying any touching of the GPU until the code knows it will actually use it.
Is there any chance you would consider adding lazy GPU initialization to hypre so the GPU is not always grabbed when the program starts up?
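To make the request concrete, the kind of "lazy" pattern being described is roughly the following sketch (the flag and function names are hypothetical, not hypre or PETSc API):

static int device_initialized = 0;   /* hypothetical flag, not a real hypre global */

static void ensure_device_initialized(void)
{
  if (!device_initialized) {
    /* bind the device, create cuBLAS/cuSPARSE/rocBLAS handles, etc.,
       only now that we know the GPU will actually be used */
    device_initialized = 1;
  }
  /* every hypre entry point that needs the device would call this first */
}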
Hi @BarrySmith, there are users at Livermore as well who have the same concern about GPU initialization. We are cooking up a lightweight HYPRE_Init, which is more or less the same as the "lazy" initialization idea you mentioned.
Thanks!
Here is how to reproduce the leak.