Closed tingyu66 closed 8 years ago
Generating the RHS plan will always cause the double allocation -- it is literally creating another copy of the tree in order to calculate the matvec that results in the b vector.
The actual allocation happens in the kernel (for instance, in the init_multipole and init_local methods in LaplaceSpherical.hpp). Further allocation can occur when the sparse local matrix is created (the to_matrix method in include/executor/EvalP2P.hpp).
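To make the doubling concrete, here is a minimal toy sketch, not the library's actual code. ToyKernel, ToyPlan, init_expansion and live_bytes are hypothetical names invented for illustration, and the per-box storage size (order squared complex coefficients) is an assumption. It only shows that if every plan owns one multipole and one local expansion per box, keeping a second plan alive over the same tree doubles the expansion storage.

```cpp
#include <complex>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a kernel: it only knows how large one expansion is.
struct ToyKernel {
  std::size_t order;  // expansion order (assumed parameter)
  using expansion_type = std::vector<std::complex<double>>;

  // Loosely mirrors the role of init_multipole / init_local:
  // size the storage for one box and zero it.
  void init_expansion(expansion_type& e) const {
    e.assign(order * order, std::complex<double>(0.0, 0.0));
  }
};

// Hypothetical stand-in for a plan: owns one multipole and one local
// expansion per box of the tree.
struct ToyPlan {
  std::vector<ToyKernel::expansion_type> multipoles, locals;

  ToyPlan(const ToyKernel& K, std::size_t num_boxes)
      : multipoles(num_boxes), locals(num_boxes) {
    for (auto& m : multipoles) K.init_expansion(m);
    for (auto& l : locals) K.init_expansion(l);
  }

  // Bytes of expansion payload held by this plan (ignores vector bookkeeping).
  std::size_t expansion_bytes() const {
    std::size_t n = 0;
    for (const auto& m : multipoles) n += m.size();
    for (const auto& l : locals) n += l.size();
    return n * sizeof(std::complex<double>);
  }
};

// Expansion payload held while `count` plans over the same tree are alive.
std::size_t live_bytes(const ToyKernel& K, std::size_t num_boxes,
                       std::size_t count) {
  std::vector<ToyPlan> plans;
  for (std::size_t i = 0; i < count; ++i) plans.emplace_back(K, num_boxes);
  std::size_t total = 0;
  for (const auto& p : plans) total += p.expansion_bytes();
  return total;
}
```

In this toy model, live_bytes(K, n, 2) is exactly twice live_bytes(K, n, 1), which is the kind of near-doubling a heap profile would show while both the solver plan and the RHS plan exist. The storage is released once the extra plan goes out of scope.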
Oh, I see, thanks for the reply
I used Valgrind's Massif tool to monitor the heap usage of StokesBEM running on phantom (with 4 threads). I did not record the stack usage for StokesBEM, since that would slow Valgrind down considerably for a large problem size (recursions=6).
In the figures below, the x-axis shows the number of instructions executed and the y-axis shows how much heap memory is in use.
The heap usage peaks while the code is generating the RHS, where memory usage almost doubles. With lazy evaluation enabled, I think it is the creation of the RHS plan that doubles the memory usage (line 198 in LaplaceBEM.cpp):
FMM_plan<kernel_type> rhs_plan = FMM_plan<kernel_type>(K,panels,opts);
I read the class definition of FMM_plan but could not find where the code allocates the extra heap memory. Do you recall where that happens, and could you help me with it? @slayton58 Thanks.
Here is the stack+heap usage on a smaller case (recursions=4). I found a similar peak.