adaptive-cfd / WABBIT

Wavelet Adaptive Block-Based solver for Interactions with Turbulence
https://www.cfd.tu-berlin.de/
GNU General Public License v3.0

With great power comes great responsibility: Allocating large arrays #22

Open Philipp137 opened 6 years ago

Philipp137 commented 6 years ago

Allocating and handling large data arrays :1st_place_medal:

For best performance, reduce the time spent allocating and deallocating arrays (allocate/deallocate). Especially in functions that are called in every iteration, the cost of allocating and deallocating arrays adds up and becomes expensive.

We suggest to use:

  1. the hvy_work(ix,iy,iz,dF) array, which can be used as block data (i.e. data on every local grid point) and is passed to most of the functions inside WABBIT. It is allocated only once at the beginning of the program and deallocated at the end, which saves time.

  2. in every routine which uses large temporary arrays, which are not block data arrays, use:

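         subroutine my_fun(Np)
               integer, intent(in) :: Np
               ....
               ! "save" keeps the array alive between calls, so it is allocated only on the first call
               real, allocatable, save :: my_array(:)
               if (.not. allocated(my_array)) allocate(my_array(Np))
               ...
         end subroutine my_fun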

    _This binds the allocated memory to my_fun: because of the save attribute, my_array persists between calls and does not need to be allocated in every function call. Note that its size is fixed by the first call._

  3. if possible, use a global array in your module as a private variable and allocate it once, so that it can be reused in every function of the module.
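
Suggestions 2 and 3 can be combined into something like the following. This is only a minimal sketch: the module name `module_my_work`, the helper routines `init_work` and `free_work`, and the array name `work` are hypothetical and not part of WABBIT. The sketch also re-allocates if a later call asks for a larger size, which the simple `if (.not. allocated(...))` pattern does not handle.

```fortran
module module_my_work
    implicit none
    private
    public :: init_work, my_fun, free_work

    ! Module-level work array (hypothetical name): allocated once and
    ! reused by every routine in this module.
    real, allocatable, save :: work(:)

contains

    subroutine init_work(Np)
        integer, intent(in) :: Np
        ! (Re)allocate only if the array is missing or too small.
        if (allocated(work)) then
            if (size(work) >= Np) return
            deallocate(work)
        end if
        allocate(work(Np))
    end subroutine init_work

    subroutine my_fun(Np, total)
        integer, intent(in) :: Np
        real, intent(out)   :: total
        integer :: i
        call init_work(Np)      ! returns immediately once the array is large enough
        do i = 1, Np
            work(i) = real(i)
        end do
        total = sum(work(1:Np))
    end subroutine my_fun

    subroutine free_work()
        ! Call once at the end of the program.
        if (allocated(work)) deallocate(work)
    end subroutine free_work

end module module_my_work

program demo
    use module_my_work
    implicit none
    real    :: s
    integer :: it
    do it = 1, 1000             ! stands in for the time loop
        call my_fun(100, s)     ! only the first call allocates
    end do
    print *, s
    call free_work()
end program demo
```

Here the allocation happens in the first of the 1000 calls; all later calls reuse the same memory.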

Some reasons why we don't want to allocate and deallocate arrays in every function call:

For the advanced people: block data in RHS call which is not time evolved :2nd_place_medal:

It would be nice to have a module which provides block data that is not time evolved and can be kept across the time iterations. This could be used to generate and preserve masks or other temporary fields.
A possible scenario is that you compute the mask function (or a part of it) only once on the finest level, at the beginning of the program, and coarsen it to the actual mesh level during the time iterations. Another scenario: recompute the mask function only when the mask or the block has changed (i.e. its rank or mesh level).
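
The second scenario could look roughly like the sketch below. Nothing in it exists in WABBIT: the module `module_mask_cache`, its routines, and the counter `n_recomputed` are hypothetical, the blocks are 2D for brevity, and the "mask computation" is a placeholder that simply fills the block with the level number. The idea is only to show a cached block field that is recomputed when the mesh level of a block changes and reused otherwise.

```fortran
module module_mask_cache
    implicit none
    private
    public :: init_mask_cache, get_mask, n_recomputed

    real,    allocatable, save :: mask(:,:,:)      ! (ix, iy, block id)
    integer, allocatable, save :: cached_level(:)  ! level each cached mask belongs to
    integer, save :: n_recomputed = 0              ! counts expensive recomputations

contains

    subroutine init_mask_cache(Bs, n_blocks)
        integer, intent(in) :: Bs, n_blocks
        if (allocated(mask)) return                ! allocate only once
        allocate(mask(Bs, Bs, n_blocks))
        allocate(cached_level(n_blocks))
        cached_level = -1                          ! -1 marks "not computed yet"
    end subroutine init_mask_cache

    subroutine get_mask(block_id, level, m)
        integer, intent(in) :: block_id, level
        real, intent(out)   :: m(:,:)
        if (cached_level(block_id) /= level) then
            ! Placeholder for the real (expensive) mask computation.
            mask(:,:,block_id) = real(level)
            cached_level(block_id) = level
            n_recomputed = n_recomputed + 1
        end if
        m = mask(:,:,block_id)
    end subroutine get_mask

end module module_mask_cache

program demo_mask
    use module_mask_cache
    implicit none
    real    :: m(8,8)
    integer :: it
    call init_mask_cache(8, 4)
    do it = 1, 100
        call get_mask(1, 3, m)      ! computed in the first iteration only
    end do
    call get_mask(1, 4, m)          ! level changed, so the mask is recomputed
    print *, "mask recomputations:", n_recomputed
end program demo_mask
```

In this toy run the mask of block 1 is computed twice (once initially, once after the level change) instead of 101 times.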

tommy-engels commented 6 years ago

I shall add: be aware of how much memory you allocate! In many routines where you see the allocatable, save :: bla statement, note that the size of the array is usually that of a single block. A typical wabbit computation uses several tens of thousands of blocks or even more, so allocating one or even 100 more is no big deal.

If you want to make, for whatever obnoxious reason, a copy of the entire grid, then the memory consumption of the code "shoots through the roof"! Please do not do that.

Be aware that we pass wabbit the --memory=10GB option, which tells it to, well, use 10GB. Hence, we allocate hvy_block and hvy_work (and a few more) such that 10GB are used. A block more or less is not a problem, but if you decide (please don't) to allocate something huge (please don't), then you have to take that into account at the moment when we decide how many blocks we can allocate for the given memory (please don't).
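
To get a feeling for the numbers, here is a back-of-the-envelope sketch of how a memory budget translates into a number of blocks. All values are assumptions chosen for illustration (block size, ghost layers, number of fields, 8-byte reals, 1 GB taken as 10^9 bytes, and only two grid-sized arrays); this is not how WABBIT actually computes its block count.

```fortran
program memory_budget
    use, intrinsic :: iso_fortran_env, only: int64, real64
    implicit none
    ! All numbers below are illustrative assumptions, not WABBIT defaults.
    integer(int64), parameter :: budget_bytes = 10_int64 * 1000_int64**3  ! "--memory=10GB"
    integer, parameter :: Bs    = 17    ! grid points per direction in one block
    integer, parameter :: g     = 4     ! ghost layers on each side
    integer, parameter :: n_eqn = 5     ! fields per grid point (the dF index)
    integer(int64) :: bytes_per_value, bytes_per_block
    integer(int64) :: n_blocks_2, n_blocks_3

    bytes_per_value = storage_size(1.0_real64) / 8
    ! One block of one grid-sized array, including ghost points (3D).
    bytes_per_block = int(Bs + 2*g, int64)**3 * n_eqn * bytes_per_value

    ! Two grid-sized arrays (think hvy_block and hvy_work) ...
    n_blocks_2 = budget_bytes / (2 * bytes_per_block)
    ! ... versus the same budget with one extra copy of the entire grid.
    n_blocks_3 = budget_bytes / (3 * bytes_per_block)

    print *, "bytes per block and array:      ", bytes_per_block
    print *, "blocks with two grid arrays:    ", n_blocks_2
    print *, "blocks with an extra grid copy: ", n_blocks_3
end program memory_budget
```

Under these assumptions, a single additional block-sized array costs one `bytes_per_block`, which is negligible, whereas an extra copy of the entire grid removes a third of the blocks that would otherwise fit into the budget.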

Philipp137 commented 6 years ago

Bad example #25