ECP-WarpX / WarpX

WarpX is an advanced electromagnetic & electrostatic Particle-In-Cell code.
https://ecp-warpx.github.io

Always compile with `USE_EB=ON` #3280

Open roelof-groenewald opened 2 years ago

roelof-groenewald commented 2 years ago

There has been some discussion about removing the compile-time option USE_EB and always supporting embedded boundaries (EB) instead. The open question before that can be done is whether simulations without an EB pay a performance penalty when run with a USE_EB=ON build. As a first step toward answering this, I ran the non-EB simulation from WarpX/Examples/Physics_applications/capacitive_discharge/inputs_2d (see https://github.com/ECP-WarpX/WarpX/issues/3248#issuecomment-1197390221) with both builds. The Fab allocation comparison is copied below:

Compiled without EB support:

### FabArray ###
    tot # of builds       : 29680
    max # of FabArrays    : 52
    max # of BoxArrays    : 3
    max # of BoxArray uses: 31
### TileArrayCache ###
    tot # of builds  : 155
    tot # of erasures: 155
    tot # of uses    : 164565
    max cache size   : 5
    max # of uses    : 6270
### FBCache ###
    tot # of builds  : 264
    tot # of erasures: 264
    tot # of uses    : 20692
    max cache size   : 14
    max # of uses    : 3148
### CopyCache ###
    tot # of builds  : 205
    tot # of erasures: 205
    tot # of uses    : 745
    max cache size   : 5
    max # of uses    : 57
### FillPatchCache ###
    tot # of builds  : 0
    tot # of erasures: 0
    tot # of uses    : 0
    max cache size   : 0
    max # of uses    : 0
### CrseFineCache ###
    tot # of builds  : 0
    tot # of erasures: 0
    tot # of uses    : 0
    max cache size   : 0
    max # of uses    : 0
MultiFab Tag, current usage and hwm in bytes
All: 0, 265128
Bfield_fp[x][l=0]: 0, 12768
Bfield_fp[y][l=0]: 0, 12672
Bfield_fp[z][l=0]: 0, 13728
Efield_fp[x][l=0]: 0, 13728
Efield_fp[y][l=0]: 0, 13832
Efield_fp[z][l=0]: 0, 12768
current_fp[x][l=0]: 0, 11440
current_fp[y][l=0]: 0, 11528
current_fp[z][l=0]: 0, 10480
phi_fp[l=0]: 0, 11528
rho_fp[l=0]: 0, 27664
MemPool: tot used: 8 MB.
Total GPU global memory (MB) spread across MPI: [40536 ... 40536]
Free  GPU global memory (MB) spread across MPI: [39807 ... 39807]
[The         Arena] space (MB) allocated spread across MPI: [30402 ... 30402]
[The         Arena] space (MB) used      spread across MPI: [0 ... 0]
[The  Pinned Arena] space (MB) allocated spread across MPI: [56 ... 56]
[The  Pinned Arena] space (MB) used      spread across MPI: [0 ... 0]

Compiled with EB support:

### FabArray ###
    tot # of builds       : 65954
    max # of FabArrays    : 121
    max # of BoxArrays    : 2
    max # of BoxArray uses: 53
### TileArrayCache ###
    tot # of builds  : 53
    tot # of erasures: 53
    tot # of uses    : 71366
    max cache size   : 3
    max # of uses    : 8246
### FBCache ###
    tot # of builds  : 111
    tot # of erasures: 111
    tot # of uses    : 34258
    max cache size   : 11
    max # of uses    : 2212
### CopyCache ###
    tot # of builds  : 103
    tot # of erasures: 103
    tot # of uses    : 489
    max cache size   : 3
    max # of uses    : 57
### FillPatchCache ###
    tot # of builds  : 0
    tot # of erasures: 0
    tot # of uses    : 0
    max cache size   : 0
    max # of uses    : 0
### CrseFineCache ###
    tot # of builds  : 0
    tot # of erasures: 0
    tot # of uses    : 0
    max cache size   : 0
    max # of uses    : 0
MultiFab Tag, current usage and hwm in bytes
All: 0, 394752
Bfield_fp[x][l=0]: 0, 12768
Bfield_fp[y][l=0]: 0, 12672
Bfield_fp[z][l=0]: 0, 13728
Efield_fp[x][l=0]: 0, 13728
Efield_fp[y][l=0]: 0, 13832
Efield_fp[z][l=0]: 0, 12768
current_fp[x][l=0]: 0, 11440
current_fp[y][l=0]: 0, 11528
current_fp[z][l=0]: 0, 10480
m_distance_to_eb[l=0]: 0, 13832
m_edge_lengths[x][l=0]: 0, 11440
m_edge_lengths[y][l=0]: 0, 11528
m_edge_lengths[z][l=0]: 0, 10480
m_face_areas[x][l=0]: 0, 10480
m_face_areas[y][l=0]: 0, 10400
m_face_areas[z][l=0]: 0, 11440
phi_fp[l=0]: 0, 11528
rho_fp[l=0]: 0, 27664
MemPool: tot used: 8 MB.
Total GPU global memory (MB) spread across MPI: [40536 ... 40536]
Free  GPU global memory (MB) spread across MPI: [39801 ... 39801]
[The         Arena] space (MB) allocated spread across MPI: [30402 ... 30402]
[The         Arena] space (MB) used      spread across MPI: [0 ... 0]
[The  Pinned Arena] space (MB) allocated spread across MPI: [56 ... 56]
[The  Pinned Arena] space (MB) used      spread across MPI: [0 ... 0]
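
To make the difference between the two listings easier to read, here is a minimal Python sketch that only does arithmetic on numbers copied from the logs above. The function name `overhead_summary` and the dictionary keys are made up for illustration; nothing here calls WarpX or AMReX.

```python
# Total MultiFab high-water marks ("All") in bytes, copied from the two logs.
HWM_ALL_NO_EB = 265128
HWM_ALL_EB = 394752

# Total FabArray builds, copied from the two "### FabArray ###" sections.
FAB_BUILDS_NO_EB = 29680
FAB_BUILDS_EB = 65954

# Tagged MultiFabs that appear only in the USE_EB=ON listing (hwm in bytes).
EB_ONLY_HWM = {
    "m_distance_to_eb[l=0]": 13832,
    "m_edge_lengths[x][l=0]": 11440,
    "m_edge_lengths[y][l=0]": 11528,
    "m_edge_lengths[z][l=0]": 10480,
    "m_face_areas[x][l=0]": 10480,
    "m_face_areas[y][l=0]": 10400,
    "m_face_areas[z][l=0]": 11440,
}


def overhead_summary():
    """Summarize the extra allocations seen in the USE_EB=ON run."""
    extra_total = HWM_ALL_EB - HWM_ALL_NO_EB
    extra_tagged = sum(EB_ONLY_HWM.values())
    return {
        "extra_bytes_total": extra_total,
        "extra_bytes_eb_tagged": extra_tagged,
        # Part of the increase that no EB-only tag in the listing accounts for.
        "extra_bytes_other": extra_total - extra_tagged,
        "hwm_ratio": HWM_ALL_EB / HWM_ALL_NO_EB,
        "fab_build_ratio": FAB_BUILDS_EB / FAB_BUILDS_NO_EB,
    }


if __name__ == "__main__":
    for key, value in overhead_summary().items():
        print(f"{key}: {value}")
```

The seven EB-only tagged MultiFabs account for 79600 of the 129624 extra bytes in the total high-water mark; the remainder is not attributed to any EB-only tag in the listing.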

ax3l commented 2 years ago

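Thank you for the test! It looks like the USE_EB=ON build adds significant runtime Fab creation (65954 vs. 29680 FabArray builds) and memory overhead (394752 vs. 265128 bytes total MultiFab high-water mark) that we need to guard with runtime flags.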

CCing further developers of EB: @WeiqunZhang @atmyers @lgiacome @RemiLehe et al.

WeiqunZhang commented 2 years ago

We certainly don't need to build these MultiFabs for regular geometry:

m_distance_to_eb
m_edge_lengths
m_face_areas
...
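
The idea of skipping these allocations when no embedded boundary is present can be sketched as follows. This is a toy Python model, not WarpX code: `FieldRegistry`, `has_eb`, and `allocate_eb_fields` are hypothetical names, and the byte counts are simply the high-water marks from the USE_EB=ON log above. It assumes the run can decide at startup whether an EB is defined.

```python
from dataclasses import dataclass, field

# EB-only MultiFab tags and their high-water marks (bytes) from the log above.
EB_FIELDS = {
    "m_distance_to_eb[l=0]": 13832,
    "m_edge_lengths[x][l=0]": 11440,
    "m_edge_lengths[y][l=0]": 11528,
    "m_edge_lengths[z][l=0]": 10480,
    "m_face_areas[x][l=0]": 10480,
    "m_face_areas[y][l=0]": 10400,
    "m_face_areas[z][l=0]": 11440,
}


@dataclass
class FieldRegistry:
    """Toy stand-in for a container of allocated fields (hypothetical)."""

    has_eb: bool  # runtime flag: does this simulation define an EB?
    allocated: dict = field(default_factory=dict)

    def allocate_eb_fields(self):
        # Regular geometry: skip the EB-only buffers entirely.
        if not self.has_eb:
            return
        for name, nbytes in EB_FIELDS.items():
            self.allocated[name] = bytearray(nbytes)

    def total_bytes(self):
        return sum(len(buf) for buf in self.allocated.values())


if __name__ == "__main__":
    for flag in (False, True):
        registry = FieldRegistry(has_eb=flag)
        registry.allocate_eb_fields()
        print(f"has_eb={flag}: {registry.total_bytes()} bytes of EB-only fields")
```

With the flag off, the toy registry allocates nothing for the EB-only fields, which is the behavior a runtime guard would aim for in a build that always includes EB support.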

ax3l commented 1 year ago

We first need to fix #3385