SmileiPIC / Smilei

Particle-in-cell code for plasma simulation
https://smileipic.github.io/Smilei
340 stars 120 forks source link

Title: Performance degradation and error when using PrescribedField in 3D cartesian geometry #757

Open SeaWolfYFZ opened 1 week ago

SeaWolfYFZ commented 1 week ago

Description

  1. Significant Performance Degradation When Using PrescribedField Module. I am experiencing performance degradation when using the PrescribedField module to simulate an external uniform magnetic field in a 3D cartesian geometry setup with Smilei. Specifically, when I use the following configuration:
PrescribedField(
    field = "Bz_m",
    profile = lambda x,y,z,t: 2e-2 / B_0,
)

The simulation runs significantly slower, and the CPU usage does not increase as expected, suggesting that numpy array operations are not being utilized efficiently for updating the 3D electromagnetic grid.

Computational Speed with and without PrescribedField Module:

  1. Error When Using Pre-defined Spatial Profiles with PrescribedField Module. I attempt to use a Pre-defined spatial profiles, like so:
PrescribedField(
    field = "Bz_m",
    profile = constant(2e-2 / B_0),
)

I encounter an error indicating that:

 [ERROR](0) src/Profiles/Profile.cpp:33 (Profile) Profile `PrescribedField[0].profile`: constant() profile defined only in 1D, 2D or 3D

Steps to reproduce the problem

Segmentation Fault with ParticleInjector Module: Additionally, when using the ParticleInjector module in Ubuntu 24.04.1 LTS with the default Python 3.12.3 installation, I encounter a segmentation fault. To resolve this, I installed a lower version of Python using Miniconda in a virtual environment.

  1. Following the documentation, I first installed the necessary compilers and libraries:

    sudo apt install git build-essential gcc libhdf5-openmpi-dev
  2. Then, I created a virtual environment using Miniconda:

    conda create -n smilei_py309 python=3.9 numpy scipy matplotlib
  3. Next, I configured the environment:

    conda activate smilei_py309
    export PYTHONEXE=/path/to/miniconda3/envs/smilei_py309/bin/python3
    export LD_LIBRARY_PATH=/path/to/miniconda3/envs/smilei_py309/lib:${LD_LIBRARY_PATH}
    export HDF5_ROOT_DIR=/usr/lib/x86_64-linux-gnu/hdf5/openmpi
  4. Finally, I compiled the code:

    make -j 28
  5. Run the code:

    ./smilei ./sim_namelist.py

Parameters

Namelist

import math 
import numpy as np
import scipy.constants 

# -------- Physical constant (SI) -------------------------------------------------------------------------------
##### Physical constants
pi                  = scipy.constants.pi
wavelength          = 4.0                        # reference length, m - here it will be our plasma wavelength
c_lightspeed        = scipy.constants.c             # lightspeed, m/s
omega               = 2 * pi * c_lightspeed / wavelength         # reference angular frequency, rad/s
eps0                = scipy.constants.epsilon_0     # Vacuum permittivity, F/m
qe                  = scipy.constants.e             # Elementary charge, C
me                  = scipy.constants.m_e           # Electron mass, kg
electron_mass_MeV   = scipy.constants.physical_constants["electron mass energy equivalent in MeV"][0]
# ------------------------------------------------------------------------------------------------------------

omega_normalized = 1.0

# -------- Conversion of international and normalized units -----------------------------------------------------------------
omega_0     = omega / omega_normalized
T_0         = 1 / omega_0
L_0         = c_lightspeed / omega_0
E_0         = me * c_lightspeed * omega_0 / qe
B_0         = me * omega_0 / qe
N_0         = eps0 * me * omega_0**2 / qe**2
# -------------------------------------------------------------------------------------------------

##### Mesh and time evolution
# dx                  = (lambda0/1e-6)*um/30      # longitudinal resolution
# dr                  = 5.0*um                    # radial resolution
dx_normalized                  = 0.2e-3 / L_0       # longitudinal resolution
dr_normalized                  = 0.2e-3 / L_0                     # radial resolution

nx                  = 512                      # number of grid points in the longitudinal direction
nr                  = 64                       # number of grid points in the radial direction

Lx_normalized                  = nx * dx_normalized                   # longitudinal size of the simulation window
Lr_normalized                  = nr * dr_normalized                   # radial size of the simulation window, which goes from r=0 to r=Lr

npatch_x            = 32
npatch_r            = 4

# dt                  = 0.98*dx/c_normalized      # integration timeestep
dt_normalized                  = (40 / 99) * dx_normalized      # integration timeestep
Niterations         = 2000                      

diagsetp_field = 100 # Step for saveing data
diagsetp_probe = 100 # Step for saveing data

##### Boundary conditions
EM_boundary_conditions  = [["silver-muller"]]

##### B-TIS3 interpolation to cope with numerical Cherenkov radiation
use_BTIS3_interpolation = False

Main(
    geometry                       = "3Dcartesian",
    number_of_AM                   = 3,
    interpolation_order            = 2,
    timestep                       = dt_normalized,
    simulation_time                = (Niterations+1)*dt_normalized, 
    cell_length                    = [  dx_normalized, 
                                        dr_normalized,
                                        dr_normalized
                                    ],
    grid_length                    = [  Lx_normalized, 
                                        Lr_normalized,
                                        Lr_normalized
                                    ],
    number_of_patches              = [npatch_x, npatch_r, npatch_r],
    EM_boundary_conditions         = EM_boundary_conditions,
    # number_of_pml_cells            = [
    #                                     [0 ,0 ],
    #                                     [1 ,12]
    #                                 ],       
    solve_poisson                  = False,
    print_every                    = 10,
    use_BTIS3_interpolation        = use_BTIS3_interpolation,
    reference_angular_frequency_SI = omega,
)

LoadBalancing(
    initial_balance = True,
    every = 100,
    cell_load = 1.,
    frozen_particle_load = 0.1
)

density_beam_normalized = 1e14 / N_0
vbeam_normalized = 0.99
wbr_normalized = 1e-3 / L_0
wbx_normalized = 2.5e-3 / L_0

np_percell = 64.0
# particle_BC = [["remove", "remove"],["reflective", "remove"],]
particle_BC = [
    ["remove", "remove"],
    ["remove", "remove"],
    ["remove", "remove"],
    ] 

a_normalized    = 4e2 / B_0
w0_normalized   = 50e-6 / L_0
fx_normalized   = 10e-3 / L_0
fy_normalized   = Lr_normalized * 0.5
fz_normalized   = Lr_normalized * 0.5

k_normalized    = omega_normalized
z0_normalized   = k_normalized * w0_normalized**2 / 2.0
wx_normalized   = w0_normalized * np.sqrt(1+(fx_normalized/z0_normalized)**2)
# Rx_normalized   = -f_normalized*(1+(z0_normalized/f_normalized)**2)
one_over_Rx_normalized = -fx_normalized / (fx_normalized**2 + z0_normalized**2)

af_normalized = a_normalized * w0_normalized / wx_normalized

laser_fwhm      = 30*1e-15/T_0
time_envelope   = tgaussian(duration = 2.0*laser_fwhm, center=1.0*laser_fwhm, fwhm=laser_fwhm)

def mynbeam(x,y,z):
    ry = y - fy_normalized
    rz = z - fz_normalized
    r2 = (ry)**2 + (rz)**2
    return density_beam_normalized * np.exp(-r2 / wbr_normalized**2) * np.heaviside((3.0*wbr_normalized)**2 - r2, 0.0) * np.exp(-x**2 / wbx_normalized**2) * np.heaviside(3.0*wbx_normalized - x, 0.0)

# def mynbeam_AM(x,r):
#     return density_beam_normalized * np.exp(-(r/wbr_normalized)**2) * np.heaviside(3.0*wbr_normalized - r, 0.0)
if (use_BTIS3_interpolation == False):
    pusher = "boris"
else:
    pusher = "borisBTIS3"
Species( 
    name                    = "ebeam",
    position_initialization = "random",
    momentum_initialization = "cold",
    particles_per_cell      = np_percell,
    regular_number          = [], # i.e. 2 macro-particles along x, 1 macro-particle along r and 8 along the theta angle between 0 and 2*pi
    c_part_max              = 1.0,
    mass                    = 1.0,
    charge                  = -1.0,
    charge_density          = mynbeam,
    mean_velocity           = [vbeam_normalized, 0.0, 0.0],
    pusher                  = pusher,  
    time_frozen             = 0.0 ,
    boundary_conditions     = particle_BC,
)

ParticleInjector(
    species = 'ebeam',
    box_side = 'xmin',
)

PrescribedField(
    field = "Bz_m",
    profile = lambda x,y,z,t: 2e-2 / B_0,
)

# PrescribedField(
#     field = "Bz_m",
#     profile = constant(5e-2 / B_0),
# )

list_fields_field_diagnostic = ['Ex', 'Ey', 'Ez', 'Bx', 'By', 'Bz']
list_fields_field_diagnostic.extend(['Rho_ebeam'])

DiagFields(
    every      = diagsetp_field,
    fields     = list_fields_field_diagnostic
)

DiagScalar(
    every = diagsetp_probe ,
    vars = ["Utot", "Ukin", "Uelm"],
)

origin_2D_probes_xy     =   [ 0.0           , 0.0           , Lr_normalized * 0.5 ]
corners_2D_probes_xy    = [
                            [ Lx_normalized , 0.0           , Lr_normalized * 0.5 ],   
                            [ 0.0           , Lr_normalized , Lr_normalized * 0.5 ],   
    ]

origin_2D_probes_xz     =   [ 0.0           , Lr_normalized * 0.5  , 0.0               ]
corners_2D_probes_xz    = [
                            [ Lx_normalized , Lr_normalized * 0.5  , 0.0               ],
                            [ 0.0           , Lr_normalized * 0.5  , Lr_normalized     ],
    ]

x_probe_1 = 64.0*dx_normalized
x_probe_2 =   0.0*dx_normalized
origin_2D_probes_yz_1   =   [ x_probe_1     , 0.0           , 0.0           ]
corners_2D_probes_yz_1  = [
                            [ x_probe_1     , Lr_normalized , 0.0           ],
                            [ x_probe_1     , 0.0           , Lr_normalized ],
    ]

origin_2D_probes_yz_2   =   [ x_probe_2     , 0.0           , 0.0           ]
corners_2D_probes_yz_2  = [
                            [ x_probe_2     , Lr_normalized , 0.0           ],
                            [ x_probe_2     , 0.0           , Lr_normalized ],
    ]

list_fields_probes_diagnostic = ['Ex','Ey','Ez','Bx','By','Bz']
# list_fields_probes_diagnostic.extend(['Rho_electron', 'Rho_ion', 'Jx_electron', 'Jy_electron', 'Jz_electron'])
list_fields_probes_diagnostic.extend(['Rho_ebeam', 'Jx_ebeam', 'Jy_ebeam', 'Jz_ebeam'])
DiagProbe(
    every      = diagsetp_probe,
    origin     = origin_2D_probes_xy,
    corners    = corners_2D_probes_xy,
    number     = [nx, nr],
    fields     = list_fields_probes_diagnostic,
)

DiagProbe(
    every      = diagsetp_probe,
    origin     = origin_2D_probes_xz,
    corners    = corners_2D_probes_xz,
    number     = [nx, nr],
    fields     = list_fields_probes_diagnostic,
)

DiagProbe(
    every      = diagsetp_probe,
    origin     = origin_2D_probes_yz_1,
    corners    = corners_2D_probes_yz_1,
    number     = [nr, nr],
    fields     = list_fields_probes_diagnostic,
)

DiagProbe(
    every      = diagsetp_probe,
    origin     = origin_2D_probes_yz_2,
    corners    = corners_2D_probes_yz_2,
    number     = [nr, nr],
    fields     = list_fields_probes_diagnostic,
)

# Checkpoints(
#     # restart_dir = "dump1",
#     dump_step = 0,
#     dump_minutes = 60.,
#     exit_after_dump = False,
#     keep_n_dumps = 2,
# )
SeaWolfYFZ commented 1 week ago

截图 2024-10-24 14-45-21

CPU usage with PrescribedField Module

mccoys commented 1 week ago

Hi

The prescribed fields are known to have bad performance. This is not a bug but related to the fact that the whole profile must be retrieved at every timestep. Maybe we will someday parallelize the python profile calculation, but this is a longer term work, with a wider scope than prescribed fields.

In your case the performance is already poor, even without prescribed fields. I advise that you first try to improve performance in general, then make a prescribed profile that is numpy-vectorisable

Now, your error with the constant profile is not a bug. This is a spatial profile but prescribed fields require a spatio-temporal profile.

Your segfault with interpolator is strange. Do you mean that it does not occur with python 3.9? Can you provide the text of this error?