Hi-PACE / hipace

Highly efficient Plasma Accelerator Emulation, quasistatic particle-in-cell code
https://hipace.readthedocs.io

Clean-up TinyProfiler output #1087

Closed by AlexanderSinn 7 months ago

AlexanderSinn commented 8 months ago

This PR cleans up the TinyProfiler output by:

New:

TinyProfiler total time across processes [min...avg...max]: 109.8 ... 115.3 ... 116.4

--------------------------------------------------------------------------------------------------
Name                                               NCalls  Excl. Min  Excl. Avg  Excl. Max   Max %
--------------------------------------------------------------------------------------------------
hpmg::MultiGrid::solve2()                            2000      21.04      22.08      23.58  20.25%
hpmg::MultiGrid::solve1()                            2000      20.74      21.34      22.03  18.92%
AnyDST::Execute()                                   12000      18.69       19.2      20.32  17.45%
MultiBuffer::get_data()                              2000    0.00241       8.22      17.23  14.80%
ExplicitDeposition()                                 2000       10.1      10.25      10.49   9.01%
Fields::ShiftSlices()                                2000      1.126      3.857      9.757   8.38%
MultiLaser::ShiftLaserSlices()                       2000      4.615      6.165      8.981   7.71%
AdvancePlasmaParticles()                             2000      7.112      7.288      7.883   6.77%
main()                                                  1   0.001265     0.7609      5.988   5.14%
DepositCurrent_PlasmaParticleContainer()             2001      4.677      4.858      5.418   4.65%
MultiLaser::AdvanceSliceMG()                         2000      2.212      2.571      2.749   2.36%
Fields::InitializeSlices()                           2000      1.788      1.894      2.057   1.77%
MultiBuffer::put_data()                              2000   0.003416      1.098      2.012   1.73%
FFTPoissonSolverDirichlet::SolvePoissonEquation()    6000      1.743      1.801      1.943   1.67%
MultiLaser::InitLaserSlice()                          250          0     0.1995      1.596   1.37%
Fields::SolvePoissonPsiExmByEypBxEzBz()              2000      1.306      1.354      1.463   1.26%
Hipace::InitializeSxSyWithBeam()                     2000      0.851     0.9134      1.228   1.05%
Fields::AddRhoIons()                                 2000     0.6711     0.7171     0.8688   0.75%
AnyDST::CreatePlan()                                    1      0.648      0.651     0.6528   0.56%
PlasmaParticleContainer::InitParticles()                1    0.02109    0.02206    0.02289   0.02%
Hipace::SolveOneSlice()                              2000    0.01718     0.0178    0.01929   0.02%
AdvanceBeamParticlesSlice()                          2000    0.01409    0.01476    0.01655   0.01%
Hipace::InitData()                                      1   0.004214    0.01148    0.01476   0.01%
FabArray::setVal()                                      6    0.01167    0.01229    0.01334   0.01%
DepositCurrentSlice_BeamParticleContainer()          4000   0.009248   0.009908    0.01234   0.01%
Hipace::ExplicitMGSolveBxBy()                        2000   0.007921   0.008593   0.009203   0.01%
shiftSlippedParticles()                                51   0.005341   0.005903   0.006865   0.01%
Hipace::Evolve()                                        1    0.00183   0.003323   0.006159   0.01%
Fields::AllocData()                                     1   0.005244   0.005652   0.006048   0.01%
BeamParticleContainer::InitBeamFixedWeight3D()          1   9.91e-07  0.0005189   0.004142   0.00%
sortBeamParticlesByBox()                                0          0  0.0005072   0.004058   0.00%
BeamParticleContainer::InitBeamFixedWeightSlice()     250          0  0.0005063   0.004051   0.00%
BeamParticleContainer::resize()                      5801   0.001978    0.00263   0.003073   0.00%
MultiLaser::InitSliceEnvelope()                       250          0   0.000199   0.001592   0.00%
MultiLaser::InitData()                                  1  0.0003237  0.0003341  0.0003557   0.00%
ParticleContainer::clearParticles()                     1   3.11e-07   4.32e-07   5.51e-07   0.00%
--------------------------------------------------------------------------------------------------

Old:

TinyProfiler total time across processes [min...avg...max]: 109.9 ... 115.4 ... 116.5

--------------------------------------------------------------------------------------------------
Name                                               NCalls  Excl. Min  Excl. Avg  Excl. Max   Max %
--------------------------------------------------------------------------------------------------
hpmg::MultiGrid::solve2()                            2000      21.03      22.02      23.93  20.54%
hpmg::MultiGrid::solve1()                            2000      20.95       21.6      22.09  18.96%
AnyDST::Execute()                                   12000       17.3      17.81      18.91  16.23%
MultiBuffer::get_data()                              2000   0.002554      8.126      17.25  14.81%
ExplicitDeposition()                                 2000       10.1      10.28      10.75   9.23%
MultiLaser::ShiftLaserSlices()                       2000      4.693      6.152      9.013   7.74%
Fields::ShiftSlices()                                2000      1.126      3.846      8.456   7.26%
AdvancePlasmaParticles()                             2000       7.11      7.266      7.696   6.60%
main()                                                  1   0.001299     0.7584      5.992   5.14%
DepositCurrent_PlasmaParticleContainer()             2001      4.679      4.904      5.505   4.73%
MultiLaser::AdvanceSliceMG()                         2000      2.212      2.544      2.643   2.27%
Fields::InitializeSlices()                           2000      1.825      1.902      2.038   1.75%
FFTPoissonSolverDirichlet::SolvePoissonEquation()    6000      1.739      1.805      1.954   1.68%
MultiLaser::InitLaserSlice()                          250          0        0.2        1.6   1.37%
MultiBuffer::put_data()                              2000   0.003203     0.9842      1.582   1.36%
Hipace::InitializeSxSyWithBeam()                     2000     0.8515     0.9102      1.211   1.04%
Fields::AddRhoIons()                                 2000     0.6713     0.7242     0.9411   0.81%
Fields::LinCombination()                             4000     0.6844     0.7093     0.7759   0.67%
AnyDST::CreatePlan()                                    1     0.6487     0.6506     0.6522   0.56%
AnyDST::C2Rfft()                                    24000     0.4175      0.476      0.504   0.43%
AnyDST::Transpose()                                 24000     0.4655     0.4724     0.4875   0.42%
Fields::SolvePoissonPsiExmByEypBxEzBz()              2000     0.3932     0.4002     0.4155   0.36%
AnyDST::ToSine()                                    24000     0.3809     0.4008      0.413   0.35%
Fields::Multiply()                                   2000     0.2422     0.2579     0.3215   0.28%
AnyDST::ToComplex()                                 24000    0.05091    0.06682    0.09276   0.08%
PlasmaParticleContainer::InitParticles                  1    0.02198    0.02295    0.02385   0.02%
Hipace::SolveOneSlice()                              2000    0.01953    0.02035    0.02119   0.02%
FabArray::setVal()                                      6    0.01205    0.01301    0.01553   0.01%
AdvanceBeamParticlesSlice()                          2000      0.014    0.01462    0.01537   0.01%
Hipace::InitData()                                      1   0.005058    0.01138    0.01527   0.01%
DepositCurrentSlice_BeamParticleContainer()          4000   0.009407       0.01    0.01233   0.01%
Hipace::ExplicitMGSolveBxBy()                        2000   0.008403   0.008898   0.009568   0.01%
shiftSlippedParticles()                              2000   0.006138   0.006637   0.006949   0.01%
FFTPoissonSolverDirichlet::define()                     1   0.005005   0.005372    0.00589   0.01%
Hipace::Evolve()                                        1   0.001697   0.003172   0.005573   0.00%
BeamParticleContainer::InitBeamFixedWeight3D()          1  1.042e-06  0.0005359   0.004278   0.00%
sortBeamParticlesByBox()                                0          0  0.0005204   0.004163   0.00%
BeamParticleContainer::InitBeamFixedWeightSlice()     250          0   0.000514   0.004112   0.00%
FabArray::FillBoundary()                             8000   0.002637   0.002907   0.003079   0.00%
BeamParticleContainer::resize()                      5801   0.002049   0.002752    0.00295   0.00%
Fields::SetBoundaryCondition()                       6000   0.002234   0.002387   0.002592   0.00%
FillBoundary_nowait()                                8000   0.002006   0.002186   0.002437   0.00%
MultiLaser::InitSliceEnvelope()                       250          0   0.000243   0.001944   0.00%
BeamParticleContainer::intializeSlice()               250          0  0.0002217   0.001774   0.00%
FabArrayBase::getFB()                                8000    0.00136   0.001486   0.001577   0.00%
PlasmaParticleContainer::IonizationModule()          2000   0.001057   0.001114    0.00122   0.00%
FillBoundary_finish()                                8000   0.001039   0.001099   0.001203   0.00%
BeamParticleContainer::ReorderParticles()            2000  0.0008475  0.0008951  0.0009854   0.00%
GridCurrent::DepositCurrentSlice()                   2000  0.0008023  0.0008713  0.0009802   0.00%
PlasmaParticleContainer::ReorderParticles()          2000  0.0007612  0.0008351  0.0008909   0.00%
MultiBuffer::get_time()                                 0          0  0.0001982  0.0004981   0.00%
MultiLaser::InitData()                                  1  0.0003231  0.0003306  0.0003394   0.00%
Fields::AllocData()                                     1  8.261e-05  0.0001152  0.0001396   0.00%
MultiPlasma::InitData()                                 1  1.641e-05  1.927e-05  3.256e-05   0.00%
FabArrayBase::FB::FB()                                  1  2.546e-05  2.997e-05  3.217e-05   0.00%
MultiBuffer::put_time()                                 0          0  1.434e-05  2.809e-05   0.00%
Diagnostic::ResizeFDiagFAB()                            1   5.04e-06  5.324e-06  5.691e-06   0.00%
Hipace::ResetAllQuantities()                            1  1.393e-06  1.576e-06  1.964e-06   0.00%
Hipace::ResetLaser()                                    1    6.6e-07  9.558e-07  1.323e-06   0.00%
ParticleContainer::clearParticles()                     1   3.21e-07  3.955e-07   6.32e-07   0.00%
--------------------------------------------------------------------------------------------------
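
To see which profiling regions differ between the two tables, the region names can be compared mechanically. The following is a minimal sketch, not part of this PR: it assumes the table text has been copied into Python strings, and the helper name `parse_profiler_table` is hypothetical. The excerpts reuse a few rows from the tables above.

```python
def parse_profiler_table(text):
    """Return {region name: row} for rows shaped like the tables above.

    Hypothetical helper: assumes each data row ends with five columns
    (NCalls, Excl. Min, Excl. Avg, Excl. Max, Max %).
    """
    rows = {}
    for line in text.splitlines():
        # Split off the five trailing columns; the rest is the region name.
        parts = line.rsplit(None, 5)
        if len(parts) != 6 or not parts[5].endswith("%"):
            continue  # blank line or dashed separator
        name, ncalls, tmin, tavg, tmax, pct = parts
        try:
            rows[name.strip()] = {
                "ncalls": int(ncalls),
                "excl_min": float(tmin),
                "excl_avg": float(tavg),
                "excl_max": float(tmax),
                "max_pct": float(pct.rstrip("%")),
            }
        except ValueError:
            continue  # header line: its columns are not numbers
    return rows


# Excerpts copied from the "New" and "Old" tables above.
new_excerpt = """
Name                                               NCalls  Excl. Min  Excl. Avg  Excl. Max   Max %
--------------------------------------------------------------------------------------------------
AnyDST::Execute()                                   12000      18.69       19.2      20.32  17.45%
AnyDST::CreatePlan()                                    1      0.648      0.651     0.6528   0.56%
"""

old_excerpt = """
Name                                               NCalls  Excl. Min  Excl. Avg  Excl. Max   Max %
--------------------------------------------------------------------------------------------------
AnyDST::Execute()                                   12000       17.3      17.81      18.91  16.23%
AnyDST::CreatePlan()                                    1     0.6487     0.6506     0.6522   0.56%
AnyDST::C2Rfft()                                    24000     0.4175      0.476      0.504   0.43%
AnyDST::Transpose()                                 24000     0.4655     0.4724     0.4875   0.42%
"""

new_rows = parse_profiler_table(new_excerpt)
old_rows = parse_profiler_table(old_excerpt)

# Regions present in the old output but absent from the new one.
removed = sorted(set(old_rows) - set(new_rows))
print(removed)
```

Names are compared as exact strings, so a region whose label changed only in spelling (for example `PlasmaParticleContainer::InitParticles` without `()` in the old table) would be reported as removed when the full tables are compared.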

MaxThevenet commented 7 months ago

Thanks for this PR! Is the clean-up specifically for GPU? Some functions may be negligible or predictable on GPU, but significantly different on CPU.

AlexanderSinn commented 7 months ago

Good point. Looking at the profiling regions that were removed for CPU: