Closed AlexanderSinn closed 7 months ago
Thanks for this PR! Is the cleaning specifically for GPU? Some functions may be negligible/predictable on GPU, but significantly different on CPU.
Good point. Looking at the profiling regions that were removed for CPU:
- Hipace::ResetAllQuantities() and Hipace::ResetLaser() are rarely called, OpenMP parallelized, and profiled by AMReX through FabArray::setVal().
- Fields::LinCombination() and Fields::Multiply() are OpenMP parallelized and would now be profiled by the function that uses them, e.g. Fields::SolvePoissonPsiExmByEypBxEzBz().
- FFTPoissonSolverPeriodic::define() etc. is replaced by AnyFFT::CreatePlan(), since creating the FFT plan is the slow part on CPU when initializing the Poisson solver.
- AnyDST::ExpandR2R() etc. is not used on CPU.
- MultiPlasma::InitData() is already profiled by the inner function PlasmaParticleContainer::InitParticles().
- MultiBuffer::get_time() and MultiBuffer::put_time() never take significant time on GPU runs, since they are doing MPI communication unrelated to GPU, so I don’t see this being different for CPU simulations.
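To illustrate the point about helpers being "profiled by the function that uses them", here is a minimal, self-contained sketch of a scoped (RAII) profiling region. It is not the AMReX TinyProfiler or HiPACE++ code: `ScopedRegion`, `registry()`, `multiply_helper()` and `solve_outer()` are hypothetical names made up for this example. It only shows that a helper without its own region has its run time counted in the inclusive time of the enclosing region.

```cpp
#include <chrono>
#include <map>
#include <string>
#include <thread>
#include <utility>

// Hypothetical stand-in for a profiler table: region name -> call count and inclusive time.
struct RegionStats {
    int calls = 0;
    double seconds = 0.0;
};

std::map<std::string, RegionStats>& registry() {
    static std::map<std::string, RegionStats> table;
    return table;
}

// RAII region: timing starts at construction and is recorded at scope exit.
class ScopedRegion {
public:
    explicit ScopedRegion(std::string name)
        : m_name(std::move(name)), m_start(std::chrono::steady_clock::now()) {}

    ~ScopedRegion() {
        const std::chrono::duration<double> dt =
            std::chrono::steady_clock::now() - m_start;
        RegionStats& stats = registry()[m_name];
        ++stats.calls;
        stats.seconds += dt.count();
    }

    ScopedRegion(const ScopedRegion&) = delete;
    ScopedRegion& operator=(const ScopedRegion&) = delete;

private:
    std::string m_name;
    std::chrono::steady_clock::time_point m_start;
};

// Small helper with predictable cost and no profiling region of its own
// (the sleep is a placeholder for real work).
void multiply_helper() {
    std::this_thread::sleep_for(std::chrono::milliseconds(5));
}

// The caller owns the only region, so the helper's time is part of its inclusive time.
void solve_outer() {
    ScopedRegion region("solve_outer");
    multiply_helper();
    multiply_helper();
}
```

After one call to `solve_outer()`, the table contains a single entry, `solve_outer`, whose time includes both helper calls; no separate `multiply_helper` row appears. The trade-off is a shorter profiler output at the cost of per-helper resolution.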
This PR cleans up the TinyProfiler by:
- Removing HIPACE_DETAIL_PROFILE, as it is usually not used. The functions it was profiling have very predictable performance. Additionally, a critical flaw was that the detailed profile regions would still show up in the output, with incorrect timing, when it was not used.

[Images: New vs. Old comparison]