"You can see that we have a run time of around 1 h 28 min without parallelization and disabled file io and a run time of just 3 min 39 sec with parallelization and enabled file io. This results in a speedup of "
[ ] What about the other core / thread counts?
The task asked for runs with up to 72 threads, and for a look at what happens when more threads than cores are used.
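For the missing thread counts, a small helper along these lines could be used to report speedup and parallel efficiency for every configuration. This is only a sketch: the function names are hypothetical, and the only timings used are the two quoted above (rounded, so the result is approximate). Note that those two runs also differ in file I/O, so the ratio is not a clean parallel speedup.

```python
def to_seconds(hours=0, minutes=0, seconds=0):
    """Convert a wall-clock duration to seconds."""
    return hours * 3600 + minutes * 60 + seconds


def speedup(t_serial, t_parallel):
    """Speedup S = T_serial / T_parallel."""
    if t_serial <= 0 or t_parallel <= 0:
        raise ValueError("run times must be positive")
    return t_serial / t_parallel


def efficiency(t_serial, t_parallel, threads):
    """Parallel efficiency E = S / p for p threads."""
    if threads < 1:
        raise ValueError("thread count must be at least 1")
    return speedup(t_serial, t_parallel) / threads


# Timings quoted in the report (approximate).
t_serial = to_seconds(hours=1, minutes=28)       # 5280 s, no parallelization, file I/O disabled
t_parallel = to_seconds(minutes=3, seconds=39)   # 219 s, parallelized, file I/O enabled

print(f"speedup: {speedup(t_serial, t_parallel):.1f}")  # prints "speedup: 24.1"
```

Calling `efficiency(t_serial, t_p, p)` for each measured thread count `p` (including counts above the number of cores) would give the table the task asks for.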
"This could explain why the third pass is faster than the first, where one more socket is used. Therefore, some communication must take place via NUMA during the first run when writing to a file. In the fourth configuration, half of the cores are used, which reduces the parallel performance and increases the runtime."
[ ] Did you compare an implementation with NUMA-aware inits and one without?
" Utilizing the AMReX library and following the outlined steps, this project aims to achieve these goals within a four-week timeframe."
[ ] You are certainly on the ambitious side of things.