Closed by angus-g 1 year ago
On the 800x160x75 tub example, here are the timings for 10 iterations of regridding/remapping at various thread counts on my 16-core (32-thread) workstation. The total time covers the whole routine as called from Python, so it includes a bit of extra setup. I've also verified that the answers reproduce the single-threaded (pre-OpenMP) results.
| threads | regridding | remapping | total |
|---|---|---|---|
| 1 | 17.38 | 10.34 | 27.9 |
| 2 | 9.46 | 5.28 | 15.0 |
| 4 | 6.15 | 3.14 | 9.45 |
| 8 | 4.32 | 1.73 | 6.21 |
| 16 | 4.48 | 1.14 | 5.78 |
| 32 | 4.11 | 0.72 | 5.01 |
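
To make the scaling easier to read, here is a small sketch that turns the timings above into speedup (single-thread time divided by N-thread time) and parallel efficiency (speedup divided by N). The numbers are copied straight from the table; the helper names `speedup` and `efficiency` are just illustrative and not part of the module.

```python
# Timings copied from the table above: (regridding, remapping, total)
# for each thread count, in the units reported there.
COLUMNS = ("regridding", "remapping", "total")
timings = {
    1: (17.38, 10.34, 27.9),
    2: (9.46, 5.28, 15.0),
    4: (6.15, 3.14, 9.45),
    8: (4.32, 1.73, 6.21),
    16: (4.48, 1.14, 5.78),
    32: (4.11, 0.72, 5.01),
}


def speedup(timings, column):
    """Speedup relative to the single-thread run for one column index."""
    base = timings[1][column]
    return {n: base / t[column] for n, t in timings.items()}


def efficiency(timings, column):
    """Parallel efficiency: speedup divided by the thread count."""
    return {n: s / n for n, s in speedup(timings, column).items()}


# Print one line per thread count with speedup and efficiency per column.
for n in timings:
    cells = []
    for i, name in enumerate(COLUMNS):
        s = speedup(timings, i)[n]
        e = efficiency(timings, i)[n]
        cells.append(f"{name}: {s:5.2f}x ({e:4.0%})")
    print(f"{n:2d} threads  " + "  ".join(cells))
```

Read this way, regridding levels off at roughly 4x from 8 threads onward, while remapping keeps scaling to about 14x at 32 threads. Note that the 32-thread row uses hardware threads on 16 physical cores.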
@AndyHoggANU I have updated the module on Gadi if you want to test any of this (it also includes the ability to restart from a previous ALE run).
For a fairly large domain, single-core regridding/remapping is a bit slow. We could probably at least use the available cores without yet going for the complexity of MPI.