Closed: pohlan closed this 3 years ago
Since the `^` operations are so expensive (see discussion in #18), we decided to use an array for `d_eff` so that it is calculated only once per grid point (which is not the case in this branch). This branch can therefore be put aside.
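The array approach mentioned above can be sketched roughly as follows. This is a plain CPU illustration, not the SheetModel.jl kernel: the power-law form of `d_eff_point`, the constants `K`, `ALPHA`, `BETA`, `EPS`, and the helpers `gradnorm` and `fill_d_eff!` are all assumptions made for the example. The point is only that every `^` is evaluated once per grid point and later reads are array lookups.

```julia
# Assumed coefficients; the real values and formula are defined in SheetModel.jl.
const K, ALPHA, BETA, EPS = 1.0, 1.25, 1.5, 1e-10

# Gradient magnitude of ϕ from central differences, with indices clamped at the edges.
function gradnorm(ϕ, ix, iy, dx, dy)
    nx, ny = size(ϕ)
    dϕdx = (ϕ[min(ix + 1, nx), iy] - ϕ[max(ix - 1, 1), iy]) / (2dx)
    dϕdy = (ϕ[ix, min(iy + 1, ny)] - ϕ[ix, max(iy - 1, 1)]) / (2dy)
    return sqrt(dϕdx^2 + dϕdy^2)
end

# Hypothetical effective diffusivity: two `^` operations per evaluation.
d_eff_point(h, ϕ, ix, iy, dx, dy) =
    K * h[ix, iy]^ALPHA * (gradnorm(ϕ, ix, iy, dx, dy) + EPS)^(BETA - 2)

# Evaluate d_eff exactly once per grid point and store it in an array.
function fill_d_eff!(d_eff, h, ϕ, dx, dy)
    for iy in axes(d_eff, 2), ix in axes(d_eff, 1)
        d_eff[ix, iy] = d_eff_point(h, ϕ, ix, iy, dx, dy)
    end
    return d_eff
end

nx, ny = 8, 6
h = fill(0.1, nx, ny)
ϕ = [Float64(ix + 2iy) for ix in 1:nx, iy in 1:ny]
d_eff = fill_d_eff!(zeros(nx, ny), h, ϕ, 1.0, 1.0)
```

The trade-off is one extra array in memory, read by every flux evaluation, in exchange for removing the repeated power operations.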
Alternative to #14 without introducing another array
▶️ `$ julia --project -O3 --check-bounds=no examples/run_SHMIP.jl` with `nx = ny = 1024` (`T_eff = 3.70 GB/s` in main)

What is the difference to the main branch?
In the main branch version, `d_eff` has to be calculated nine times at each grid point in each iteration (cf. `dτ_ϕ`).
Five out of those nine `d_eff` values correspond to the grid point (ix, iy) where `ϕ` is being updated, so the same calculation is done five times. The other `d_eff` values correspond to the neighbouring points (ix-1, iy), (ix+1, iy), (ix, iy-1), (ix, iy+1), each of which is evaluated only once by a given thread (though they are also calculated multiple times across the neighbouring threads).

Here I calculate the central `d_eff` at (ix, iy) as a scalar in each grid point to at least partly reduce the redundancy: https://github.com/pohlan/SheetModel.jl/blob/b2616d580fa61721b2211510b9b859e02fcd9177/src/modelonly.jl#L163-L166

This requires the definition of two different `flux_x` and `flux_y` macros: https://github.com/pohlan/SheetModel.jl/blob/b2616d580fa61721b2211510b9b859e02fcd9177/src/modelonly.jl#L86-L91
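To make the counting concrete, here is a minimal CPU sketch of the two variants. It is not the code behind the links: `d_eff_at` is a placeholder power law, `face_flux` assumes an arithmetic mean of the two adjacent `d_eff` values and unit grid spacing, and the `dτ` expression is invented for illustration. Only the call pattern (nine evaluations versus one central scalar plus four neighbours) follows the description above.

```julia
# Counter for evaluations of the (assumed expensive) d_eff formula.
const NCALLS = Ref(0)

# Hypothetical d_eff: placeholder power law, not the SheetModel.jl formula.
function d_eff_at(h, ϕ, ix, iy)
    NCALLS[] += 1
    return h[ix, iy]^1.25 * (abs(ϕ[ix, iy]) + 1e-10)^(-0.5)
end

# Flux across the face between points a and b (assumed averaging, unit spacing).
face_flux(da, db, ϕa, ϕb) = -0.5 * (da + db) * (ϕb - ϕa)

# Main-branch style: each face recomputes both d_eff values, dτ recomputes the centre.
function update_naive(h, ϕ, ix, iy)
    qxm = face_flux(d_eff_at(h, ϕ, ix - 1, iy), d_eff_at(h, ϕ, ix, iy), ϕ[ix - 1, iy], ϕ[ix, iy])
    qxp = face_flux(d_eff_at(h, ϕ, ix, iy), d_eff_at(h, ϕ, ix + 1, iy), ϕ[ix, iy], ϕ[ix + 1, iy])
    qym = face_flux(d_eff_at(h, ϕ, ix, iy - 1), d_eff_at(h, ϕ, ix, iy), ϕ[ix, iy - 1], ϕ[ix, iy])
    qyp = face_flux(d_eff_at(h, ϕ, ix, iy), d_eff_at(h, ϕ, ix, iy + 1), ϕ[ix, iy], ϕ[ix, iy + 1])
    dτ = 1 / (4 * d_eff_at(h, ϕ, ix, iy))      # placeholder pseudo-time step
    return -(qxp - qxm + qyp - qym), dτ
end

# This branch's idea: compute the central d_eff once as a scalar and reuse it.
function update_scalar(h, ϕ, ix, iy)
    dc = d_eff_at(h, ϕ, ix, iy)                # central value, evaluated once
    qxm = face_flux(d_eff_at(h, ϕ, ix - 1, iy), dc, ϕ[ix - 1, iy], ϕ[ix, iy])
    qxp = face_flux(dc, d_eff_at(h, ϕ, ix + 1, iy), ϕ[ix, iy], ϕ[ix + 1, iy])
    qym = face_flux(d_eff_at(h, ϕ, ix, iy - 1), dc, ϕ[ix, iy - 1], ϕ[ix, iy])
    qyp = face_flux(dc, d_eff_at(h, ϕ, ix, iy + 1), ϕ[ix, iy], ϕ[ix, iy + 1])
    dτ = 1 / (4 * dc)
    return -(qxp - qxm + qyp - qym), dτ
end

h = fill(0.1, 5, 5)
ϕ = [1.0 + ix + 2iy for ix in 1:5, iy in 1:5]
NCALLS[] = 0; r_naive = update_naive(h, ϕ, 3, 3);  n_naive = NCALLS[]
NCALLS[] = 0; r_scalar = update_scalar(h, ϕ, 3, 3); n_scalar = NCALLS[]
println((n_naive, n_scalar))
```

In this sketch the naive variant evaluates `d_eff_at` nine times and the scalar variant five times, with identical results. The central value sits on the left of a face for one flux and on the right for the other, which is presumably why two flavours of each flux macro are needed.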