Each GPU needs to know the model-space coordinates of its threads.
For instance, if a boundary condition (for simplicity, assume a Dirichlet one) has to be applied, the following code cannot work unless each GPU knows its global offset, i.e. which part of the model it is actually responsible for:
calclThreadCheck2D();
int i = calclGlobalRow()+borderSize;
int j = calclGlobalColumn();
....
if(i > 1 && i < ROWS-1 && j > 1 && j < COLS-1)
    applyBoundaryConditions();
When this code is executed on more than one GPU, the boundary conditions cannot be applied correctly, because i is always in the range [0, W(g)), with W(g) < ROWS, where W(g) is the workload assigned to GPU number g. So the boundary condition on the topmost border would be applied by ALL GPUs, and the one on the bottommost border would never be applied.
The offset corresponding to the model-space coordinate of the first row that the GPU is responsible for has to be passed to the kernels:
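The failure described above can be reproduced on the host with a small sketch. It is only an illustration under stated assumptions: the helper name count_border_hits_local and the BorderHits struct are hypothetical, borderSize is ignored, every device is assumed to own at least one row, and the border test is simplified to "row 0" and "row ROWS-1" rather than the exact condition used in the kernel.

```c
#include <assert.h>

/* Hypothetical result type: how many devices believe they own each border. */
typedef struct {
    int top_hits;    /* devices that see a row equal to 0        */
    int bottom_hits; /* devices that see a row equal to rows - 1 */
} BorderHits;

/* Hypothetical helper: walks the LOCAL row indices of every device and
   counts border matches. workloads[g] plays the role of W(g) in the text. */
BorderHits count_border_hits_local(const int *workloads, int num_devices, int rows)
{
    assert(workloads != 0 && num_devices > 0);
    BorderHits hits = {0, 0};
    for (int g = 0; g < num_devices; ++g) {
        for (int i = 0; i < workloads[g]; ++i) {
            if (i == 0)
                hits.top_hits++;     /* every device has a local row 0 */
            if (i == rows - 1)
                hits.bottom_hits++;  /* unreachable while W(g) < rows  */
        }
    }
    return hits;
}
```

With a model of 10 rows split as 4, 3 and 3 rows over three devices, the top border is matched once per device and the bottom border is never matched.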
calclThreadCheck2D();
int i = calclGlobalRow()+borderSize;
int j = calclGlobalColumn();
int modelspace_i = computeGlobalcoordinate(i, gpu_ith_offset);
....
if(modelspace_i > 1 && modelspace_i < ROWS-1 && j > 1 && j < COLS-1)
    applyBoundaryConditions();
Note that the implementation of the computeGlobalcoordinate function is trivial: (i+offset), i.e. local coordinate + model-space offset.
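A minimal host-side sketch of the offset idea follows. computeGlobalcoordinate is written exactly as described (local coordinate + offset). The helpers compute_device_offsets and is_border_row are hypothetical names introduced only for illustration. The sketch assumes the model is split into contiguous blocks of rows, with device 0 owning the first rows, so each device's offset is the sum of the workloads of the devices before it. Treat that as an assumption, not a statement of how the library partitions the model.

```c
#include <assert.h>

/* Local row + model-space offset of the device, as described in the text. */
int computeGlobalcoordinate(int i, int offset)
{
    return i + offset;
}

/* Hypothetical helper: offsets[g] is the model-space row of the first row
   owned by device g, assuming contiguous row blocks in device order. */
void compute_device_offsets(const int *workloads, int num_devices, int *offsets)
{
    assert(workloads != 0 && offsets != 0 && num_devices > 0);
    int next_row = 0;
    for (int g = 0; g < num_devices; ++g) {
        offsets[g] = next_row;      /* first model-space row of device g */
        next_row += workloads[g];   /* rows consumed so far              */
    }
}

/* Hypothetical helper: returns 1 if the model-space row lies on the
   topmost or bottommost border of a model with `rows` rows, 0 otherwise. */
int is_border_row(int modelspace_i, int rows)
{
    return modelspace_i == 0 || modelspace_i == rows - 1;
}
```

For a 10-row model split as 4, 3 and 3 rows, the offsets are 0, 4 and 7. Local row 0 is then a border row only on the first device, and local row 2 of the last device maps to model-space row 9, the bottom border.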