Currently there is no support for hierarchical GPU parallelism. There is a ForCollapse method in the CPU API, so it makes sense to add ForCollapse for the upcoming GPU API as an elegant way to support hierarchical parallelism.
The following code should implement a hierarchical matmul:
DotMP.GPU.DataTo(b);
DotMP.GPU.DataToFrom(a);
DotMP.GPU.ParallelForCollapse((0, M), (0, N), (i, j, b, a) =>
{
    // Each (i, j) pair computes one output element.
    int total = 0;
    for (int k = 0; k < K; k++)
        total += a[i, k] * b[k, j];
    a[i, j] = total;
});
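For reference, here is a CPU-only sketch of what "collapsing" the two loops could mean: the (i, j) iteration space is flattened into a single range and each flat index is mapped back to a pair. This is only an illustration under assumptions, not the proposed GPU implementation. `CollapseSketch`, `ForCollapse2D`, and `MatMul` are hypothetical names that are not part of DotMP, and `Parallel.For` from the .NET standard library stands in for the GPU dispatch.

```csharp
using System;
using System.Threading.Tasks;

public static class CollapseSketch
{
    // Hypothetical helper, not part of DotMP: runs body(i, j) over the
    // collapsed iteration space [rangeI.start, rangeI.end) x [rangeJ.start, rangeJ.end).
    public static void ForCollapse2D((int, int) rangeI, (int, int) rangeJ, Action<int, int> body)
    {
        int lenI = rangeI.Item2 - rangeI.Item1;
        int lenJ = rangeJ.Item2 - rangeJ.Item1;
        if (lenI <= 0 || lenJ <= 0) return;

        // One flat range of lenI * lenJ iterations (assumes the product fits in an int).
        Parallel.For(0, lenI * lenJ, flat =>
        {
            int i = rangeI.Item1 + flat / lenJ; // row index recovered from the flat index
            int j = rangeJ.Item1 + flat % lenJ; // column index recovered from the flat index
            body(i, j);
        });
    }

    // Matrix product written against the helper above. The result goes into a
    // separate array c, so no iteration reads an element another one writes.
    public static int[,] MatMul(int[,] a, int[,] b)
    {
        int M = a.GetLength(0), K = a.GetLength(1), N = b.GetLength(1);
        var c = new int[M, N];
        ForCollapse2D((0, M), (0, N), (i, j) =>
        {
            int total = 0;
            for (int k = 0; k < K; k++)
                total += a[i, k] * b[k, j];
            c[i, j] = total;
        });
        return c;
    }
}
```

Calling `CollapseSketch.MatMul(a, b)` with two `int[,]` arrays returns their product. A GPU version would presumably map the flat index onto the device's thread and block layout instead of `Parallel.For`, but that mapping is left open here.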
Additional context
Self-assigning this issue.