NVIDIA / grcuda

Polyglot CUDA integration for the GraalVM

[BUG] Fortran-order option ineffective for MultiDimDeviceArray #25

Closed by muellren 4 years ago

muellren commented 4 years ago

When a multi-dimensional device array is created in Fortran order (column-major) with the F option, as in DeviceArray("float", 7, 13, 'F'), the option has no effect: the data is still laid out in C order (row-major), exactly as with the C option or when no order option is given.

Reproducer:

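const n = 4
const CU = Polyglot.eval('grcuda', 'CU')
// Request a column-major (Fortran-order) n x n array.
const matrixA = CU.DeviceArray('float', n, n, 'F')
// Store in each element [i][j] its expected column-major linear index, i + n * j.
for (let i = 0; i < n; i += 1) {
  for (let j = 0; j < n; j += 1) {
    matrixA[i][j] = i + n * j
  }
}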

const code = `
__global__ void kernel(const float *arr) {
  int idx = threadIdx.x + blockIdx.x * blockDim.x;
  while (idx < ${n} * ${n}) {
    printf("%d %f\\n", idx, arr[idx]);
    idx += blockDim.x * gridDim.x;
  }
}
`
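// The kernel prints the array in linear memory order, which exposes the actual layout.
const kernel = CU.buildkernel(code, 'kernel', 'pointer')
// Launch with 1 block of 32 threads.
kernel(1, 32)(matrixA)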

Actual output (row-major layout):

0 0.000000
1 4.000000
2 8.000000
3 12.000000
4 1.000000
5 5.000000
6 9.000000
7 13.000000
8 2.000000
9 6.000000
10 10.000000
11 14.000000
12 3.000000
13 7.000000
14 11.000000
15 15.000000

Expected output (column-major layout, where each value equals its linear index):

0 0.000000
1 1.000000
2 2.000000
3 3.000000
4 4.000000
5 5.000000
6 6.000000
7 7.000000
8 8.000000
9 9.000000
10 10.000000
11 11.000000
12 12.000000
13 13.000000
14 14.000000
15 15.000000
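
The difference between the two listings can be reproduced without a GPU. The following is a minimal sketch in plain JavaScript that fills a flat buffer with the same pattern as the reproducer (`i + n * j` at logical position `[i][j]`) under each memory order. The helper `fillFlat` is hypothetical and not part of GrCUDA; it only illustrates the standard offset formulas, `i * n + j` for row-major and `i + n * j` for column-major.

```javascript
// Hypothetical helper, not part of GrCUDA: fills a flat n*n buffer the way the
// reproducer does (value i + n * j at logical position [i][j]), using the
// given memory order: 'C' (row-major) or 'F' (column-major).
function fillFlat(n, order) {
  const flat = new Float32Array(n * n)
  for (let i = 0; i < n; i += 1) {
    for (let j = 0; j < n; j += 1) {
      // Row-major keeps each row contiguous; column-major keeps each column contiguous.
      const offset = order === 'F' ? i + n * j : i * n + j
      flat[offset] = i + n * j
    }
  }
  return Array.from(flat)
}

const rowMajor = fillFlat(4, 'C')
const colMajor = fillFlat(4, 'F')

// Matches the values in the actual output above.
console.log(rowMajor.join(' ')) // 0 4 8 12 1 5 9 13 2 6 10 14 3 7 11 15
// Matches the values in the expected output above.
console.log(colMajor.join(' ')) // 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15
```

The row-major sequence matches what the kernel printed, which is consistent with the F option being ignored when the array was created.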

muellren commented 4 years ago

Fixed in PR #26