lattice / quda

QUDA is a library for performing calculations in lattice QCD on GPUs.
https://lattice.github.io/quda

host omp parallel for loop variable #1483

Closed jxy closed 3 months ago

jxy commented 3 months ago

I got an error when compiling this code with Intel's icpx compiler: https://github.com/lattice/quda/blob/d199bd36a7024f24c28ad007d540e35aa850b27e/include/targets/generic/block_reduction_kernel_host.h#L8-L11

The error is

error: initialization clause of OpenMP for loop is not in canonical form ('var = init' or 'T var = init')
    9 |     for (block.y = 0; block.y < arg.grid_dim.y; block.y++) {
      |          ^~~~~~~~~~~

I'm not entirely sure whether the code deviates from the standard or if this is a quirk in Intel’s OpenMP implementation. While I can work around the issue easily, I hope raising it here will attract insights from those more knowledgeable.

maddyscientist commented 3 months ago

When I wrote this code, I thought I stuck to the standard. I'm not an OpenMP ninja though.

@jxy feel free to file a PR with a patch that keeps Intel OpenMP happy if you like.

jxy commented 3 months ago

I think this actually creates a race condition, because the threads are all updating the same block variable. I'll open a PR later today that just moves the block creation inside the loop.
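
For reference, here is a minimal sketch of that kind of change. It is not the actual QUDA code or the eventual patch. `Dim3`, `Arg`, `run_blocks` and the inner loop over `block.x` are hypothetical stand-ins made up for illustration. The idea is to use a plain integer declared in the loop's init clause (the 'T var = init' form the compiler asks for) and to construct `block` inside the loop body, so each iteration works on its own copy.

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-ins for illustration only; these are not QUDA's types.
struct Dim3 {
  unsigned int x, y, z;
};

struct Arg {
  Dim3 grid_dim;
};

// Counts how often each (x, y) block is handled, so the result can be checked.
std::vector<unsigned int> run_blocks(const Arg &arg)
{
  std::vector<unsigned int> visited(arg.grid_dim.x * arg.grid_dim.y, 0);

  // The loop variable is a plain integer declared in the init clause,
  // so the loop is in canonical form and the variable is private per thread.
#pragma omp parallel for
  for (unsigned int by = 0; by < arg.grid_dim.y; by++) {
    // block is created inside the loop body: every iteration gets its own
    // copy, and no thread writes to an object shared with other threads.
    Dim3 block = {0, by, 0};
    for (block.x = 0; block.x < arg.grid_dim.x; block.x++) {
      // Each (x, y) pair maps to a distinct element, so these writes do not overlap.
      visited[block.y * arg.grid_dim.x + block.x] += 1;
    }
  }

  // Sanity check: every block was handled exactly once.
  for (unsigned int count : visited) assert(count == 1);
  return visited;
}
```

Built without OpenMP enabled, the pragma is ignored and the loop runs serially with the same result. An unsigned loop variable assumes OpenMP 3.0 or later.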