Extended Description
In this example (https://godbolt.org/z/W4WMPr5xq) the variable x is shared between all the threads by writing its pointer to a global value that is read by all the threads. This should be legal according to OpenMP, but when the variable is placed directly inside the parallel region, rather than inside a function that is called in parallel, it is not globalized. When I compile and run the first version on my nvptx64 machine I get the following:

$ clang++ version1.cpp -fopenmp-targets=nvptx64 -fopenmp
$ ./a.out
Thread 0: 1
Thread 1: 1
Thread 2: 1
...
Thread 125: 1
Thread 126: 1
Thread 127: 1

The second version, where x is directly in the parallel region, gives me this:

$ clang++ version1.cpp -fopenmp-targets=nvptx64 -fopenmp
$ ./a.out
Thread 0: 0
Thread 1: 1
Thread 2: 2
...
Thread 125: 125
Thread 126: 126
Thread 127: 127

A call to __kmpc_alloc_shared is not inserted for the variable x in the second version, so the value cannot be shared between the threads.
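
For readers who cannot open the godbolt link, the sharing pattern described above can be sketched roughly as follows. This is not the code from the link: it is a host-only approximation (no target region), and the names published, read_published_x, share_from_function and share_inline, as well as the value 100, are made up for illustration. On the host both variants behave the same, because every thread's stack lives in ordinary shared memory; the report is that on nvptx64 only the function-based variant gets x globalized.

```cpp
#include <cassert>
#include <vector>

#ifdef _OPENMP
#include <omp.h>
#else
// Serial stand-in so the sketch still builds without -fopenmp.
static int omp_get_thread_num() { return 0; }
#endif

// Hypothetical global slot through which thread 0 publishes the address of its x.
static int *published = nullptr;

// Variant 1: x is a local of a function that is called from the parallel region.
static int read_published_x() {
  int x = 100 + omp_get_thread_num();
  if (omp_get_thread_num() == 0)
    published = &x;
  #pragma omp barrier  // the pointer is visible to the whole team after this
  assert(published != nullptr);
  int seen = *published;
  #pragma omp barrier  // keep thread 0's x alive until every thread has read it
  return seen;
}

static std::vector<int> share_from_function(int n) {
  std::vector<int> seen(n, -1);  // -1 marks threads the runtime did not create
  int *out = seen.data();
  #pragma omp parallel num_threads(n)
  {
    out[omp_get_thread_num()] = read_published_x();
  }
  return seen;
}

// Variant 2: x is declared directly inside the parallel region.
static std::vector<int> share_inline(int n) {
  std::vector<int> seen(n, -1);
  int *out = seen.data();
  #pragma omp parallel num_threads(n)
  {
    int x = 100 + omp_get_thread_num();
    if (omp_get_thread_num() == 0)
      published = &x;
    #pragma omp barrier
    out[omp_get_thread_num()] = *published;
    #pragma omp barrier
  }
  return seen;
}
```

Calling share_from_function(4) and share_inline(4) on the host should give vectors in which every thread that actually ran recorded thread 0's value of x. The behavior reported here corresponds to the second variant instead recording each thread's own x when the region is offloaded.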