llvm / llvm-project

The LLVM Project is a collection of modular and reusable compiler and toolchain technologies.
http://llvm.org

Variable not globalized on the device when nested inside parallel region. #50439

Open jhuber6 opened 3 years ago

jhuber6 commented 3 years ago
Bugzilla Link: 51095
Version: trunk
OS: Linux
CC: @jdoerfert

Extended Description

In this example (https://godbolt.org/z/W4WMPr5xq), the variable x is shared between all the threads by writing its address to a global pointer that every thread then reads. This should be legal according to OpenMP, but when the variable is declared directly inside the parallel region, rather than inside a function that is called in parallel, it is not globalized. When I compile and run the first version on my nvptx64 machine I get the following:

$ clang++ version1.cpp -fopenmp-targets=nvptx64 -fopenmp
$ ./a.out

Thread 0: 1
Thread 1: 1
Thread 2: 1
...
Thread 125: 1
Thread 126: 1
Thread 127: 1

The second version where x is directly in the parallel region gives me this:

$ clang++ version1.cpp -fopenmp-targets=nvptx64 -fopenmp
$ ./a.out

Thread 0: 0
Thread 1: 1
Thread 2: 2
...
Thread 125: 125
Thread 126: 126
Thread 127: 127

A call to __kmpc_alloc_shared is not inserted for the variable x in the second version, leading to the value not being sharable between the threads.
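For readers who cannot open the Godbolt link, the pattern described in this report can be sketched roughly as follows. This is not the original reproducer: the names published, thread_id, check_in_function, mismatches_version1 and mismatches_version2 are made up for illustration, and the sketch counts mismatching threads instead of printing per-thread values, so its results will not match the output quoted above.

```cpp
#ifdef _OPENMP
#include <omp.h>
#endif

#pragma omp declare target
// Hypothetical global through which thread 0 publishes the address of its local x.
int *published = nullptr;

// Thread id inside a parallel region; 0 when built without OpenMP.
static int thread_id() {
#ifdef _OPENMP
  return omp_get_thread_num();
#else
  return 0;
#endif
}

// Version 1: x is a local of a function that is called from the parallel region.
// Returns 1 if the calling thread did not observe thread 0's x (which holds 0).
static int check_in_function() {
  int x = thread_id();
  if (thread_id() == 0)
    published = &x;        // thread 0 shares the address of its x
#pragma omp barrier        // make the pointer visible to every thread
  int seen = *published;   // every thread reads through the shared pointer
#pragma omp barrier        // keep x alive until all reads are done
  return seen != 0;
}
#pragma omp end declare target

int mismatches_version1() {
  int bad = 0;
#pragma omp target parallel map(tofrom : bad) reduction(+ : bad)
  bad += check_in_function();
  return bad;
}

// Version 2: the same logic, but x is declared directly in the parallel region.
int mismatches_version2() {
  int bad = 0;
#pragma omp target parallel map(tofrom : bad) reduction(+ : bad)
  {
    int x = thread_id();
    if (thread_id() == 0)
      published = &x;
#pragma omp barrier
    int seen = *published;
#pragma omp barrier
    bad += (seen != 0);
  }
  return bad;
}
```

If x is really shared, both functions should return 0, since every thread reads thread 0's copy. On a device build affected by this issue, the second function would be expected to return a nonzero count, because each thread ends up reading its own thread-local x.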

jhuber6 commented 3 years ago

assigned to @alexey-bataev

jhuber6 commented 1 year ago

This is still failing on main.