llvm / llvm-project

The LLVM Project is a collection of modular and reusable compiler and toolchain technologies.
http://llvm.org

Some application variables in OpenMP outlined regions are marked artificial #107125

Open jprotze opened 2 months ago

jprotze commented 2 months ago

Debuggers use the DW_AT_artificial attribute to identify compiler-generated variables. If the attribute is used consistently, the debugger can show application variables while hiding artificial ones. In outlined functions, however, clang marks all local variables as artificial, including application variables:

#include <omp.h>
#include <stdio.h>

int main(int argc, char **argv) {
  const int n = 100 * argc;
  double a[n], total=42., c = .3;
#pragma omp parallel for reduction(+ : total) 
  for (int i = 0; i < n; i++) {
    total += a[i] = i * c;
  }
  printf("total=%lf, expected:%lf, a[50]=%lf\n", total, c * n * (n - 1) / 2, a[50]);
}

compiled as:

clang -g -fopenmp test-dwarf.c
llvm-dwarfdump

provides this output:

0x0000009e:   DW_TAG_subprogram
                DW_AT_name      ("main.omp_outlined_debug__")
...
0x00000102:     DW_TAG_variable
                  DW_AT_location        (DW_OP_fbreg -68)
                  DW_AT_name    ("i")
                  DW_AT_type    (0x000001a6 "int")
                  DW_AT_artificial      (true)
...
0x00000134:     DW_TAG_variable
                  DW_AT_location        (DW_OP_fbreg -96)
                  DW_AT_name    ("total")
                  DW_AT_type    (0x000001bd "double")
                  DW_AT_artificial      (true)

i is clearly a local variable of the application, and there is no reason to mark it artificial. total is also an application variable and should not be marked artificial. Other variables, such as a and c, are not marked artificial.
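To illustrate why consistent marking matters, here is a minimal sketch of the kind of filtering a debugger front end could apply. It is not how gdb or TotalView are implemented; `struct var_entry` and `visible_vars` are hypothetical names invented for this example, and the flag simply stands in for DW_AT_artificial.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified view of one DW_TAG_variable entry. */
struct var_entry {
    const char *name;  /* value of DW_AT_name */
    bool artificial;   /* true if DW_AT_artificial is present and true */
};

/*
 * Collect pointers to all entries that are NOT marked artificial.
 * `out` must have room for at least `n` pointers.
 * Returns the number of entries written to `out`.
 */
size_t visible_vars(const struct var_entry *vars, size_t n,
                    const struct var_entry **out) {
    assert(out != NULL);
    size_t count = 0;
    for (size_t k = 0; k < n; k++) {
        if (!vars[k].artificial) {
            out[count++] = &vars[k];
        }
    }
    return count;
}
```

With such a filter, an entry list in which only compiler temporaries like .omp.iv or .omp.ub carry the flag would leave i and total visible. With the DWARF shown above, where i and total carry the flag as well, they are filtered out together with the temporaries.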

llvmbot commented 2 months ago

@llvm/issue-subscribers-debuginfo

jmorse commented 1 month ago

I'm not an OpenMP expert but took a look; adding -emit-llvm -S -Xclang -disable-llvm-passes shows that these variables are already marked artificial when clang creates them, which suggests that this isn't occurring by accident and that clang/OpenMP probably intends to produce artificial variables.

Could you elaborate on the use case for not marking these variables artificial (they have after all been transformed) -- i.e. is there a particular debugger that won't show you relevant information, that can't otherwise be recovered? (This'll clarify whether there are mismatched expectations or whether it's a bug).

jprotze commented 1 month ago

With a breakpoint set in the body of the for-loop of my code example, gdb reports all local variables, including all compiler-generated variables:

$ OMP_NUM_THREADS=4 gdb ./a.out
(gdb) start
(gdb) break 11 thread 1
(gdb) c
(gdb) c
(gdb) c
Thread 1 "a.out" hit Breakpoint 2, main.omp_outlined_debug__ (.global_tid.=0x7ffffffe1280, .bound_tid.=0x7ffffffe1278, total=@0x7ffffffe19c0: 42, n=@0x7ffffffe19dc: 100, vla=100, 
    a=@0x7ffffffe1690: 0, c=@0x7ffffffe19b8: 0.29999999999999999) at test-dwarf.c:11
11      total += a[i] = i * c;
(gdb) info locals 
.omp.iv = 4
.capture_expr. = 100
i = 0
.omp.ub = 24
.omp.stride = 100
.omp.lb = 0
.omp.is_last = 0
total = 1.7999999999999998

Even this small test case introduces many compiler-generated variables, and gdb reports all of them. Notably, gdb prints the thread-local value of total. The output also reveals a few more problems that I plan to report in separate issues.

TotalView by default does not show variables marked as artificial, which lets the user focus on the actual application variables. At the same breakpoint as above, TotalView shows only some of the function parameters (which are the only variables not marked as artificial):

double& total=42
const int& n=0x64=100
double& a=0
double& c=0.3

The types don't match the application semantics, and I think that with the right DWARF information the debugger should be able to display the right type (I'll create a separate issue for this).

From a programmer's perspective, I'd like to be able to see the local values of the variables according to OpenMP semantics:

double total=0.9
int i=3
double a[100]=...

According to OpenMP semantics, total is private to each thread, zero-initialized, and holds the thread's partial sum (see the gdb output). Because the variable is marked artificial, TotalView hides it. The iteration variable i is also private to each thread. In the generated code, its value is actually stored in .omp.iv, so the DWARF information should point the local instance of i to the location of .omp.iv (I'll create a separate issue for this). The array a is shared between the threads. With the right type information for a (see the parameter discussion above), the whole array should be accessible for display rather than just the first element.
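The semantics described above can be sketched in plain C, without OpenMP, as a sequential stand-in for the parallel region. This is only an illustration, not the code clang generates: the helper names (`static_chunk`, `thread_partial_sum`, `simulate_reduction`) are made up for this example, and the even, contiguous split of iterations is an assumption about the schedule. For n = 100 and 4 threads it yields bounds 0 and 24 for the first thread, consistent with the .omp.lb and .omp.ub values in the gdb output.

```c
#include <assert.h>

/* Inclusive iteration bounds of one thread's chunk (cf. .omp.lb / .omp.ub). */
struct chunk {
    int lb;
    int ub;
};

/*
 * Hypothetical helper: split iterations 0..n-1 into nthreads contiguous
 * chunks. ASSUMPTION: a static, evenly divided schedule; the real runtime
 * may distribute iterations differently.
 */
struct chunk static_chunk(int n, int nthreads, int tid) {
    assert(nthreads > 0 && tid >= 0 && tid < nthreads);
    int base = n / nthreads;
    int extra = n % nthreads;
    int lb = tid * base + (tid < extra ? tid : extra);
    int size = base + (tid < extra ? 1 : 0);
    struct chunk ch = { lb, lb + size - 1 };
    return ch;
}

/*
 * Work of one thread: `total` here is the thread's private reduction copy,
 * initialized to 0 (the identity of +). The array `a` is shared, so every
 * thread writes into the same storage.
 */
double thread_partial_sum(double *a, double c, struct chunk ch) {
    double total = 0.0;
    for (int i = ch.lb; i <= ch.ub; i++) {
        total += a[i] = i * c;
    }
    return total;
}

/*
 * Sequential stand-in for the whole region: each thread's partial sum is
 * combined into the original (shared) total at the end.
 */
double simulate_reduction(double *a, int n, double c, double initial_total,
                          int nthreads) {
    double total = initial_total;
    for (int tid = 0; tid < nthreads; tid++) {
        total += thread_partial_sum(a, c, static_chunk(n, nthreads, tid));
    }
    return total;
}
```

In this model, the private copy of the first thread after iterations 0 through 2 is approximately 0.9, which is the value a programmer would expect to see for total when stopped at i = 3. The original value 42 only enters when the partial sums are combined.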

@alexey-bataev suggested filing this issue, and he can probably comment more from the perspective of the OpenMP codegen.