llvm / llvm-project

[OMPT] Segfault in clang's ompt_callback system #55073

Open SpinTensor opened 2 years ago

SpinTensor commented 2 years ago

Hi,

please consider the following test code:

#include <stdio.h>
#include <omp.h>
#include <omp-tools.h>

static void my_ompt_callback_implicit_task(ompt_scope_endpoint_t endpoint,
                                           ompt_data_t *parallel_data,
                                           ompt_data_t *task_data,
                                           unsigned int actual_parallelism,
                                           unsigned int index,
                                           int flags) {
   switch (endpoint) {
      case ompt_scope_begin:
         fprintf(stderr, "implicit_task begin:    t=%d l=%d\n",
                 omp_get_thread_num(), omp_get_level());
         break;
      case ompt_scope_end:
         fprintf(stderr, "implicit_task end:      t=%d l=%d\n",
                 omp_get_thread_num(), omp_get_level());
         break;
      case ompt_scope_beginend:
         fprintf(stderr, "implicit_task beginend: t=%d l=%d\n",
                 omp_get_thread_num(), omp_get_level());
         break;
      default:
         break;
   }
}

// initialize callbacks
int my_ompt_initialize(ompt_function_lookup_t lookup, int initial_device_num,
                       ompt_data_t *tool_data) {
   // Get the set_callback function pointer
   ompt_set_callback_t ompt_set_callback = (ompt_set_callback_t)lookup("ompt_set_callback");
   // register the available callback functions
   ompt_callback_implicit_task_t f_ompt_callback_implicit_task = &my_ompt_callback_implicit_task;
   ompt_set_callback(ompt_callback_implicit_task, (ompt_callback_t)f_ompt_callback_implicit_task);

   return 1; // success: activates tool
}
void my_ompt_finalize(ompt_data_t *tool_data) {
   (void) tool_data;
}
// start tool
ompt_start_tool_result_t *ompt_start_tool(unsigned int omp_version,
                                          const char *runtime_version) {
   static ompt_start_tool_result_t ompt_start_tool_result;
   ompt_start_tool_result.initialize = &my_ompt_initialize;
   ompt_start_tool_result.finalize = &my_ompt_finalize;
   return &ompt_start_tool_result; // success: registers tool
}

int main(int argc, char **argv) {
   #pragma omp parallel num_threads(2)
   {
      int threadnum = omp_get_thread_num();
   }

   return 0;
}

The code registers an implicit-task callback through the OMPT callback interface. I compile it with the latest git version of LLVM: clang -fopenmp -o test.x test.c

$ clang --version
clang version 15.0.0 (/home/fuhl/software/llvm_src/clang 34312f1f0c4f56ae78577783ec62bee3fb5dab90)
Target: x86_64-unknown-linux-gnu
Thread model: posix
InstalledDir: /opt/bm/llvm-15.0.0/bin
$ ldd test.x 
    linux-vdso.so.1 =>  (0x00007ffcac3a1000)
    libomp.so => /opt/bm/llvm-15.0.0/lib/libomp.so (0x00002b75f1ab4000)
    libpthread.so.0 => /lib64/libpthread.so.0 (0x00002b75f1dc2000)
    libc.so.6 => /lib64/libc.so.6 (0x00002b75f1fde000)
    librt.so.1 => /lib64/librt.so.1 (0x00002b75f23ac000)
    libdl.so.2 => /lib64/libdl.so.2 (0x00002b75f25b4000)
    /lib64/ld-linux-x86-64.so.2 (0x00002b75f168d000)

When executing the resulting program I get the following output:

$ ./test.x 
implicit_task begin:    t=0 l=0
implicit_task begin:    t=0 l=1
implicit_task begin:    t=1 l=1
implicit_task end:      t=0 l=1
implicit_task end:      t=0 l=0
Segmentation fault (core dumped)

The segfault can be avoided by omitting the omp_get_level() call in the ompt_scope_end case of the implicit task callback.

Thyre commented 1 year ago

I would not recommend calling OpenMP functions inside OMPT callbacks, since the OpenMP specification prohibits it:

Section 19.5 of the OpenMP 5.2 specification states the following restriction:

Tool callbacks may not use OpenMP directives or call any runtime library routines described in Chapter 18.

This includes both omp_get_level() and omp_get_thread_num(). Calling runtime routines at the wrong time can also have side effects. For example, calling omp_get_num_devices() before the initialization of the OMPT interface has finished causes it to return 0.

SpinTensor commented 1 year ago

Thanks for the hint. I was not aware of this restriction. What would be the recommended way of figuring out which thread (level and ID) issued the callback?

Thyre commented 1 year ago

I don't have a perfect answer for that since I'm also just using the OMPT framework. In our software, we assign a unique thread ID when the thread-begin callback is called and store this ID in a thread-local variable. I'm not sure how we're handling the thread level right now.
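
The thread-ID approach described above can be sketched roughly as follows. This is a minimal illustration under assumptions, not the code of the software mentioned in the comment: the names next_tool_thread_id, tool_thread_id, assign_tool_thread_id, on_thread_begin, on_implicit_task, tool_initialize and tool_finalize are hypothetical, and the ID is assigned lazily so that a callback still gets one if it happens to run before the thread-begin callback. The implicit-task callback makes no omp_* calls and instead prints the index argument it receives from the runtime.

```c
#include <stdatomic.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical names throughout; 0 is reserved to mean "no ID assigned yet". */
static atomic_uint_fast64_t next_tool_thread_id = 1;
static _Thread_local uint64_t tool_thread_id = 0;

/* Return the calling thread's tool-assigned ID, assigning one on first use. */
static uint64_t assign_tool_thread_id(void) {
   if (tool_thread_id == 0)
      tool_thread_id = (uint64_t)atomic_fetch_add(&next_tool_thread_id, 1);
   return tool_thread_id;
}

/* Build the OMPT wiring only where <omp-tools.h> is available, so the ID
   helper above can still be compiled and checked on its own. */
#if defined(__has_include)
#if __has_include(<omp-tools.h>)
#include <omp-tools.h>

static void on_thread_begin(ompt_thread_t thread_type, ompt_data_t *thread_data) {
   (void)thread_type;
   /* Also keep the ID in the per-thread data slot provided by the runtime. */
   thread_data->value = assign_tool_thread_id();
}

static void on_implicit_task(ompt_scope_endpoint_t endpoint,
                             ompt_data_t *parallel_data, ompt_data_t *task_data,
                             unsigned int actual_parallelism,
                             unsigned int index, int flags) {
   (void)parallel_data; (void)task_data; (void)flags;
   const char *what = (endpoint == ompt_scope_begin) ? "begin"
                    : (endpoint == ompt_scope_end)   ? "end" : "other";
   /* No omp_* calls here. For implicit tasks of a parallel region, index is
      the thread number within the team; initial tasks are reported differently. */
   fprintf(stderr, "implicit_task %s: id=%llu index=%u actual_parallelism=%u\n",
           what, (unsigned long long)assign_tool_thread_id(), index,
           actual_parallelism);
}

static int tool_initialize(ompt_function_lookup_t lookup, int initial_device_num,
                           ompt_data_t *tool_data) {
   (void)initial_device_num; (void)tool_data;
   ompt_set_callback_t set_callback =
      (ompt_set_callback_t)lookup("ompt_set_callback");
   if (set_callback == NULL)
      return 0; /* zero leaves the tool inactive */
   set_callback(ompt_callback_thread_begin, (ompt_callback_t)on_thread_begin);
   set_callback(ompt_callback_implicit_task, (ompt_callback_t)on_implicit_task);
   return 1; /* non-zero activates the tool */
}

static void tool_finalize(ompt_data_t *tool_data) {
   (void)tool_data;
}

ompt_start_tool_result_t *ompt_start_tool(unsigned int omp_version,
                                          const char *runtime_version) {
   (void)omp_version; (void)runtime_version;
   static ompt_start_tool_result_t result;
   result.initialize = &tool_initialize;
   result.finalize = &tool_finalize;
   return &result;
}
#endif
#endif
```

These definitions could replace the callback and tool-registration functions in the test program from the first comment, with main left unchanged. The sketch does not track the nesting level. One option that may be worth exploring, untested here, is to look up the ompt_get_parallel_info entry point through the lookup function in the same way as ompt_set_callback.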