llvm / llvm-project

The LLVM Project is a collection of modular and reusable compiler and toolchain technologies.
http://llvm.org
Other
28.65k stars 11.84k forks source link

[OpenMP][OMPT] Controlling a tool via `omp_control_tool` might incorrectly return that there is no tool #112924

Open Thyre opened 5 days ago

Thyre commented 5 days ago

The OpenMP interface offers a simple function to communicate with an attached tool, which can be just information dumping, sanitizer like Archer or performance tools like Score-P and TAU. While playing around with omp_control_tool and its counterpart ompt_callback_control_tool I noticed that the runtime might return omp_control_tool_notool, even though a tool was just initialized. This mainly affects offloading constructs and calling omp_control_tool as its first action.

My testing was unfortunately only done with LLVM 18.1.8. With LLVM trunk and an AMD GPU, trying to run any OpenMP example with offload fails with an error message. I'm building LLVM 19.1.2 right now to test against.

Here's a simple reproducer:

#include <omp.h>
#include <omp-tools.h>
#include <stdio.h>

/* TOOL CODE */

int
callback_control_tool( uint64_t    command,
                       uint64_t    modifier,
                       void*       arg,
                       const void* codeptr_ra )
{
    const char* command_name;

    switch ( command )
    {
        case omp_control_tool_start:
        {
            command_name = "omp_control_tool_start";
            break;
        }
        case omp_control_tool_flush:
        {
            command_name = "omp_control_tool_flush";
            break;
        }
        case omp_control_tool_pause:
        {
            command_name = "omp_control_tool_pause";
            break;
        }
        case omp_control_tool_end:
        {
            command_name = "omp_control_tool_end";
            break;
        }
        default:
            command_name = NULL;
            break;
    }

    printf( "[%s] command = %lu | modifier = %lu | arg = %p | codeptr_ra = %p\n",
            __FUNCTION__,
            command,
            modifier,
            arg,
            codeptr_ra );

    if( command_name )
    {
        return omp_control_tool_success;
    }
    return omp_control_tool_ignored;
}

int
tool_initialize( ompt_function_lookup_t lookup,
                 int                    initial_device_num,
                 ompt_data_t*           tool_data )
{
    ompt_set_callback_t set_callback = ( ompt_set_callback_t )lookup( "ompt_set_callback" );  
    if( !set_callback )
    {
        return 0;
    }
    ompt_set_result_t result = set_callback( ompt_callback_control_tool, (ompt_callback_t)&callback_control_tool );
    if( result != ompt_set_always )
    {
        return 0;
    }

    printf( "[%s] Successfully initialized tool.\n", __FUNCTION__ );
    return 1; /* non-zero indicates success */
}

void
tool_finalize( ompt_data_t* tool_data )
{
    printf( "[%s] Finalized tool.\n", __FUNCTION__ );
}

ompt_start_tool_result_t *
ompt_start_tool( unsigned int omp_version,
                 const char*  runtime_version )
{
    static ompt_start_tool_result_t result = { &tool_initialize, &tool_finalize, {} };
    return &result;
}

/* USER CODE */

/* If the OMPT interface state is inactive, the OpenMP implementation returns
 * omp_control_tool_notool. If the OMPT interface state is active, but no callback is
 * registered for the tool-control event, the OpenMP implementation returns
 * omp_control_tool_nocallback. An OpenMP implementation may return other
 * implementation-defined negative values strictly smaller than -64; an application may assume that
 * any negative return value indicates that a tool has not received the command. A return value of
 * omp_control_tool_success indicates that the tool has performed the specified command. A
 * return value of omp_control_tool_ignored indicates that the tool has ignored the specified
 * command. A tool may return other positive values strictly greater than 64 that are tool-defined.*/
void
print_tool_status( int controlToolReturnVal )
{
    switch( controlToolReturnVal )
    {
        case omp_control_tool_notool:
            printf("Tried to control tool. Result = no tool\n");
            break;
        case omp_control_tool_nocallback:
            printf("Tried to control tool. Result = no callback\n");
            break;
        case omp_control_tool_success:
            printf("Tried to control tool. Result = success\n");
            break;
        case omp_control_tool_ignored:
            printf("Tried to control tool. Result = ignored\n");
            break;
    }
}

int
main( void )
{
    #ifdef TARGET_REGION
    #pragma omp target teams num_teams( 2 )
    {
        printf( "Hello World from accelerator team %d\n", omp_get_team_num() );
    }
    #endif

    #ifdef PARALLEL_REGION
    #pragma omp parallel num_threads( 2 )
    {
        #pragma omp critical
        printf( "Hello World from thread %d\n", omp_get_thread_num() );
    }
    #endif

    print_tool_status( omp_control_tool( omp_control_tool_start, 1, NULL ) );
    print_tool_status( omp_control_tool( omp_control_tool_flush, 2, NULL ) );
    print_tool_status( omp_control_tool( omp_control_tool_pause, 3, NULL ) );
    print_tool_status( omp_control_tool( omp_control_tool_end, 4, NULL ) );

    return 0;
}

The program tries to call omp_control_tool with the four specified options. Additional ones can be defined by a tool. The user code prints the result of omp_control_tool, the tool prints which command was used. Via defines, one can enable a target region and a parallel region.

Without any OpenMP directives

With LLVM 18.1.8, I get these results:

$ clang -fopenmp reproducer.c && ./a.out
[tool_initialize] Successfully initialized tool.
Tried to control tool. Result = no tool
Tried to control tool. Result = no tool
Tried to control tool. Result = no tool
Tried to control tool. Result = no tool
[tool_finalize] Finalized tool.

This is broken. Even though a tool was initialized, the user code never gets that information. Once the first parallel region was created we finally get the information that a tool is attached and working.

int
main( void )
{
    #ifdef TARGET_REGION
    #pragma omp target teams num_teams( 2 )
    {
        printf( "Hello World from accelerator team %d\n", omp_get_team_num() );
    }
    #endif

    #ifdef PARALLEL_REGION
    #pragma omp parallel num_threads( 2 )
    {
        #pragma omp critical
        printf( "Hello World from thread %d\n", omp_get_thread_num() );
    }
    #endif

    print_tool_status( omp_control_tool( omp_control_tool_start, 1, NULL ) );
    print_tool_status( omp_control_tool( omp_control_tool_flush, 2, NULL ) );
    print_tool_status( omp_control_tool( omp_control_tool_pause, 3, NULL ) );
    print_tool_status( omp_control_tool( omp_control_tool_end, 4, NULL ) );

    #pragma omp parallel
    {}

    print_tool_status( omp_control_tool( omp_control_tool_end, 4, NULL ) );
    return 0;
}
[tool_initialize] Successfully initialized tool.
Tried to control tool. Result = no tool
Tried to control tool. Result = no tool
Tried to control tool. Result = no tool
Tried to control tool. Result = no tool
[callback_control_tool] command = 4 | modifier = 4 | arg = (nil) | codeptr_ra = 0x590ddad43468
Tried to control tool. Result = success
[tool_finalize] Finalized tool.

With a parallel region before omp_control_tool

$ clang -fopenmp reproducer.c -DPARALLEL_REGION && ./a.out
[tool_initialize] Successfully initialized tool.
Hello World from thread 0
Hello World from thread 1
[callback_control_tool] command = 1 | modifier = 1 | arg = (nil) | codeptr_ra = 0x562a65200475
Tried to control tool. Result = success
[callback_control_tool] command = 3 | modifier = 2 | arg = (nil) | codeptr_ra = 0x562a6520048f
Tried to control tool. Result = success
[callback_control_tool] command = 2 | modifier = 3 | arg = (nil) | codeptr_ra = 0x562a652004a9
Tried to control tool. Result = success
[callback_control_tool] command = 4 | modifier = 4 | arg = (nil) | codeptr_ra = 0x562a652004c0
Tried to control tool. Result = success
[tool_finalize] Finalized tool.

A parallel region gives the results one would expect.

With a target region before omp_control_tool

$ clang -fopenmp --offload-arch=sm_80 omp_control_tool_incorrect_value.c -DTARGET_REGION && ./a.out
[tool_initialize] Successfully initialized tool.
Hello World from accelerator team 1
Hello World from accelerator team 0
Tried to control tool. Result = no tool
Tried to control tool. Result = no tool
Tried to control tool. Result = no tool
Tried to control tool. Result = no tool
[tool_finalize] Finalized tool.

A target region is not sufficient for omp_control_tool to succeed. Similar to the first case, the first parallel region will cause the call to work correctly.

llvmbot commented 5 days ago

@llvm/issue-subscribers-openmp

Author: Jan André Reuter (Thyre)

The OpenMP interface offers a simple function to communicate with an attached tool, which can be just information dumping, sanitizer like Archer or performance tools like Score-P and TAU. While playing around with `omp_control_tool` and its counterpart `ompt_control_look` I noticed that the runtime might return `omp_control_tool_notool`, even though a tool was just initialized. This mainly affects offloading constructs and calling `omp_control_tool` as its first action. My testing was unfortunately only done with LLVM 18.1.8. With LLVM trunk and an AMD GPU, trying to run any OpenMP example with offload fails with an error message. I'm building LLVM 19.1.2 right now to test against. Here's a simple reproducer: ```c #include <omp.h> #include <omp-tools.h> #include <stdio.h> /* TOOL CODE */ int callback_control_tool( uint64_t command, uint64_t modifier, void* arg, const void* codeptr_ra ) { const char* command_name; switch ( command ) { case omp_control_tool_start: { command_name = "omp_control_tool_start"; break; } case omp_control_tool_flush: { command_name = "omp_control_tool_flush"; break; } case omp_control_tool_pause: { command_name = "omp_control_tool_pause"; break; } case omp_control_tool_end: { command_name = "omp_control_tool_end"; break; } default: command_name = NULL; break; } printf( "[%s] command = %lu | modifier = %lu | arg = %p | codeptr_ra = %p\n", __FUNCTION__, command, modifier, arg, codeptr_ra ); if( command_name ) { return omp_control_tool_success; } return omp_control_tool_ignored; } int tool_initialize( ompt_function_lookup_t lookup, int initial_device_num, ompt_data_t* tool_data ) { ompt_set_callback_t set_callback = ( ompt_set_callback_t )lookup( "ompt_set_callback" ); if( !set_callback ) { return 0; } ompt_set_result_t result = set_callback( ompt_callback_control_tool, (ompt_callback_t)&callback_control_tool ); if( result != ompt_set_always ) { return 0; } printf( "[%s] Successfully initialized tool.\n", __FUNCTION__ ); return 1; /* non-zero indicates success */ } void tool_finalize( ompt_data_t* tool_data ) { printf( "[%s] Finalized tool.\n", __FUNCTION__ ); } ompt_start_tool_result_t * ompt_start_tool( unsigned int omp_version, const char* runtime_version ) { static ompt_start_tool_result_t result = { &tool_initialize, &tool_finalize, {} }; return &result; } /* USER CODE */ /* If the OMPT interface state is inactive, the OpenMP implementation returns * omp_control_tool_notool. If the OMPT interface state is active, but no callback is * registered for the tool-control event, the OpenMP implementation returns * omp_control_tool_nocallback. An OpenMP implementation may return other * implementation-defined negative values strictly smaller than -64; an application may assume that * any negative return value indicates that a tool has not received the command. A return value of * omp_control_tool_success indicates that the tool has performed the specified command. A * return value of omp_control_tool_ignored indicates that the tool has ignored the specified * command. A tool may return other positive values strictly greater than 64 that are tool-defined.*/ void print_tool_status( int controlToolReturnVal ) { switch( controlToolReturnVal ) { case omp_control_tool_notool: printf("Tried to control tool. Result = no tool\n"); break; case omp_control_tool_nocallback: printf("Tried to control tool. Result = no callback\n"); break; case omp_control_tool_success: printf("Tried to control tool. Result = success\n"); break; case omp_control_tool_ignored: printf("Tried to control tool. Result = ignored\n"); break; } } int main( void ) { #ifdef TARGET_REGION #pragma omp target teams num_teams( 2 ) { printf( "Hello World from accelerator team %d\n", omp_get_team_num() ); } #endif #ifdef PARALLEL_REGION #pragma omp parallel num_threads( 2 ) { #pragma omp critical printf( "Hello World from thread %d\n", omp_get_thread_num() ); } #endif print_tool_status( omp_control_tool( omp_control_tool_start, 1, NULL ) ); print_tool_status( omp_control_tool( omp_control_tool_flush, 2, NULL ) ); print_tool_status( omp_control_tool( omp_control_tool_pause, 3, NULL ) ); print_tool_status( omp_control_tool( omp_control_tool_end, 4, NULL ) ); return 0; } ``` The program tries to call `omp_control_tool` with the four specified options. Additional ones can be defined by a tool. The user code prints the result of `omp_control_tool`, the tool prints which command was used. Via defines, one can enable a target region and a parallel region. **Without any OpenMP directives** With LLVM 18.1.8, I get these results: ```console $ clang -fopenmp reproducer.c && ./a.out [tool_initialize] Successfully initialized tool. Tried to control tool. Result = no tool Tried to control tool. Result = no tool Tried to control tool. Result = no tool Tried to control tool. Result = no tool [tool_finalize] Finalized tool. ``` This is broken. Even though a tool was initialized, the user code never gets that information. Once the first parallel region was created we finally get the information that a tool is attached and working. ```c int main( void ) { #ifdef TARGET_REGION #pragma omp target teams num_teams( 2 ) { printf( "Hello World from accelerator team %d\n", omp_get_team_num() ); } #endif #ifdef PARALLEL_REGION #pragma omp parallel num_threads( 2 ) { #pragma omp critical printf( "Hello World from thread %d\n", omp_get_thread_num() ); } #endif print_tool_status( omp_control_tool( omp_control_tool_start, 1, NULL ) ); print_tool_status( omp_control_tool( omp_control_tool_flush, 2, NULL ) ); print_tool_status( omp_control_tool( omp_control_tool_pause, 3, NULL ) ); print_tool_status( omp_control_tool( omp_control_tool_end, 4, NULL ) ); #pragma omp parallel {} print_tool_status( omp_control_tool( omp_control_tool_end, 4, NULL ) ); return 0; } ``` ```console [tool_initialize] Successfully initialized tool. Tried to control tool. Result = no tool Tried to control tool. Result = no tool Tried to control tool. Result = no tool Tried to control tool. Result = no tool [callback_control_tool] command = 4 | modifier = 4 | arg = (nil) | codeptr_ra = 0x590ddad43468 Tried to control tool. Result = success [tool_finalize] Finalized tool. ``` **With a parallel region before `omp_control_tool`** ```console $ clang -fopenmp reproducer.c -DPARALLEL_REGION && ./a.out [tool_initialize] Successfully initialized tool. Hello World from thread 0 Hello World from thread 1 [callback_control_tool] command = 1 | modifier = 1 | arg = (nil) | codeptr_ra = 0x562a65200475 Tried to control tool. Result = success [callback_control_tool] command = 3 | modifier = 2 | arg = (nil) | codeptr_ra = 0x562a6520048f Tried to control tool. Result = success [callback_control_tool] command = 2 | modifier = 3 | arg = (nil) | codeptr_ra = 0x562a652004a9 Tried to control tool. Result = success [callback_control_tool] command = 4 | modifier = 4 | arg = (nil) | codeptr_ra = 0x562a652004c0 Tried to control tool. Result = success [tool_finalize] Finalized tool. ``` A parallel region gives the results one would expect. **With a target region before `omp_control_tool`** ```console $ clang -fopenmp --offload-arch=sm_80 omp_control_tool_incorrect_value.c -DTARGET_REGION && ./a.out [tool_initialize] Successfully initialized tool. Hello World from accelerator team 1 Hello World from accelerator team 0 Tried to control tool. Result = no tool Tried to control tool. Result = no tool Tried to control tool. Result = no tool Tried to control tool. Result = no tool [tool_finalize] Finalized tool. ``` A target region is not sufficient for `omp_control_tool` to succeed. Similar to the first case, the first parallel region will cause the call to work correctly.
Thyre commented 5 days ago

The results match with LLVM 19.1.2. Like mentioned, I'm currently unable to test LLVM/Clang trunk on my workstation due to the following error when trying to run offloaded code:

: CommandLine Error: Option 'abort-on-max-devirt-iterations-reached' registered more than once!
LLVM ERROR: inconsistency in registered CommandLine options
[1]    3101877 IOT instruction (core dumped)  ./a.out