ROCm / rocprofiler-sdk

MIT License

[Feature]: filtering #4

Open maartenarnst opened 2 months ago

maartenarnst commented 2 months ago

Suggestion Description

The doc for rocprofv3 gives a clear overview of the features. However, it appears that "filtering" is not, or is no longer, a feature. By filtering, I mean restricting the profiling to, for example, a kernel with a specific name, a region identified with roctx markers, or a region identified with start/stop markers. Is support for such filtering not planned for rocprofv3, or is it planned for a later stage?

Operating System

No response

GPU

No response

ROCm Component

No response

jrmadsen commented 2 months ago

Hi @maartenarnst, if you use the new rocprofiler-sdk-roctx, there are two new ROCTx functions: roctxProfilerPause and roctxProfilerResume. Calling roctxProfilerPause completely shuts off data collection of everything in rocprofv3 until roctxProfilerResume is called.

Sample

#include <hip/hip_runtime.h>
#include <rocprofiler-sdk-roctx/roctx.h>

__global__ void
transpose(const int* in, int* out, int M, int N);

int main()
{
    auto tid = roctx_thread_id_t{};
    // get the thread id recognized by rocprofiler-sdk from roctx
    roctxGetThreadId(&tid);

    for(size_t i = 0; i < 100; ++i)
    {
        transpose<<<256, 64>>>(...);
        if(i == 0)
        {
            // the first kernel will be profiled; all subsequent kernel launches
            // and/or API calls will not be collected
            // NOTE: there is no need for any device/stream "sync" here
            roctxProfilerPause(tid);
        }
    }
    hipDeviceSynchronize();

    return 0;
}
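
As a rough illustration of how the pause/resume pair could approximate the "start/stop marker" filtering asked about above, here is a minimal sketch of a scoped guard that resumes collection on entry and pauses it on exit. Everything in it is hypothetical: `ProfilerControl`, `ProfiledRegion`, and `run_demo` are names invented for this example, `thread_id_t` is a stand-in that assumes roctx_thread_id_t is an integer id, and the backend only records calls in a log so the snippet can be run without ROCm installed.

```cpp
#include <cassert>
#include <cstdint>
#include <functional>
#include <string>
#include <vector>

// Stand-in for roctx_thread_id_t (assumption: an integer id, as filled in by
// roctxGetThreadId in the sample above).
using thread_id_t = std::uint64_t;

// Hypothetical indirection over the two ROCTx calls, so this sketch can run
// against a recording backend instead of the real library.
struct ProfilerControl
{
    std::function<void(thread_id_t)> pause;
    std::function<void(thread_id_t)> resume;
};

// Hypothetical RAII guard: resumes collection on construction and pauses it
// again on destruction, so only work issued inside the scope is collected.
class ProfiledRegion
{
public:
    ProfiledRegion(const ProfilerControl& ctl, thread_id_t tid)
    : ctl_(ctl), tid_(tid)
    {
        assert(ctl_.pause && ctl_.resume);  // both callables must be set
        ctl_.resume(tid_);
    }
    ~ProfiledRegion() { ctl_.pause(tid_); }

    ProfiledRegion(const ProfiledRegion&)            = delete;
    ProfiledRegion& operator=(const ProfiledRegion&) = delete;

private:
    const ProfilerControl& ctl_;
    thread_id_t            tid_;
};

// Demo: collect only the iteration `profiled_iteration` of a launch loop.
// The "kernel" entries stand in for real kernel launches.
std::vector<std::string> run_demo(int iterations, int profiled_iteration)
{
    std::vector<std::string> log;
    ProfilerControl ctl{[&log](thread_id_t) { log.push_back("pause"); },
                        [&log](thread_id_t) { log.push_back("resume"); }};
    thread_id_t tid = 0;

    ctl.pause(tid);  // start with collection switched off
    for(int i = 0; i < iterations; ++i)
    {
        if(i == profiled_iteration)
        {
            ProfiledRegion region(ctl, tid);  // resume here, pause at scope exit
            log.push_back("kernel " + std::to_string(i) + " (collected)");
        }
        else
        {
            log.push_back("kernel " + std::to_string(i));
        }
    }
    return log;
}
```

With the real library, the two callables would presumably just forward to roctxProfilerPause and roctxProfilerResume with the id obtained from roctxGetThreadId. This has not been tested against rocprofv3, so treat it as a pattern rather than a verified recipe.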

You are welcome to test out this feature. Anything command-line based will probably be supported eventually. However, we have rewritten the rocprofiler library from scratch, so rocprofv3 is a secondary focus while this foundational work is still under development: it is hard to build a full-featured tool when the library that the tool is built on top of is still being written and designed.