oneapi-src / level-zero-spec

Support for L0 Tracer Metrics #285

Open matcabral opened 8 months ago

Summary

Extend L0 Metrics support with a new collection paradigm that allows retrieving asynchronous events. The proposed name for this sampling type is "Tracer based metrics".

Details

Motivation

The existing L0 metrics collection modes (streamer and query) are limited to events produced at a fixed rate that is set during configuration. The proposal is therefore to add extension APIs that support events of a different nature, for example asynchronous events.

Interoperability with Other APIs

A new sampling type will be added to https://spec.oneapi.io/level-zero/latest/tools/api.html#zet-metric-group-sampling-type-flags-t to identify which metric groups can be used with the new set of APIs. APIs that are independent of the collection mode (e.g. zetMetricGroupGet(), zetMetricGroupGetProperties(), zetMetricGet()) will work with all metric groups.

Proposed APIs

New Enumerations

ZET_METRIC_SAMPLING_TYPE_EXP_FLAG_TRACER_BASED

Extends the sampling types:

zet_metric_group_sampling_type_flags_t {
...
ZET_METRIC_SAMPLING_TYPE_EXP_FLAG_TRACER_BASED
}
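
As a rough illustration of how a tool could use the new flag to pick compatible metric groups, here is a minimal self-contained sketch. It does not call the Level Zero API: `SamplingTypeFlag`, `MetricGroupInfo` and `selectGroups` are hypothetical stand-ins, and the bit positions are assumptions made for the example, not values from the specification.

```cpp
#include <cstdint>
#include <vector>

// Stand-in for zet_metric_group_sampling_type_flags_t. The bit positions are
// assumptions for this sketch, not values taken from the specification.
enum SamplingTypeFlag : uint32_t {
    SAMPLING_EVENT_BASED      = 1u << 0,
    SAMPLING_TIME_BASED       = 1u << 1,
    SAMPLING_EXP_TRACER_BASED = 1u << 2,  // models ZET_METRIC_SAMPLING_TYPE_EXP_FLAG_TRACER_BASED
};

// Stand-in for the part of the metric group properties that matters here.
struct MetricGroupInfo {
    int      id;            // hypothetical identifier used in place of a handle
    uint32_t samplingType;  // bitwise OR of SamplingTypeFlag values
};

// Return the ids of all groups whose sampling type includes the wanted flag.
std::vector<int> selectGroups(const std::vector<MetricGroupInfo>& groups, uint32_t wanted) {
    std::vector<int> ids;
    for (const MetricGroupInfo& g : groups) {
        if ((g.samplingType & wanted) != 0) {
            ids.push_back(g.id);
        }
    }
    return ids;
}
```

In a real tool the sampling type would presumably come from zetMetricGroupGetProperties(), and only the groups selected this way would be passed to the tracer APIs.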

New Stypes

ZET_STRUCTURE_TYPE_METRIC_TRACER_DESC_EXP

Extends the tools stypes:

zet_structure_type_t {
...
ZET_STRUCTURE_TYPE_METRIC_TRACER_DESC_EXP
}

New Handles

zet_metric_tracer_exp_handle_t

Metric tracer handle

New Structures

zet_metric_tracer_exp_desc_t

zet_metric_tracer_exp_desc_t {
        zet_structure_type_t stype;
        const void *pNext;
        uint32_t notifyEveryNBytes;
}

Attribute Description
stype [in] must be set to ZET_STRUCTURE_TYPE_METRIC_TRACER_DESC_EXP
pNext [in,out][optional] must be null or a pointer to an extension-specific structure (i.e. contains stype and pNext)
notifyEveryNBytes [in,out] number of collected bytes after which the notification event is signaled. If the requested value is not supported exactly, the driver may use the closest supported value and shall update this member during zetMetricTracerCreateExp()
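
To make the [in,out] behavior of notifyEveryNBytes concrete, the sketch below shows one way a driver could pick the closest supported value and write it back. This is only an assumption for illustration: the proposal does not say how supported values are determined. `TracerDesc`, `adjustNotifyThreshold` and the idea of a fixed granularity are all hypothetical.

```cpp
#include <cstdint>

// Stand-in for the proposed descriptor; only the member relevant here.
struct TracerDesc {
    uint32_t notifyEveryNBytes;  // [in,out]
};

// Hypothetical driver-side helper: round the requested threshold to the
// closest non-zero multiple of `granularity` (which must be non-zero) and
// write the result back so the caller can see the value actually in use.
void adjustNotifyThreshold(TracerDesc& desc, uint32_t granularity) {
    const uint64_t requested = desc.notifyEveryNBytes;
    uint64_t multiples = (requested + granularity / 2) / granularity;
    if (multiples == 0) {
        multiples = 1;  // never report a zero threshold
    }
    uint64_t rounded = multiples * granularity;
    if (rounded > UINT32_MAX) {
        rounded -= granularity;  // stay within the range of the member
    }
    desc.notifyEveryNBytes = static_cast<uint32_t>(rounded);
}
```

The caller-side consequence is that an application should re-read notifyEveryNBytes after creating the tracer rather than assume its requested value was accepted.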

New Functions

zetMetricTracerCreateExp

zetMetricTracerCreateExp(
        zet_context_handle_t hContext,
        zet_device_handle_t hDevice,
        uint32_t metricGroupCount,
        zet_metric_group_handle_t *phMetricGroups,
        zet_metric_tracer_exp_desc_t *desc,
        ze_event_handle_t hNotificationEvent,
        zet_metric_tracer_exp_handle_t *phMetricTracer);

Open a metric tracer on a device.

Parameter Description
hContext [in] handle of the context object
hDevice [in] handle of the device
metricGroupCount [in] metric group count
phMetricGroups [in][range(0, metricGroupCount )] handles of the metric groups to trace
desc [in,out] metric tracer descriptor
hNotificationEvent [in][optional] event used for report availability notification. Note: if the buffer is not drained when the event is signaled, there is a risk of the HW event buffer being overrun
phMetricTracer [out] handle of the metric tracer

zetMetricTracerDestroyExp

zetMetricTracerDestroyExp( zet_metric_tracer_exp_handle_t hMetricTracer);

Deletes the metric tracer object

Parameter Description
hMetricTracer [in] handle of the metric tracer

zetMetricTracerEnableExp

zetMetricTracerEnableExp(zet_metric_tracer_exp_handle_t hMetricTracer,  bool synchronous);

Lightweight call that starts event collection.

Parameter Description
hMetricTracer [in] handle of the metric tracer
synchronous [in] request synchronous behavior

zetMetricTracerDisableExp

zetMetricTracerDisableExp( zet_metric_tracer_exp_handle_t hMetricTracer,  bool synchronous);

Lightweight call that stops event collection.

Parameter Description
hMetricTracer [in] handle of the metric tracer
synchronous [in] request synchronous behavior

zetMetricTracerReadDataExp

zetMetricTracerReadDataExp(zet_metric_tracer_exp_handle_t hMetricTracer, size_t *pRawDataSize, uint8_t *pRawData);

Reads data from metric tracer

Parameter Description
hMetricTracer [in] handle of the metric tracer
pRawDataSize [in,out] pointer to the size in bytes of the raw data requested to be read. If the size is zero, the driver will update the value with the total size in bytes needed for all available data. If the size is non-zero, the driver will only retrieve the amount of data that fits into the buffer. If the size is larger than the size needed for all available data, the driver will update the value with the actual size needed
pRawData [in,out][optional][range(0, *pRawDataSize)] buffer containing tracer events in raw format
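
The size rules for pRawDataSize can be modeled with a small mock, which may help when reasoning about the two-call pattern (size query, then read). This is a sketch under stated assumptions, not driver code: `MockTracer` and `mockReadData` are hypothetical, a null pRawData is treated like a size query, and reads are assumed to consume the returned bytes.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <vector>

// Stand-in for the data a tracer has collected; not part of the proposed API.
struct MockTracer {
    std::vector<uint8_t> pending;
};

// Models the size rules described for zetMetricTracerReadDataExp.
void mockReadData(MockTracer& tracer, size_t* pRawDataSize, uint8_t* pRawData) {
    const size_t available = tracer.pending.size();
    if (*pRawDataSize == 0 || pRawData == nullptr) {
        // Size query: report how many bytes are available, consume nothing.
        *pRawDataSize = available;
        return;
    }
    // Copy only what fits and report the number of bytes actually written.
    const size_t n = std::min(*pRawDataSize, available);
    std::copy_n(tracer.pending.begin(), n, pRawData);
    // Assumption: data handed to the caller is removed from the tracer.
    tracer.pending.erase(tracer.pending.begin(),
                         tracer.pending.begin() + static_cast<std::ptrdiff_t>(n));
    *pRawDataSize = n;
}
```

With five bytes pending, a call with size 0 reports 5, a call with a 2-byte buffer returns 2 bytes and leaves 3 pending, and a call with an 8-byte buffer returns the remaining 3 and sets the size to 3.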

Usage Example


    zet_metric_group_handle_t     hMetricGroup           = nullptr;
    ze_event_handle_t            hNotificationEvent     = nullptr;
    ze_event_pool_handle_t       hEventPool             = nullptr;
    ze_event_pool_desc_t         eventPoolDesc          = {ZE_STRUCTURE_TYPE_EVENT_POOL_DESC, nullptr, 0, 1};
    ze_event_desc_t              eventDesc              = {ZE_STRUCTURE_TYPE_EVENT_DESC};
    zet_metric_tracer_exp_handle_t hMetricTracer;

    // Find a metric group suitable for tracer-based collection.
    // FindMetricGroup is an application-provided helper, not a Level Zero API.

    FindMetricGroup( hDevice,  ZET_METRIC_SAMPLING_TYPE_EXP_FLAG_TRACER_BASED, &hMetricGroup );

    // Configure the HW

    zetContextActivateMetricGroups( hContext, hDevice, /* count= */ 1, &hMetricGroup );

    // Create notification event

    zeEventPoolCreate( hContext, &eventPoolDesc, 1, &hDevice, &hEventPool );
    eventDesc.index  = 0;
    eventDesc.signal = ZE_EVENT_SCOPE_FLAG_HOST;
    eventDesc.wait   = ZE_EVENT_SCOPE_FLAG_HOST;
    zeEventCreate( hEventPool, &eventDesc, &hNotificationEvent );

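    // Create tracer

    zet_metric_tracer_exp_desc_t tracerDescriptor = {
        ZET_STRUCTURE_TYPE_METRIC_TRACER_DESC_EXP,
        nullptr,
        1024 /* notifyEveryNBytes */ };

    // phMetricGroups expects a pointer to an array of handles, so pass the address of the handle
    zetMetricTracerCreateExp( hContext, hDevice, /* metricGroupCount= */ 1, &hMetricGroup, &tracerDescriptor, hNotificationEvent, &hMetricTracer );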

    // Enable the tracer

    zetMetricTracerEnableExp(hMetricTracer, true);

    // Run workload 

    workload(hDevice);

    // Wait for data, optional

    zeEventHostSynchronize( hNotificationEvent, 1000 /*timeout*/ );

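    // First call queries the size of the available data, second call reads it
    size_t rawDataSize = 0;
    zetMetricTracerReadDataExp(hMetricTracer, &rawDataSize, nullptr);
    uint8_t* rawData = static_cast<uint8_t*>( malloc(rawDataSize) );
    zetMetricTracerReadDataExp(hMetricTracer, &rawDataSize, rawData);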

    // Close metric tracer

    zetMetricTracerDisableExp(hMetricTracer, true);
    zetMetricTracerDestroyExp(hMetricTracer);
    zeEventDestroy( hNotificationEvent );
    zeEventPoolDestroy( hEventPool );

    // Clean device configuration

    zetContextActivateMetricGroups( hContext, hDevice, 0, nullptr );
    free(rawData);