ocsf / ocsf-schema

Proposal: Extend OCSF Support for Metrics and Traces with New Trace Profile and Metrics Event Class #1229

Closed pladamgregory closed 3 weeks ago

pladamgregory commented 4 weeks ago

Summary

The current OCSF schema lacks a structured way to represent metrics and traces, which are critical data types in observability and monitoring contexts. This issue proposes:

  1. The addition of a Trace Profile to tag OCSF events as traces, incorporating trace-related identifiers.
  2. The addition of a Metrics Info Class (5023) within Category 5 (Discovery) to structure metric attributes, enabling consistent tagging and categorization of metrics data.

Proposed Additions

  1. Trace Profile

    • Introduce a trace profile to the schema, containing standard trace attributes like trace_id, span_id, and other metadata relevant to traces.
    • This profile will allow any OCSF event to be tagged as a trace, supporting enhanced tracking and root cause analysis within observability platforms.

  2. Metrics Info Class (5023) for Discovery (Category 5)

    • A new class, Metrics Info Class (5023), under Category 5 (Discovery) would be introduced to handle metric-related data.
    • This class would define a set of Activity IDs and a Type ID attribute for categorizing metric types, as detailed below.
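
A minimal sketch of how tagging an existing event with the proposed trace profile might look. The profile name `trace`, the top-level placement of `trace_id` and `span_id`, and the helper `tag_as_trace` are assumptions for illustration; the proposal does not fix them. The sample event uses `class_uid` 4002 (HTTP Activity) only as an example.

```python
from copy import deepcopy

TRACE_PROFILE = "trace"  # hypothetical profile name; the proposal does not fix one


def tag_as_trace(event: dict, trace_id: str, span_id: str) -> dict:
    """Return a copy of an OCSF-style event tagged with the proposed trace profile."""
    tagged = deepcopy(event)
    metadata = tagged.setdefault("metadata", {})
    profiles = metadata.setdefault("profiles", [])
    if TRACE_PROFILE not in profiles:
        profiles.append(TRACE_PROFILE)
    # Assumption: the trace identifiers sit at the top level of the event.
    tagged["trace_id"] = trace_id
    tagged["span_id"] = span_id
    return tagged


# Example: an HTTP Activity event that already carries the cloud profile.
event = {"class_uid": 4002, "activity_id": 1, "metadata": {"profiles": ["cloud"]}}
tagged = tag_as_trace(event, "4bf92f3577b34da6a3ce929d0e0e4736", "00f067aa0ba902b7")
```

Because the profile only adds attributes, the original event class and its other profiles are left untouched, which is what lets any OCSF event be tagged as a trace.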

Attributes for Metrics Info Class (5023)

Type ID   Description
0         Unknown
1         Timestamp
2         Duration
3         Frequency
4         Latency
10        CPU Usage
11        Memory Usage
12        Disk I/O
13        Network Throughput
14        Queue Length
15        Thread Count
16        Execution Time
17        Resource Utilization
18        Disk Space Usage
19        Heap Size
20        Cache Hit Rate
21        Transaction Rate
22        Error Rate
23        Request Count
24        Success Rate
25        Concurrency
26        Response Time
27        Active Users
28        Session Duration
29        User Actions
30        Page Views
31        Error Count
32        Failure Rate
33        Retry Count
34        Downtime
35        System Uptime
36        Service Availability
37        Temperature
38        Battery Level
39        Data Volume
40        Data Quality
41        Compression Ratio
42        API Request Rate
43        API Error Rate
44        Cost Utilization
45        Instance Uptime
46        Threat Detection Rate
47        Security Event Count
48        Failed Authentication
49        Intrusion Detection
51        Model Accuracy
52        Model Latency
53        Training Time
54        Inference Count
55        Data Drift
56        Revenue
57        Customer Churn Rate
58        Conversion Rate
59        Customer Satisfaction
99        Other (see type_name)
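
As a rough illustration, the Type ID values listed above could be carried as an enumeration with a lookup for the matching `type_name`. This is only a sketch covering a handful of the values; the names `MetricType` and `metric_type_fields` are hypothetical, and falling back to Unknown for unlisted IDs is an assumption rather than part of the proposal.

```python
from enum import IntEnum
from typing import Optional


class MetricType(IntEnum):
    """Subset of the proposed type_id values for Metrics Info (5023)."""
    UNKNOWN = 0
    LATENCY = 4
    CPU_USAGE = 10
    MEMORY_USAGE = 11
    ERROR_RATE = 22
    OTHER = 99


TYPE_NAMES = {
    MetricType.UNKNOWN: "Unknown",
    MetricType.LATENCY: "Latency",
    MetricType.CPU_USAGE: "CPU Usage",
    MetricType.MEMORY_USAGE: "Memory Usage",
    MetricType.ERROR_RATE: "Error Rate",
}


def metric_type_fields(type_id: int, type_name: Optional[str] = None) -> dict:
    """Build the type_id/type_name pair for a metric record."""
    if type_id == MetricType.OTHER:
        # For 99 (Other), the caller must supply the source-specific name.
        if not type_name:
            raise ValueError("type_id 99 (Other) requires a type_name")
        return {"type_id": 99, "type_name": type_name}
    if type_id in TYPE_NAMES:
        return {"type_id": int(type_id), "type_name": TYPE_NAMES[MetricType(type_id)]}
    # Assumption: IDs not covered by this sketch fall back to Unknown.
    return {"type_id": 0, "type_name": "Unknown"}
```

For example, `metric_type_fields(10)` yields the CPU Usage pair, while `metric_type_fields(99, "GPU Usage")` carries a source-specific name through `type_name`.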

pagbabian-splunk commented 3 weeks ago

Looks good Adam. Have we compared this with how OTel represents metrics and traces? Can the two complement each other? We have work going on to encode OCSF as protobuf, which then could be transported via OTel protocols. They had interest in more collaboration.

pladamgregory commented 3 weeks ago

> Looks good Adam. Have we compared this with how OTel represents metrics and traces? Can the two complement each other? We have work going on to encode OCSF as protobuf, which then could be transported via OTel protocols. They had interest in more collaboration.

I will follow up with some comparison data, but this proposal was written with OTel in mind. The idea is that OTel streams can fork into OCSF-compatible records tagged with the trace profile. The same would apply to metrics, which can be captured through this same implementation or directly from a source, for example winperfmon logs.
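
A rough sketch of the fork described here, using plain dictionaries in place of real OTel SDK objects. The input shape (`kind`, `time_ms`, and so on), the function `fork_to_ocsf`, the profile name `trace`, and the `metric_name`/`metric_value` attributes are all hypothetical; class 5023 is the class proposed in this issue, not part of the released schema.

```python
from typing import Iterable, Iterator

METRICS_INFO_CLASS_UID = 5023  # class proposed in this issue; not in the released schema
DISCOVERY_CATEGORY_UID = 5


def fork_to_ocsf(items: Iterable[dict]) -> Iterator[dict]:
    """Yield one OCSF-style record per OTel-style span or metric point."""
    for item in items:
        if item["kind"] == "span":
            yield {
                # Hypothetical: the class of the underlying event comes from the caller.
                "class_uid": item["class_uid"],
                "metadata": {"profiles": ["trace"]},  # assumed profile name
                # OTel trace and span IDs are 128-bit and 64-bit values,
                # conventionally rendered as 32 and 16 lowercase hex characters.
                "trace_id": format(item["trace_id"], "032x"),
                "span_id": format(item["span_id"], "016x"),
            }
        elif item["kind"] == "metric":
            yield {
                "category_uid": DISCOVERY_CATEGORY_UID,
                "class_uid": METRICS_INFO_CLASS_UID,
                "type_id": item["type_id"],
                "time": item["time_ms"],  # OCSF timestamps are epoch milliseconds
                # Hypothetical attribute names for the measured value.
                "metric_name": item["name"],
                "metric_value": item["value"],
            }


stream = [
    {"kind": "span", "class_uid": 4002,
     "trace_id": 0x4BF92F3577B34DA6A3CE929D0E0E4736, "span_id": 0x00F067AA0BA902B7},
    {"kind": "metric", "type_id": 10, "name": "cpu.usage", "value": 0.42,
     "time_ms": 1700000000000},
]
records = list(fork_to_ocsf(stream))
```

The same routing would work for metrics read directly from a source such as winperfmon logs, as long as each reading is mapped to a `type_id` before it reaches the function.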

pladamgregory commented 3 weeks ago

Rolling up to https://github.com/ocsf/ocsf-schema/issues/1234