open-telemetry / opentelemetry-ebpf-profiler

The production-scale datacenter profiler (C/C++, Go, Rust, Python, Java, NodeJS, .NET, PHP, Ruby, Perl, ...)
Apache License 2.0

The eBPFMetricsCollector function incorrectly handled the counter metric. #139

Closed: tsint closed this issue 2 months ago

tsint commented 2 months ago

The check (https://github.com/open-telemetry/opentelemetry-ebpf-profiler/blob/main/tracer/tracer.go#L850-L864) that the eBPFMetricsCollector function uses to identify cumulative metrics (counters) is incorrect. For example, the value of V8SymbolizationFailure(142) is greater than the value of support.MetricIDBeginCumulative(96).

        // The monitoring infrastructure expects instantaneous values (gauges).
        // => for cumulative metrics (counters), send deltas of the observed values, so they
        // can be interpreted as gauges.
        if ebpfID < support.MetricIDBeginCumulative {
            // We don't assume 64bit counters to overflow
            deltaValue := value - previousMetricValue[ebpfID]

            // 0 deltas add no value when summed up for display purposes in the UI
            if deltaValue == 0 {
                continue
            }

            previousMetricValue[ebpfID] = value
            value = deltaValue
        }
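
The delta conversion described in the comments above can be illustrated with a minimal, standalone sketch. The `deltaTracker` type, its `observe` method, and the counter ID `7` are hypothetical names chosen for illustration; they are not part of the profiler, and the sketch only mirrors the subtract-and-skip-zero logic of the quoted snippet.

```go
package main

import "fmt"

// deltaTracker remembers the last observed value per counter ID.
// Hypothetical helper for illustration, not a type from the profiler.
type deltaTracker struct {
	previous map[uint32]uint64
}

// observe returns the increase since the previous observation and whether
// it should be reported. Zero deltas are skipped, as in the quoted snippet.
func (t *deltaTracker) observe(id uint32, value uint64) (uint64, bool) {
	delta := value - t.previous[id]
	if delta == 0 {
		return 0, false
	}
	t.previous[id] = value
	return delta, true
}

func main() {
	tracker := &deltaTracker{previous: make(map[uint32]uint64)}
	// Three collection rounds for one counter (ID 7 is made up).
	for _, v := range []uint64{10, 10, 25} {
		if d, ok := tracker.observe(7, v); ok {
			fmt.Println("report delta", d)
		} else {
			fmt.Println("skip")
		}
	}
}
```

With the readings 10, 10, 25 the sketch reports a delta of 10, skips the unchanged second reading, and then reports a delta of 15.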

fabled commented 2 months ago

> For example, the value of V8SymbolizationFailure(142) is greater than the value of support.MetricIDBeginCumulative(96).

These two values are not compared with each other. The comparison is against ebpfID, that is, the key of the translateIDs map, which is one of the C.metricID_* values defined in support/ebpf/types.h. The value you refer to is a value in the translateIDs map, not a key.
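
The distinction between the two ID spaces can be sketched as follows. This is a simplified illustration under stated assumptions: the map contents, the key `5`, and the `usesDelta` helper are hypothetical, and only the numbers 96 and 142 are taken from the issue text. The real keys come from the C.metricID_* values in support/ebpf/types.h.

```go
package main

import "fmt"

// beginCumulative stands in for support.MetricIDBeginCumulative.
// The value 96 is the one quoted in the issue.
const beginCumulative uint32 = 96

// translateIDs maps eBPF-side metric IDs (keys) to reporter-side metric IDs
// (values). The key 5 is hypothetical; 142 is the reporter-side value quoted
// in the issue.
var translateIDs = map[uint32]uint32{
	5: 142,
}

// usesDelta mirrors the condition in the quoted snippet. It inspects the
// eBPF-side ID (the map key), never the translated reporter-side value.
func usesDelta(ebpfID uint32) bool {
	return ebpfID < beginCumulative
}

func main() {
	for ebpfID, reporterID := range translateIDs {
		fmt.Printf("ebpfID=%d reporterID=%d usesDelta=%t\n",
			ebpfID, reporterID, usesDelta(ebpfID))
	}
}
```

In this sketch the boundary check sees the key 5, so comparing the reporter-side value 142 against 96 says nothing about how the metric is handled.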

tsint commented 2 months ago

@fabled Oh, I see. Thank you for your answer.