Open qixiaogang opened 2 months ago
What should the metric be here: the number of logs (batch size) that failed to export, or simply the number of failed export occurrences?
Hi @harshitrjpt, thanks for checking. I would prefer the number of logs (batch size) that failed to export.
Hmm, I was wondering whether the end user should care about the export failure occurrence or about how many logs failed to export. Why should it matter whether 1 log or 10 logs failed to export?
Hi @harshitrjpt, thanks. In some cases, the end user needs to know how many logs were lost due to export failure.
@qixiaogang: If that is the use case, then we would just need to define another attribute, similar to 'dropped', for export failure and reuse the same 'processedLogs' metric.
But I am still inclined towards tracking export failure occurrences rather than the number of logs that failed to export, because the failure is about the exporter and not the logs. I'll let the core maintainers, @jkwatson and @jack-berg, comment on which way we should go. (I already have the code for export failure occurrences added and tested locally; I just need to create the PR.)
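To make the two options concrete, here is a minimal sketch in plain Java. It is a toy model and not the SDK's BatchLogRecordProcessor code: the class name, the method names, and the "exported"/"exportFailed" outcome values are all hypothetical, chosen only to show how counting failed logs differs from counting failed export calls.

```java
import java.util.HashMap;
import java.util.Map;

// Toy model only; none of these names exist in opentelemetry-java.
class ExportFailureCounting {
  // Option 1: one counter split by an outcome attribute, in the spirit of
  // reusing 'processedLogs' with an extra attribute next to 'dropped'.
  private final Map<String, Long> logsByOutcome = new HashMap<>();

  // Option 2: a counter of export calls that failed, regardless of batch size.
  private long failedExports = 0;

  void recordExport(int batchSize, boolean success) {
    String outcome = success ? "exported" : "exportFailed";
    // Adds the whole batch size to the counter for this outcome.
    logsByOutcome.merge(outcome, (long) batchSize, Long::sum);
    if (!success) {
      failedExports++;
    }
  }

  long logsWith(String outcome) {
    return logsByOutcome.getOrDefault(outcome, 0L);
  }

  long failedExports() {
    return failedExports;
  }

  public static void main(String[] args) {
    ExportFailureCounting c = new ExportFailureCounting();
    c.recordExport(10, true);   // one successful batch of 10 logs
    c.recordExport(10, false);  // one failed batch of 10 logs
    c.recordExport(3, false);   // one failed batch of 3 logs
    System.out.println("failed logs = " + c.logsWith("exportFailed")); // failed logs = 13
    System.out.println("failed exports = " + c.failedExports());       // failed exports = 2
  }
}
```

With the same three export calls, the first option reports 13 lost logs while the second reports 2 failures, which is the trade-off being discussed here.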
@qixiaogang I think there is already an existing feature, https://github.com/open-telemetry/opentelemetry-java/blob/main/exporters/common/src/main/java/io/opentelemetry/exporter/internal/ExporterMetrics.java, that you can leverage for your requirement. ExporterMetrics addresses this need generically for all exporters. Credit to @jack-berg for pointing this out on my commit above.
I used this in my local repro by enabling OTEL_EXPORTER_METRICS_ENABLED=true and got the desired result when using autoconfigure.
The otlp.exporter.exported metric with datapoints success=false,type=log appropriately tracks the log export failure. Please check if this addresses your need.
ScopeMetrics #2
ScopeMetrics SchemaURL:
InstrumentationScope io.opentelemetry.exporters.otlp-grpc
Metric #0
Descriptor:
-> Name: otlp.exporter.exported
-> Description:
-> Unit:
-> DataType: Sum
-> IsMonotonic: true
-> AggregationTemporality: Cumulative
NumberDataPoints #0
Data point attributes:
-> success: Bool(false)
-> type: Str(log)
StartTimestamp: 2024-10-14 10:06:54.40763 +0000 UTC
Timestamp: 2024-10-14 10:10:54.418291 +0000 UTC
Value: 9
Metric #1
Descriptor:
-> Name: otlp.exporter.seen
-> Description:
-> Unit:
-> DataType: Sum
-> IsMonotonic: true
-> AggregationTemporality: Cumulative
NumberDataPoints #0
Data point attributes:
-> type: Str(log)
StartTimestamp: 2024-10-14 10:06:54.40763 +0000 UTC
Timestamp: 2024-10-14 10:10:54.418291 +0000 UTC
Value: 9
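One way to read the two data points above is to compare them. The sketch below is a hypothetical helper, not part of opentelemetry-java, and it assumes that otlp.exporter.seen and otlp.exporter.exported count the same unit (log records) for the same type attribute, which the matching values in this output suggest but do not prove.

```java
// Hypothetical helper for reading the output above; not an SDK API.
class ExportFailureRatio {
  // seen: cumulative value of otlp.exporter.seen for one 'type'.
  // exportedFailed: cumulative value of otlp.exporter.exported with success=false.
  static double failedFraction(long seen, long exportedFailed) {
    if (seen == 0) {
      return 0.0; // nothing was handed to the exporter yet
    }
    return (double) exportedFailed / seen;
  }

  public static void main(String[] args) {
    // Values taken from the data points above: seen=9, exported(success=false)=9.
    System.out.println(failedFraction(9, 9)); // 1.0
  }
}
```

Under that assumption, a result of 1.0 means every log the exporter saw in this window failed to export.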
Is your feature request related to a problem? Please describe. BatchLogRecordProcessor exposes metrics for dropped and exported logs, but not for logs that failed to export. Export failure is critical and should be exposed as a metric.
Describe the solution you'd like Expose failed exports as metrics as well.
Describe alternatives you've considered NA
Additional context NA