Open cyrille-leclerc opened 5 months ago
Given the removal of the lokiexporter in September, this feature gap hits us pretty hard as Grafana Cloud users. Any update on the possibility of promoting resource attributes to indexed labels on Loki using the OTLP exporter & endpoint?
@fredrikgh can you please help us understand what kind of attribute you want to promote as Loki labels? Are they additional standard resource attributes? custom resource attributes? What information do these attributes convey?
@stevendungan can you please help here?
Following your question @cyrille-leclerc, here's my use case, for what it's worth.
I wanted to use this functionality to work around the fact that the `level` field is no longer present. It's replaced with `detected_level` in Loki 3.1, but that is not supported by Grafana and is not indexed. There's a bug for this, of course.
I also have a custom field that I use a lot in the dropdown in the Explore view, similar to `service_name`. If my field cannot get indexed, I won't have it in the drop-downs in Grafana Cloud.
Grafana Cloud documentation says:
Because it is too costly from a cardinality perspective, Grafana Loki indexes a few attributes from log entries instead of indexing all available attributes or the entire log message. As such, you must provide hints to the Loki translator, stating which attributes to promote to Loki labels. You can do this by adding new synthetic attributes, which are read by the Loki translator and removed before the data is sent over the network. The following snippet shows how the processors section looks when you add a resource processor that adds the loki.resource.labels hint. This example tells the Loki translator that the host_name resource attribute should be promoted to a label. You are not required to add labels, and every entry that passes through the Loki exporter will have a static label exporter with the value OTLP by default. For more information about labels and how to choose the right ones for your use case, refer to the Loki documentation.
But this behavior doesn't actually work when sending over OTLP to the Grafana Cloud OTLP endpoint in our experience for any resource attribute we want to promote to a label.
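For reference, the snippet the quoted documentation alludes to is not reproduced above. A minimal sketch of such a hint in an OpenTelemetry Collector configuration, assuming the `resource` processor and the `loki.resource.labels` hint as documented for the Loki exporter, with `host_name` taken from the quote:

```yaml
processors:
  resource:
    attributes:
      # Synthetic hint attribute read by the Loki exporter's translator.
      # Assumption: this only takes effect when logs go through the Loki exporter,
      # not when they are sent to the OTLP endpoint.
      - action: insert
        key: loki.resource.labels
        value: host_name
```

The processor would also need to be listed in the logs pipeline under `service.pipelines`, which is omitted here.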
@fredrikgh can you please help us understand what kind of attribute you want to promote as Loki labels? Are they additional standard resource attributes? custom resource attributes? What information do these attributes convey?
One example we had was to have Loki labels for `exception` and/or `scope` of a log entry, i.e. custom attributes.
Grafana Cloud documentation says:
... But this behavior doesn't actually work when sending over OTLP to the Grafana Cloud OTLP endpoint in our experience for any resource attribute we want to promote to a label.
@adrielp This documentation is outdated: it predates the introduction of Loki structured metadata. We are going to refresh this section.
Please use OTel log attributes to capture log metadata (e.g. thread.name...). Note that the OTel auto instrumentation of logging frameworks is usually capable of capturing interesting metadata.
We are sorry for the inconvenience. Would this solution meet your expectations?
One example we had was to have Loki labels for `exception` and/or `scope` of a log entry, i.e. custom attributes.
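Thanks @fredrikgh, would you by any chance have example values and a sense of the cardinality?
In particular, I would be interested in understanding:
`exception`, is it:
- Just a marker like `true/false` to have a different data management policy, for example a different retention policy?
- The exception type, like `NullPointerException`
- Or does it also include the exception message, like `InvalidFormatException: '123azerty' is not a valid integer`
`scope`, is it:
- A reference to the OpenTelemetry instrumentation scope name, which is mapped to the logger name by the OTel auto instrumentation of logging frameworks, for example `com.mycompany.OrderService`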
Thanks @cyrille-leclerc - glad the updates are going to be made. I'd also keep an eye on the entity OTEP that relates to resource attributes. I think these types of things will be important for labels as things evolve.
@cyrille-leclerc: Thanks @fredrikgh, would you by any chance have example values and a sense of the cardinality?
In particular, I would be interested in understanding:
`exception`, is it:
- Just a marker like `true/false` to have a different data management policy, for example a different retention policy?
- The exception type, like `NullPointerException`
- Or does it also include the exception message, like `InvalidFormatException: '123azerty' is not a valid integer`
`scope`, is it:
- A reference to the OpenTelemetry instrumentation scope name, which is mapped to the logger name by the OTel auto instrumentation of logging frameworks, for example `com.mycompany.OrderService`
@cyrille-leclerc It would be `NullPointerException` and `com.mycompany.OrderService` respectively. I suppose technically, these aren't to be considered resource attributes. But some mechanism of getting these indexed would be very useful.
@adrielp: Thanks @cyrille-leclerc - glad the updates are going to be made. I'd also keep an eye on the https://github.com/open-telemetry/oteps/pull/264 that relates to resource attributes. I think these types of things will be important for labels as things evolve.
We are aligned here. We have several engineers who contribute to this OTEP, both to better surface the concept of entities in OTel and to help improve the support for high dimensionality in Prometheus.
@fredrikgh: @cyrille-leclerc It would be `NullPointerException` and `com.mycompany.OrderService` respectively. I suppose technically, these aren't to be considered resource attributes. But some mechanism of getting these indexed would be very useful.
Thanks @fredrikgh. Please pardon my curiosity, but what is your use case for this level of detail in labels, and thus this cardinality on the log streams?
Java applications have hundreds of logger names (e.g. `com.mycompany.OrderService`) and use dozens of exception classes (e.g. `NullPointerException`).
I suspect we may not be aware of the use case you are solving here.
@cyrille-leclerc we were misusing them initially. We have a limitation on error metrics exported by the apps, and built log data dashboards for log meta analysis instead, e.g. error count by certain metadata, backed by recording rules. But we've accomplished this now with `label_format` and all is well.
Getting standard resource attributes such as `cluster`, `node`, `pod` etc. as indexed labels is a more valid use case, and more fitting to resource attributes. I may have missed it, but have you settled on how you intend to make this possible? This is indeed where we used `loki_resource_labels` before.
Is your feature request related to a problem? Please describe.
Context: As discussed with @sandeepsukhani and many others, we want to simplify Loki's OpenTelemetry ingestion path and move away from the otel2loki converters available through the OpenTelemetry Collector Loki Exporter and the Alloy `otelcol.exporter.loki` in favor of the newly introduced Loki OTLP Endpoint.
However, we have identified a limitation in specifying which OTel resource attributes should be promoted as Loki labels:
- `distributor: otlp_config / default_resource_attributes_as_index_labels` (docs here): Grafana Cloud Logs and the Grafana Cloud OTLP Endpoint do not provide such a stack-wide config option.
- `loki.resource.labels` attributes that were available when using the OpenTelemetry Collector Loki Exporter
Describe the solution you'd like
I would like
- `distributor: otlp_config / default_resource_attributes_as_index_labels` (docs here) to overwrite the default list of resource attributes that are promoted as labels.
- `loki.resource.labels` mechanism, as the desired solution would not be about overwriting the global list of resource attributes promoted as labels (see `default_resource_attributes_as_index_labels`) but about extending it.
Describe alternatives you've considered
Continue to do the otel2loki conversion through the OpenTelemetry Collector Loki Exporter and Alloy `otelcol.exporter.loki`, but this puts more burden on Loki users, and none of these converters leverage Loki V3 metadata.
Additional context
Similar to the problem discussed in Grafana Labs Community - Add additional index labels in Loki 3.0 via OTLP
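For context, here is a rough sketch of what the self-managed Loki options referenced in this request can look like. It assumes Loki 3.x and the `otlp_config` blocks described in the Loki OTLP ingestion docs. The attribute names are illustrative only, and none of this is currently exposed as a stack-wide option on Grafana Cloud, which is the gap this issue describes.

```yaml
# Self-managed Loki only (assumption: Loki 3.x config layout).
distributor:
  otlp_config:
    # Overwrites the global default list of resource attributes
    # promoted to index labels. Attribute names here are examples.
    default_resource_attributes_as_index_labels:
      - service.name
      - service.namespace
      - k8s.cluster.name
      - k8s.pod.name

limits_config:
  otlp_config:
    resource_attributes:
      attributes_config:
        # Promotes additional resource attributes to index labels
        # (assumption: applied on top of the defaults above).
        - action: index_label
          attributes:
            - host.name
```

The second block is closer to the "extend, not overwrite" behavior requested above, though whether and how it would be surfaced for Grafana Cloud tenants is not settled in this thread.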