grafana / alloy

OpenTelemetry Collector distribution with programmable pipelines
https://grafana.com/oss/alloy
Apache License 2.0

Using otelcol.exporter.otlphttp to send logs to a Loki OTLP endpoint fails due to 204 response #1272

Open loafoe opened 2 months ago

loafoe commented 2 months ago

What's wrong?

Attempting to push logs from otelcol.exporter.otlphttp directly to a Loki 3.1.0 OTLP endpoint seems to fail, i.e. the logs are not acknowledged as sent even though I can see them appear in Loki. This seems to be caused by Loki responding with an HTTP 204, which is not recognized as success by the otlphttp component.

I found a possibly related upstream OpenTelemetry issue, which was recently fixed:

https://github.com/open-telemetry/opentelemetry-cpp/issues/2632
https://github.com/open-telemetry/opentelemetry-cpp/pull/2712

If the otlphttp component is using this library then this would explain the above observation.

When I introduce a Caddy proxy in between with the following config, the problem goes away!

reverse_proxy /v1/logs {
    to loki-gateway.loki-system.svc:80
    @204 {
       status 204
    }
    handle_response @204 {
       copy_response 200
    }
    rewrite /otlp/v1/logs
}

Can Alloy update to the latest opentelemetry-cpp version to solve this?

Steps to reproduce

System information

Linux grafana-alloy-1 6.1.92

Software version

Grafana Alloy v1.2.1

Configuration

loki.source.kubernetes "pods" {
  targets    = discovery.relabel.loki_relabel.output
  forward_to = [
    otelcol.receiver.loki.local.receiver,
  ]
}

otelcol.exporter.otlphttp "local" {
  client {
    // This endpoint rewrites /v1/logs to /otlp/v1/logs
    endpoint = "http://otlp-proxy.starlift-observability.svc"
    auth = otelcol.auth.headers.local.handler

    tls {
      insecure = false
    }
  }
}
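
The configuration above references otelcol.receiver.loki.local.receiver, and the error log names the component otelcol.receiver.loki.local, but that block is not shown. A minimal sketch of how it is probably wired, assuming it forwards straight to the exporter with no processors in between (this is an assumption, not the configuration actually deployed; the otelcol.auth.headers.local and discovery.relabel.loki_relabel blocks are likewise omitted here):

```alloy
// Assumed wiring, not taken from the original configuration:
// converts Loki log entries to OTLP logs and hands them to the exporter.
otelcol.receiver.loki "local" {
  output {
    logs = [otelcol.exporter.otlphttp.local.input]
  }
}
```

With this wiring, any batch the exporter does not consider delivered stays in its sending queue, which would be consistent with the receiver eventually reporting that the queue is full.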

Logs

ts=2024-07-12T08:18:33.463303168Z level=error msg="failed to consume log entries" component_path=/ component_id=otelcol.receiver.loki.local err="sending queue is full"