open-telemetry / opentelemetry-go

OpenTelemetry Go API and SDK
https://opentelemetry.io/docs/languages/go
Apache License 2.0

otlpmetrichttp exporter is sending generic error msg instead of actual collector msg in case of using loadbalancer exporter #5536

Closed Preeti-Dewani closed 2 months ago

Preeti-Dewani commented 3 months ago

Description

If a user forgets to add the service.name attribute when creating the resource object with resource.New, and uses otlpmetrichttp against a collector running the loadbalancing exporter (which uses the service name for routing), the error message the user receives says nothing about the actual problem. This is the error message the user gets:

failed to upload metrics: context deadline exceeded: retry-able request failure

This does not happen with the otlpmetricgrpc exporter, which passes through the actual error message from the collector:

failed to upload metrics: context deadline exceeded: rpc error: code = Unavailable desc = unable to get service name

Issue: It is difficult to work out the actual cause of the failure. The underlying collector sends the reason in the response body, but otlpmetrichttp discards the body when the status code is retryable, so the actual cause is lost.

Part1: otlpmetrichttp -----------api call------------> otel-collector (loadbalancer exporter)

Part2: otel-collector --- sends error via---> otlpmetric UploadMetrics func

Part3: otlpmetric UploadMetrics func --- discards body ---> sends error retry-able request failure instead of actual error

Setup Details

Screenshot 2024-06-24 at 6 30 17 PM

Environment

Steps To Reproduce

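  1. Create a sample Go application with the following setup:
    // Exporter options come from command-line flags (definitions not shown).
    options := []otlpmetrichttp.Option{
        otlpmetrichttp.WithEndpoint(*collectorEndpoint),
        otlpmetrichttp.WithURLPath(*collectorURL),
    }
    if !*isSecure {
        options = append(options, otlpmetrichttp.WithInsecure())
    }
    metricExporter, err := otlpmetrichttp.New(ctx, options...)
    if err != nil {
        log.Fatalf("failed to create exporter: %v", err)
    }
    reader := metric.NewPeriodicReader(metricExporter, metric.WithInterval(*pushInterval))

    // Note: no service.name attribute is set here ("service_name" is a
    // different key), which is what triggers the failure described above.
    resourceConfig, err := resource.New(ctx, resource.WithAttributes(
        attribute.String("service_name", "myapp"),
        attribute.String("job", "sample-job"),
        attribute.String("instance", "sample-instance"),
    ))
    if err != nil {
        log.Fatalf("failed to create resource: %v", err)
    }
    meterProvider := metric.NewMeterProvider(
        metric.WithResource(resourceConfig),
        metric.WithReader(reader),
    )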
  2. Export the environment variables:

    export ENDPOINT_DOMAIN='otel-gateway:4317'
    export REMOTE_WRITE_URL='http://victoriametrics:8429/api/v1/write'

  3. Create a Dockerfile for the Go app and build it as the myapp service in Docker Compose.

docker-compose.yaml

version: '3.7'
services:
  myapp:
    build:
      context: .
      dockerfile: Dockerfile
    command:
      - "--endpointDomain=${ENDPOINT_DOMAIN}"
      - "--ingestPath="
      - "--isSecure=false"
    ports:
      - "8081:8081"

  otel-collector-1:
    container_name: otel-collector-1
    image: otel/opentelemetry-collector-contrib:latest
    command: ["--config=/etc/otel-collector-config.yaml"]
    environment:
      - REMOTE_WRITE_URL=${REMOTE_WRITE_URL}
    volumes:
      - ./otel-collector-config.yaml:/etc/otel-collector-config.yaml
    ports:
      - "4317"  # OTLP grpc receiver

  otel-collector-2:
    container_name: otel-collector-2
    image: otel/opentelemetry-collector-contrib:latest
    command: ["--config=/etc/otel-collector-config.yaml"]
    environment:
      - REMOTE_WRITE_URL=${REMOTE_WRITE_URL}
    volumes:
      - ./otel-collector-config.yaml:/etc/otel-collector-config.yaml
    ports:
      - "4317"   # OTLP grpc receiver

  otel-collector-3:
    container_name: otel-collector-3
    image: otel/opentelemetry-collector-contrib:latest
    command: ["--config=/etc/otel-collector-config.yaml"]
    environment:
      - REMOTE_WRITE_URL=${REMOTE_WRITE_URL}
    volumes:
      - ./otel-collector-config.yaml:/etc/otel-collector-config.yaml
    ports:
      - "4317"   # OTLP grpc receiver

  # Otel gateway (running loadbalancing exporter)
  otel-gateway:
    container_name: otel-gateway
    image: otel/opentelemetry-collector-contrib:latest
    command: ["--config=/etc/otel-gateway-config.yaml"]
    volumes:
      - ./otel-gateway-config.yaml:/etc/otel-gateway-config.yaml
    ports:
      - "4317:4317"        # OTLP http receiver
    depends_on:
      - otel-collector-1
      - otel-collector-2
      - otel-collector-3

  victoriametrics:
    container_name: victoriametrics
    image: victoriametrics/victoria-metrics
    ports:
      - "8439:8429"
    volumes:
      - victoriametricsdata:/victoriametricsdata
    command:
      - "-storageDataPath=/victoriametricsdata"
      - "-retentionPeriod=30"
      - "-httpListenAddr=:8429"
    restart: always

volumes:
  victoriametricsdata: { }

otel-gateway-config.yaml

receivers:
  otlp:
    protocols:
      http:
        endpoint: 0.0.0.0:4317

processors:

exporters:
  debug:
    verbosity: detailed
  loadbalancing:
    protocol:
      otlp:
        timeout: 5s
        tls:
          insecure: true
    resolver:
      static:
        hostnames:
          - otel-collector-1:4317
          - otel-collector-2:4317
          - otel-collector-3:4317

service:
  pipelines:
    metrics:
      receivers: [otlp]
      processors: []
      exporters: [loadbalancing]

otel-collector-config.yaml

receivers:
  otlp:
    protocols:
      grpc:
        endpoint: 0.0.0.0:4317

processors:

exporters:
  debug:
    verbosity: detailed
  prometheusremotewrite: # the PRW exporter, to ingest metrics to backend
    endpoint: ${REMOTE_WRITE_URL}

service:
  pipelines:
    metrics:
      receivers: [otlp]
      processors: []
      exporters: [prometheusremotewrite]
  4. Run docker compose up.

  5. Make an API call to myapp at http://localhost:8081.

Expected behavior

The exporter should report the error message as it is received from the collector:

failed to upload metrics: unable to get service name

Preeti-Dewani commented 3 months ago

PR for the fix: https://github.com/open-telemetry/opentelemetry-go/pull/5541