open-telemetry / opentelemetry-collector-contrib

Contrib repository for the OpenTelemetry Collector
https://opentelemetry.io
Apache License 2.0

awsfirehosereceiver fails with "Invalid Firehose request" #16583

Closed vasiliy-grinko closed 1 year ago

vasiliy-grinko commented 1 year ago

Component(s)

receiver/awsfirehose

What happened?

Description

I'm sending metrics from a CloudWatch metric stream (JSON format) to a Kinesis Firehose delivery stream whose destination is the HTTP endpoint of the otel-collector. In the otel-collector logs I see:

2022-12-02T15:25:51.453Z    error    awsfirehosereceiver@v0.66.0/receiver.go:164    Invalid Firehose request    {"kind": "receiver", "name": "awsfirehose", "pipeline": "metrics", "error": "missing request id in header"}
github.com/open-telemetry/opentelemetry-collector-contrib/receiver/awsfirehosereceiver.(*firehoseReceiver).ServeHTTP
    github.com/open-telemetry/opentelemetry-collector-contrib/receiver/awsfirehosereceiver@v0.66.0/receiver.go:164
go.opentelemetry.io/collector/config/confighttp.(*decompressor).wrap.func1
    go.opentelemetry.io/collector@v0.66.0/config/confighttp/compression.go:162
net/http.HandlerFunc.ServeHTTP
    net/http/server.go:2109
go.opentelemetry.io/contrib/instrumentation/net/http/otelhttp.(*Handler).ServeHTTP
    go.opentelemetry.io/contrib/instrumentation/net/http/otelhttp@v0.36.4/handler.go:204
go.opentelemetry.io/collector/config/confighttp.(*clientInfoHandler).ServeHTTP
    go.opentelemetry.io/collector@v0.66.0/config/confighttp/clientinfohandler.go:39
net/http.serverHandler.ServeHTTP
    net/http/server.go:2947
net/http.(*conn).serve
    net/http/server.go:1991

Steps to Reproduce

otelcol conf:

...
awsfirehose:
    endpoint: 0.0.0.0:4433
    record_type: cwmetrics
...

Firehose manifest:

apiVersion: kinesis.aws.jet.crossplane.io/v1alpha1
kind: FirehoseDeliveryStream
metadata:
  name: shops-cw-metrics-to-amp
spec:
  forProvider:
    name: cw-metrics-to-amp
    destination: http_endpoint
    serverSideEncryption: 
    - enabled: false
    httpEndpointConfiguration:
    - url: "https://otel-collector.***.net"
      name: otel-collector
      roleArn: arn:aws:iam::*:role/otel-collector
      requestConfiguration:
      - contentEncoding: NONE
      s3BackupMode: FailedDataOnly
    s3Configuration:
      - bucketArn: arn:aws:s3:::cw-metrics-backup-for-kenesis
        roleArn: arn:aws:iam::*:role/otel-collector
        compressionFormat: "GZIP"
    region: eu-central-1
  providerConfigRef:
    name: providerconfig-jet-awssc

CW stream manifest:

apiVersion: cloudwatch.aws.jet.crossplane.io/v1alpha1
kind: MetricStream
metadata:
  name: shops-cw-metrics-to-amp
spec:
  forProvider:
    name: cw-metric-stream-to-amp
    roleArn: arn:aws:iam::*:role/otel-collector
    firehoseArn: arn:aws:firehose:eu-central-1:*:deliverystream/cw-metrics-to-amp
    includeFilter:
    - namespace: "AWS/ApplicationELB"
    - namespace: "AWS/DocDB"
    - namespace: "AWS/EC2"
    - namespace: "AWS/EBS"
    - namespace: "AWS/Lambda"  
    - namespace: "ContainerInsights" 
    outputFormat: json
    region: eu-central-1
  providerConfigRef:
    name: providerconfig-jet-awssc

I tried to send the JSON manually with the additional headers, but it led to:

2022-12-02T15:32:37.107Z    debug    awsfirehosereceiver@v0.66.0/receiver.go:171    Processing Firehose request    {"kind": "receiver", "name": "awsfirehose", "pipeline": "metrics", "RequestID": "1234asdfgasfdasdgf"}
2022-12-02T15:32:37.107Z    error    awsfirehosereceiver@v0.66.0/receiver.go:230    Unable to consume records    {"kind": "receiver", "name": "awsfirehose", "pipeline": "metrics", "error": "record format invalid"}
github.com/open-telemetry/opentelemetry-collector-contrib/receiver/awsfirehosereceiver.(*firehoseReceiver).ServeHTTP
    github.com/open-telemetry/opentelemetry-collector-contrib/receiver/awsfirehosereceiver@v0.66.0/receiver.go:230
go.opentelemetry.io/collector/config/confighttp.(*decompressor).wrap.func1
    go.opentelemetry.io/collector@v0.66.0/config/confighttp/compression.go:162
net/http.HandlerFunc.ServeHTTP
    net/http/server.go:2109
go.opentelemetry.io/contrib/instrumentation/net/http/otelhttp.(*Handler).ServeHTTP
    go.opentelemetry.io/contrib/instrumentation/net/http/otelhttp@v0.36.4/handler.go:204
go.opentelemetry.io/collector/config/confighttp.(*clientInfoHandler).ServeHTTP
    go.opentelemetry.io/collector@v0.66.0/config/confighttp/clientinfohandler.go:39
net/http.serverHandler.ServeHTTP
    net/http/server.go:2947
net/http.(*conn).serve
    net/http/server.go:1991

Expected Result

Metrics are received and passed on to the exporter in the otel-collector.

Actual Result

Receiving fails with the error above.

Collector version

v0.66.0

Environment information

Environment

Kubernetes v1.23.13-eks-fb459a0

OpenTelemetry Collector configuration

receivers:
  otlp:
    protocols:
      grpc:
  awsfirehose:
    endpoint: 0.0.0.0:4433
    record_type: cwmetrics

exporters:
  prometheus:
    endpoint: "0.0.0.0:8889"
  prometheusremotewrite:
    endpoint: "..."
    tls:
      insecure: true
    auth:
      authenticator: sigv4auth
    resource_to_telemetry_conversion:
      enabled: true
  awsemf:
    region: 'eu-central-1'
    resource_to_telemetry_conversion:
      enabled: true
  awsxray:
    region: 'eu-central-1'

processors:
  filter/metrics:
  batch:
  spanmetrics:
    metrics_exporter: prometheus
    latency_histogram_buckets: [100us, 1ms, 2ms, 6ms, 10ms, 100ms, 250ms]
    dimensions:
      - name: http.method
        default: GET
      - name: http.status_code
    dimensions_cache_size: 1000

extensions:
  health_check:
  pprof:
  zpages:
  sigv4auth:
    service: 'aps'
    region: 'eu-central-1'

service:
  telemetry:
    logs:
      level: "debug"
  extensions: [sigv4auth]
  pipelines:
    metrics:
      receivers: [otlp, awsfirehose]
      processors: []
      exporters: [prometheusremotewrite]

Log output

2022-12-02T15:25:51.453Z    error    awsfirehosereceiver@v0.66.0/receiver.go:164    Invalid Firehose request    {"kind": "receiver", "name": "awsfirehose", "pipeline": "metrics", "error": "missing request id in header"}
github.com/open-telemetry/opentelemetry-collector-contrib/receiver/awsfirehosereceiver.(*firehoseReceiver).ServeHTTP
    github.com/open-telemetry/opentelemetry-collector-contrib/receiver/awsfirehosereceiver@v0.66.0/receiver.go:164
go.opentelemetry.io/collector/config/confighttp.(*decompressor).wrap.func1
    go.opentelemetry.io/collector@v0.66.0/config/confighttp/compression.go:162
net/http.HandlerFunc.ServeHTTP
    net/http/server.go:2109
go.opentelemetry.io/contrib/instrumentation/net/http/otelhttp.(*Handler).ServeHTTP
    go.opentelemetry.io/contrib/instrumentation/net/http/otelhttp@v0.36.4/handler.go:204
go.opentelemetry.io/collector/config/confighttp.(*clientInfoHandler).ServeHTTP
    go.opentelemetry.io/collector@v0.66.0/config/confighttp/clientinfohandler.go:39
net/http.serverHandler.ServeHTTP
    net/http/server.go:2947
net/http.(*conn).serve
    net/http/server.go:1991

Additional context

No response

jefchien commented 1 year ago

The AWS Firehose receiver expects requests to follow the specifications. The first error you saw, "missing request id in header", is because the receiver expects an X-Amz-Firehose-Request-Id header on every request; I'm not certain why that header would be missing if the request is coming from Kinesis. The second error is because the receiver was unable to unmarshal the request body. Do you have a sample of the manual request you were trying to send?
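
For reference, here is a minimal sketch of a request shaped like the Firehose HTTP endpoint delivery format that the receiver expects. The endpoint URL, request ID, and record payload below are placeholders, and the CloudWatch metric-stream record is only an approximation, not something taken from this issue:

package main

import (
	"bytes"
	"encoding/base64"
	"encoding/json"
	"fmt"
	"net/http"
	"time"
)

func main() {
	// One CloudWatch metric-stream record in JSON output format
	// (approximate; all values and identifiers are placeholders).
	record := `{"metric_stream_name":"cw-metric-stream-to-amp","account_id":"123456789012","region":"eu-central-1","namespace":"AWS/EC2","metric_name":"CPUUtilization","dimensions":{"InstanceId":"i-0123456789abcdef0"},"timestamp":1669995951000,"value":{"max":1.2,"min":0.3,"sum":4.5,"count":6},"unit":"Percent"}`

	// Firehose HTTP endpoint delivery body: requestId, timestamp (ms),
	// and a list of base64-encoded records.
	body, err := json.Marshal(map[string]interface{}{
		"requestId": "11111111-2222-3333-4444-555555555555",
		"timestamp": time.Now().UnixMilli(),
		"records": []map[string]string{
			{"data": base64.StdEncoding.EncodeToString([]byte(record))},
		},
	})
	if err != nil {
		panic(err)
	}

	req, err := http.NewRequest(http.MethodPost, "http://localhost:4433/", bytes.NewReader(body))
	if err != nil {
		panic(err)
	}
	req.Header.Set("Content-Type", "application/json")
	// Without this header the receiver rejects the request with
	// "missing request id in header".
	req.Header.Set("X-Amz-Firehose-Request-Id", "11111111-2222-3333-4444-555555555555")

	resp, err := http.DefaultClient.Do(req)
	if err != nil {
		panic(err)
	}
	defer resp.Body.Close()
	fmt.Println(resp.Status)
}

Sending this without the X-Amz-Firehose-Request-Id header should reproduce the first error, and sending it with a body that the cwmetrics unmarshaler cannot parse should reproduce the "record format invalid" error.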

github-actions[bot] commented 1 year ago

Pinging code owners for receiver/awsfirehose: @Aneurysm9. See Adding Labels via Comments if you do not have permissions to add labels yourself.

github-actions[bot] commented 1 year ago

This issue has been inactive for 60 days. It will be closed in 60 days if there is no activity. To ping code owners by adding a component label, see Adding Labels via Comments, or if you are unsure of which component this issue relates to, please ping @open-telemetry/collector-contrib-triagers. If this issue is still relevant, please ping the code owners or leave a comment explaining why it is still relevant. Otherwise, please close it.

Pinging code owners:

See Adding Labels via Comments if you do not have permissions to add labels yourself.

jdnurmi commented 1 year ago

For anyone else chasing this: I discovered that a number of these errors were actually coming from the health checks of my load balancers.
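
If that is what is happening in your setup, one workaround (a sketch only, not something verified in this thread) is to point the load balancer probes at the collector's health_check extension, which the configuration above already defines but does not enable, instead of at the awsfirehose port:

extensions:
  health_check:
    endpoint: 0.0.0.0:13133   # example port for the health check listener, separate from the Firehose endpoint on 4433

service:
  extensions: [health_check, sigv4auth]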

github-actions[bot] commented 1 year ago

This issue has been inactive for 60 days. It will be closed in 60 days if there is no activity. To ping code owners by adding a component label, see Adding Labels via Comments, or if you are unsure of which component this issue relates to, please ping @open-telemetry/collector-contrib-triagers. If this issue is still relevant, please ping the code owners or leave a comment explaining why it is still relevant. Otherwise, please close it.

Pinging code owners:

See Adding Labels via Comments if you do not have permissions to add labels yourself.

github-actions[bot] commented 1 year ago

This issue has been closed as inactive because it has been stale for 120 days with no activity.