vectordotdev / vector

A high-performance observability data pipeline.
https://vector.dev
Mozilla Public License 2.0
18.15k stars, 1.6k forks

prometheus_remote_write is missing tenant_id: 401 Unauthorized #19754

Open suslikas opened 9 months ago

suslikas commented 9 months ago

Problem

The prometheus_remote_write sink does not send the tenant_id (the X-Scope-OrgId header) when basic auth is used. As a result, the health check fails and data cannot be sent to Prometheus (Grafana Mimir).

Configuration

sinks:
      out_websec_mimir:
        type: prometheus_remote_write
        inputs: [ "trans_cf_mimir_json" ]
        endpoint: https://xxx.us-east-1.elb.amazonaws.com/api/v1/push
        auth:
          strategy: basic
          user: data
          password: $WEBSEC_MIMIR_PASS
        tls:
          verify_certificate: false
        healthcheck:
          enabled: true
        tenant_id: websec-logwrap

BTW, when I tested with the loki sink, the health check worked as expected. Data transfer did not, because Mimir is not Loki, but the idea was only to test the health check, since they share the same base.

      out_websec_mimir:
        type: loki
        inputs: [ "trans_cf_mimir_json" ]
        endpoint: https://xxx.us-east-1.elb.amazonaws.com
        path: /api/v1/push
        tenant_id: websec-logwrap
        auth:
          strategy: basic
          user: data
          password: $WEBSEC_MIMIR_PASS
        tls:
          verify_certificate: false
        healthcheck:
          enabled: true
        encoding:
          codec: text
        labels:
          source: vector

Version

Kubernetes, 0.35.0-debian

Debug Output

2024-01-30T17:05:59.668140Z DEBUG http: vector::internal_events::http_client: Sending HTTP request. uri=https://xxx.us-east-1.elb.amazonaws.com/api/v1/push method=GET version=HTTP/1.1 headers={"x-prometheus-remote-write-version": "0.1.0", "content-type": "application/x-protobuf", "content-encoding": "snappy", "authorization": Sensitive, "user-agent": "Vector/0.35.0 (x86_64-unknown-linux-gnu e57c0c0 2024-01-08 14:42:10.103908779)", "accept-encoding": "identity"} body=[empty]

2024-01-30T17:05:59.686489Z DEBUG http: vector::internal_events::http_client: HTTP response. status=401 Unauthorized version=HTTP/1.1 headers={"date": "Tue, 30 Jan 2024 17:05:59 GMT", "content-type": "text/html", "content-length": "179", "connection": "keep-alive", "server": "nginx/1.24.0", "www-authenticate": "Basic realm=\"Mimir\""} body=[179 bytes]

Request sniffed with nc (no X-Scope-OrgId header is sent)

listening on [any] 8888 ...
GET /api/v1/push HTTP/1.1
x-prometheus-remote-write-version: 0.1.0
content-type: application/x-protobuf
content-encoding: snappy
authorization: Basic ZGF...UA==
user-agent: Vector/0.35.0 (x86_64-unknown-linux-gnu e57c0c0 2024-01-08 14:42:10.103908779)
accept-encoding: identity
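
The check done by eye on the nc capture can also be scripted. The following is a minimal sketch, not part of the original report: `parse_headers` is a hypothetical helper name, and `CAPTURED` is the dump from above pasted verbatim (including the truncated authorization value). It parses the raw request and reports which headers are present.

```python
def parse_headers(raw_request: str) -> dict:
    """Parse a raw HTTP/1.1 request dump into {lowercased-name: value}."""
    lines = raw_request.strip().splitlines()
    headers = {}
    for line in lines[1:]:  # skip the request line ("GET /api/v1/push HTTP/1.1")
        if not line.strip():
            break  # a blank line ends the header section
        # Split on the first colon only, so values containing ':' stay intact.
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return headers


# The request as captured by nc in this issue (authorization value is truncated
# in the report and is left truncated here).
CAPTURED = """\
GET /api/v1/push HTTP/1.1
x-prometheus-remote-write-version: 0.1.0
content-type: application/x-protobuf
content-encoding: snappy
authorization: Basic ZGF...UA==
user-agent: Vector/0.35.0 (x86_64-unknown-linux-gnu e57c0c0 2024-01-08 14:42:10.103908779)
accept-encoding: identity
"""

headers = parse_headers(CAPTURED)
print("authorization" in headers)  # True: basic auth is sent
print("x-scope-orgid" in headers)  # False: the tenant header is missing
```

Header names are compared in lowercase because HTTP header names are case-insensitive, so X-Scope-OrgId and X-Scope-OrgID are treated as the same header.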

Example Data

No response

Additional Context

Auth with curl works fine (the request gets a 405 instead of a 401):

curl -I -u "data:Bou...YW" https://xxx.us-east-1.elb.amazonaws.com/api/v1/push -k -H "X-Scope-OrgId: websec-logwrap"
HTTP/2 405
date: Tue, 30 Jan 2024 09:55:39 GMT
server: nginx/1.24.0
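
For comparison with the Vector request, the curl call above can be approximated in Python. This is a sketch under assumptions: the URL and password are placeholders (the real values are elided in this issue), `build_push_request` is a hypothetical helper, and the request is only constructed, never sent. It shows the two headers the working request carries together: Authorization and X-Scope-OrgID.

```python
import base64
import urllib.request


def build_push_request(url: str, user: str, password: str, tenant_id: str):
    """Build (but do not send) a HEAD request like `curl -I -u user:pass -H X-Scope-OrgId:...`."""
    # Basic auth is "Basic " + base64("user:password").
    token = base64.b64encode(f"{user}:{password}".encode()).decode()
    req = urllib.request.Request(url, method="HEAD")  # curl -I issues a HEAD request
    req.add_header("Authorization", f"Basic {token}")
    req.add_header("X-Scope-OrgID", tenant_id)
    return req


# Placeholder endpoint and password; only the user and tenant come from the issue.
req = build_push_request(
    "https://mimir.example.invalid/api/v1/push",
    "data",
    "example-password",
    "websec-logwrap",
)

# urllib stores header names capitalized ("X-scope-orgid"); this does not matter
# on the wire because HTTP header names are case-insensitive.
for name, value in req.header_items():
    print(f"{name}: {value}")
```

The request Vector sends in the debug output has the Authorization header but not the tenant header, which is the difference this issue is about.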

References

No response

jszwedko commented 9 months ago

Thanks @suslikas. It sounds like we need to expose an option to set the X-Scope-OrgId HTTP header on the Prometheus Remote Write sink. This seems like a feature request rather than a bug, so I'll update the labels.

suslikas commented 9 months ago

From my point of view it is a bug, because the docs say:

If set, a header named X-Scope-OrgID is added to outgoing requests with the value of this setting.

Don't forget that without this header, auth for multi-tenant Prometheus does not work, because the auth request contains user + password + tenant.

But as you wish :)

jszwedko commented 9 months ago

Aha, I stand corrected, thanks @suslikas. I see the intent is for tenant_id to control that. It does seem to be implemented, so there may be some bug.

suslikas commented 9 months ago

Quick workaround with HAProxy: point the sink at an HAProxy instance that adds the headers.

    sinks:
      out_websec_mimir:
        type: prometheus_remote_write
        inputs: [ "log_to_metric" ]
        endpoint: http://websec-logwrap-general-haproxy.websec-logwrap-general.svc.cluster.local/api/v1/push

    listen mimir-wrapper
      bind 0.0.0.0:80
      mode http
      balance roundrobin
      option httplog
      http-request add-header X-Scope-OrgId "websec-logwrap"
      http-request add-header X-Haproxy-Wrappwer "yes"
      http-request add-header Authorization "Basic ZG...A=="
      server mimir-aws-lb-1 xxx.us-east-1.elb.amazonaws.com:443 ssl verify none check

davinkevin commented 9 months ago

Same problem here, with Mimir and Vector involved. +1 on this.

dekelpilli commented 9 months ago

Just adding that we're not experiencing this problem (with 0.35 or 0.36):

[sinks.prw]
type      = "prometheus_remote_write"
inputs    = [ "..." ]
endpoint  = "http://.../api/v1/push"
tenant_id = "tid"
healthcheck.enabled = false

Perhaps something with the auth headers is overriding the X-Scope-OrgId header.