vectordotdev / vector

A high-performance observability data pipeline.
https://vector.dev
Mozilla Public License 2.0

Prometheus remote write does not strip empty labels #20799

Open bossm8 opened 2 months ago

bossm8 commented 2 months ago

Problem

The Prometheus remote write spec enforces certain label policies, documented in the remote write spec. For example, label names and values must not be empty. However, the prometheus_remote_write sink neither enforces nor documents this requirement. As a result, a lot of metrics end up being dropped, because the error is not retriable:

{"host":"***","internal_log_rate_limit":true,"message":"Not retriable; dropping the request.","metadata":{"kind":"event","level":"ERROR","module_path":"vector::sinks::util::retries","target":"vector::sinks::util::retries"},"pid":2426227,"reason":"\"Http status: 409 Conflict\"","source_type":"internal_logs","vector":{"component_id":"argos_remote_write","component_kind":"sink","component_type":"prometheus_remote_write"}}

I found the issue while scraping the Prometheus node-exporter, which exposes metrics with empty label values, for example:

node_disk_device_mapper_info{device="dm-0",lv_layer="",lv_name="root",name="rhel-root",uuid="LVM-VMQXe9Dt2J1gHYh00P1cNjzJqtt27hpR9ItwMCb2RJ5ULqcXYdu3361Uf5nY4K9A",vg_name="rhel"} 1 1720177839717

The lv_layer value is empty but will still be sent by the sink.

Unfortunately I don't have the receiver logs, but I then stumbled across this issue and wrote a small transform that removes all labels with an empty key or value:

.tags = filter(object!(.tags)) -> |key, value| { value != "" && key != "" }

Since then the errors are gone. (Below is the non-working config; just add the line above to the transform source and it will work.)

Configuration

sources:
  node_exporter:
    type: prometheus_scrape
    endpoints:
      - http://localhost:9100/metrics

transforms:
  argos_metrics_node_exporter:
    type: remap
    inputs:
      - node_exporter
    source: |
      .tags.job = "node-exporter"

sinks:
  argos_remote_write:
    type: prometheus_remote_write
    endpoint: https://<receive-url>/api/v1/receive
    healthcheck:
      enabled: false
    inputs:
      - argos_metrics_*
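
For comparison, here is a sketch of the transform with the workaround applied. It assumes the same component names as in the config above. Only the remap source changes, and the source and sink stay as they are. Where the filter line sits relative to the job tag is my assumption; it should not matter here, since the job value is non-empty.

```yaml
transforms:
  argos_metrics_node_exporter:
    type: remap
    inputs:
      - node_exporter
    source: |
      .tags.job = "node-exporter"
      # Workaround: drop every tag whose key or value is empty, so that
      # labels like lv_layer="" are never handed to the remote write sink.
      .tags = filter(object!(.tags)) -> |key, value| { value != "" && key != "" }
```

With this in place, the node_disk_device_mapper_info sample above would be sent without the lv_layer label, while all non-empty labels are kept.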

Version

vector 0.39.0 (x86_64-unknown-linux-gnu 73da9bb 2024-06-17 16:00:23.791735272)

jszwedko commented 2 months ago

Thanks for this detailed report @bossm8