open-telemetry / opentelemetry-collector-contrib

Contrib repository for the OpenTelemetry Collector
https://opentelemetry.io
Apache License 2.0

Filtering metrics by label/attribute (via regex) #36038

Open LilWatson opened 2 days ago

LilWatson commented 2 days ago

Component(s)

processor/filter

What happened?

Description

My original goal was to include only metrics whose namespace label starts with a given string. Since this did not work, I tried filtering on a static string instead, which also did not behave as expected.

Steps to Reproduce

1. kind create cluster
2. Install mimir (chart default values)
3. Install kube-state-metrics exporter (chart default values)

# kustomize.yaml with 
helmCharts:
# https://github.com/open-telemetry/opentelemetry-helm-charts/tree/main/charts/opentelemetry-collector
- name: opentelemetry-collector
  releaseName: tenant
  valuesInline:
    mode: deployment
    image:
      repository: otel/opentelemetry-collector-contrib
    config:
      extensions:
        health_check:
          endpoint: ${env:MY_POD_IP}:13133
      receivers:
        prometheus:
          config:
            scrape_configs:
              - job_name: opentelemetry-collector
                scrape_interval: 10s
                static_configs:
                  - targets:
                    # ClusterIP of kube-state-metrics
                    - 10.96.87.132:8080
      processors:
        # Based on https://github.com/open-telemetry/opentelemetry-collector-contrib/blob/v0.96.0/processor/filterprocessor/README.md#alternative-config-options
        filter:
          metrics:
            include:
              match_type: regexp
              resource_attributes:
                - key: namespace
                  value: graf.*
      exporters:
        debug:
          verbosity: normal
        prometheusremotewrite:
          endpoint: http://mimir-nginx.mimir.svc:80/api/v1/push
          headers:
            X-Scope-OrgID: my-project
      service:
        extensions:
          - health_check
        pipelines:
          metrics:
            receivers: [prometheus]
            processors: [filter]
            exporters: [prometheusremotewrite, debug]
  version: 0.108.0
  repo: https://open-telemetry.github.io/opentelemetry-helm-charts

Expected Result

The original config should include only metrics/timeseries whose namespace label starts with "graf". The additional filters I have tried (see the configuration section) should exclude timeseries whose namespace label is set to "grafana".

Actual Result

With the original config, the output also includes metrics/timeseries whose namespace label does not start with "graf". With the additional filters, the output still includes timeseries whose namespace label is set to "grafana".

Collector version

v0.111.0

Environment information

Environment

OS: macOS 14.0 (23A344)

OpenTelemetry Collector configuration

extensions:
  health_check:
    endpoint: ${env:MY_POD_IP}:13133
receivers:
  prometheus:
    config:
      scrape_configs:
        - job_name: opentelemetry-collector
          scrape_interval: 10s
          static_configs:
            - targets:
              - 10.96.87.132:8080
processors:
  # Based on https://github.com/open-telemetry/opentelemetry-collector-contrib/blob/v0.96.0/processor/filterprocessor/README.md#alternative-config-options
  filter:
    metrics:
      include:
        match_type: regexp
        resource_attributes:
          - key: namespace
            value: graf.*
exporters:
  debug:
    verbosity: normal
  prometheusremotewrite:
    endpoint: http://mimir-nginx.mimir.svc:80/api/v1/push
    headers:
      X-Scope-OrgID: my-project
service:
  extensions:
    - health_check
  pipelines:
    metrics:
      receivers: [prometheus]
      processors: [filter]
      exporters: [prometheusremotewrite, debug]

# Additional filter processors

# Based on https://github.com/open-telemetry/opentelemetry-collector-contrib/blob/main/processor/filterprocessor/README.md#examples
filter:
  metrics:
    datapoint:
    - 'resource.attributes["namespace"] == "grafana"'

# Based on https://github.com/open-telemetry/opentelemetry-collector-contrib/blob/main/processor/filterprocessor/README.md#hasattrondatapoint
filter:
  metrics:
    metric:
      - 'HasAttrOnDatapoint("namespace", "grafana")'

# Based on https://github.com/open-telemetry/opentelemetry-collector-contrib/blob/v0.96.0/processor/filterprocessor/README.md#using-an-expr-match_type
filter:
  metrics:
    exclude:
      match_type: expr
      expressions:
        - Label("namespace") == "grafana"

Log output

# Original config
k logs -l "app.kubernetes.io/instance=tenant" -n otel-col | grep -v namespace=grafana
kube_daemonset_status_number_available{daemonset=kube-proxy,namespace=kube-system} 1

# Additional filter processors
k logs -l "app.kubernetes.io/instance=tenant" -n otel-col | grep namespace=grafana
kube_pod_container_status_running{container=grafana,namespace=grafana,pod=grafana-684d8b87db-7zklg,uid=bf6ffb23-746b-476b-8e95-fd6a26da572a} 1

Additional context

The include/exclude filter configuration with regex examples has been removed from the README. While using it, I saw neither a configuration error nor a warning message. Is it still supported? If not, what is the current way to filter metrics by label/attribute via regex?
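For comparison, here is a rough sketch of an OTTL-based variant of the original include rule. It has not been verified against this setup and rests on two assumptions: that the prometheus receiver stores the scraped namespace label as a datapoint attribute rather than a resource attribute, and that the OTTL IsMatch function is usable in the datapoint context of the filter processor in this collector version. Conditions listed under datapoint drop whatever they match, so the regex is negated in order to keep only namespaces starting with "graf".

```yaml
processors:
  filter:
    # Assumption: log and skip condition evaluation errors instead of failing the batch.
    error_mode: ignore
    metrics:
      datapoint:
        # Assumption: "namespace" is a datapoint attribute, not a resource attribute.
        # Drops every datapoint whose namespace does NOT start with "graf",
        # which leaves only the graf* namespaces in the pipeline.
        - 'not IsMatch(attributes["namespace"], "^graf")'
```

If the label does turn out to be a resource attribute, the same condition would presumably need resource.attributes["namespace"] in place of attributes["namespace"].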

Thanks in advance! :)

github-actions[bot] commented 2 days ago

Pinging code owners: