open-telemetry / opentelemetry-collector-contrib

Contrib repository for the OpenTelemetry Collector
https://opentelemetry.io
Apache License 2.0

Bug in `metricstransform` when using `experimental_match_labels` #17530

Closed sfc-gh-aivanou closed 1 year ago

sfc-gh-aivanou commented 1 year ago

Component(s)

metricstransform

What happened?

Description

experimental_match_labels does not handle / correctly.

Steps to Reproduce

Run a transform whose experimental_match_labels entry has / as the label value:

      metricstransform/prod-node:
        transforms:
          - include: node_filesystem_free_bytes
            action: update
            new_name: node.filesystem.root.free.bytes
            experimental_match_labels: {"mountpoint": "/"}
            match_type: strict

Expected Result

All data points with mountpoint = "/" should be moved to a new metric named node.filesystem.root.free.bytes.

Actual Result

metricstransform is unable to match /.

Collector version

otel/opentelemetry-collector-contrib:latest

Environment information

Environment

OS: (e.g., "Ubuntu 20.04") Compiler(if manually compiled): (e.g., "go 14.2")

Image: otel/opentelemetry-collector-contrib:latest

OpenTelemetry Collector configuration

      metricstransform/prod-node:
        transforms:
          - include: node_filesystem_free_bytes
            action: update
            new_name: node.filesystem.root.free2.bytes
            experimental_match_labels: {"mountpoint": "/"}
            match_type: strict

Log output

No response

Additional context

No response

github-actions[bot] commented 1 year ago

Pinging code owners for processor/metricstransform: @dmitryax. See Adding Labels via Comments if you do not have permissions to add labels yourself.

dmitryax commented 1 year ago

@sfc-gh-aivanou thanks for reporting. Can you please provide a complete configuration that can be used to reproduce the issue?

andrzej-stencel commented 1 year ago

@sfc-gh-aivanou I think the problem here is not the slash character /. In my testing, the processor handles it without issues.

I think the problem is that you're trying to update the name of the metric when you should rather be creating a new metric. To verify this, change the action: update to action: insert and see if the new metric node.filesystem.root.free2.bytes gets created. If it does, this will prove that the experimental_match_labels: {"mountpoint": "/"} configuration works correctly.
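The check suggested above can be sketched as follows. This is the reporter's own transform with only the action changed; the surrounding `processors:` key is added for context and assumes the snippet sits in a regular Collector configuration.

```yaml
processors:
  metricstransform/prod-node:
    transforms:
      # Same matching conditions as in the report; only `action` differs.
      - include: node_filesystem_free_bytes
        match_type: strict
        experimental_match_labels: {"mountpoint": "/"}
        # `insert` creates a new metric instead of renaming the existing one.
        action: insert
        new_name: node.filesystem.root.free2.bytes
```

If node.filesystem.root.free2.bytes appears in the exporter output, the label match on / is working.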

Note that you cannot always use action: update when using experimental_match_labels. The general rule is, you can only use action: update when the matching conditions specified in your configuration match all the data points in the metric. Specifically, if your input metric node_filesystem_free_bytes has data points with different values of the mountpoint label, the whole metric cannot be renamed.

Here's an example: suppose your input metric looks like this when logged with the logging exporter:

2023-01-16T22:03:22.103+0100    info    MetricsExporter {"kind": "exporter", "data_type": "metrics", "name": "logging", "#metrics": 1}
2023-01-16T22:03:22.103+0100    info    ResourceMetrics #0
Resource SchemaURL: https://opentelemetry.io/schemas/1.9.0
ScopeMetrics #0
ScopeMetrics SchemaURL:
InstrumentationScope ...
Metric #0
Descriptor:
     -> Name: node_filesystem_free_bytes
     ...
NumberDataPoints #0
Data point attributes:
     -> mountpoint: Str(/)
StartTimestamp: 2023-01-16 20:09:09 +0000 UTC
Timestamp: 2023-01-16 21:03:22.102860682 +0000 UTC
Value: 111
NumberDataPoints #1
Data point attributes:
     -> mountpoint: Str(/boot)
StartTimestamp: 2023-01-16 20:09:09 +0000 UTC
Timestamp: 2023-01-16 21:03:22.102860682 +0000 UTC
Value: 222

This is one metric node_filesystem_free_bytes with two data points in it. The Prometheus representation would be something like this:

node_filesystem_free_bytes{mountpoint="/"} 111
node_filesystem_free_bytes{mountpoint="/boot"} 222

but in the OTLP data structure, both data points share the same metric name. This is why you cannot rename the metric: the name is shared with data points whose mountpoint value is not /.

I hope this makes sense and @dmitryax please correct me if I'm wrong.😅

sfc-gh-aivanou commented 1 year ago

Thank you @astencel-sumo! It works if I insert a new metric, but I wonder if it should also work with the update action?

This works without the experimental_match_labels parameter:

            - include: node_filesystem_free_bytes
              action: update
              new_name: node.filesystem.root.free3.bytes
              match_type: strict

But when I add experimental_match_labels, only the insert action works.

Here is the full config:

apiVersion: v1
kind: ConfigMap
metadata:
  name: collector
  namespace: system-metrics
data:
  config.yaml: |
    receivers:
      prometheus:
        config:
          scrape_configs:

            - job_name: 'node-exporter'
              scrape_interval: 90s
              kubernetes_sd_configs:
                - role: endpoints

              metric_relabel_configs:
              - source_labels: [ __name__ ]
                regex: '(node_filesystem_free_bytes|node_cpu_seconds_total|node_load5|node_memory_MemAvailable_bytes|node_memory_MemTotal_bytes)'
                action: keep

              relabel_configs:
              - source_labels: [__address__]
                regex: ^(.*):\d+$
                target_label: __address__
                replacement: $$1:9100
              - source_labels: [__meta_kubernetes_endpoints_name]
                regex: 'node-exporter'
                action: keep
              - source_labels: [__meta_kubernetes_endpoints_label_type]
                target_label: TYPE

    processors:
      metricstransform/node:
        transforms:
            - include: node_filesystem_free_bytes
              action: update
              new_name: node.filesystem.root.free4.bytes
              experimental_match_labels: {"mountpoint": "/"}
              match_type: strict

    extensions:
      zpages: {}
      memory_ballast:
        size_mib: 4500
    exporters:
      prometheus:
        endpoint: "0.0.0.0:9001"
        resource_to_telemetry_conversion:
          enabled: true
      logging:
        loglevel: info
        sampling_initial: 5
        sampling_thereafter: 200

    service:
      telemetry:
        logs:
          level: "info"
      extensions: [zpages, memory_ballast]

      pipelines:
        metrics/1:
          receivers: [prometheus]
          processors: [
            metricstransform/node
          ]
          exporters: [prometheus]

github-actions[bot] commented 1 year ago

This issue has been inactive for 60 days. It will be closed in 60 days if there is no activity. To ping code owners by adding a component label, see Adding Labels via Comments, or if you are unsure of which component this issue relates to, please ping @open-telemetry/collector-contrib-triagers. If this issue is still relevant, please ping the code owners or leave a comment explaining why it is still relevant. Otherwise, please close it.

Pinging code owners:

See Adding Labels via Comments if you do not have permissions to add labels yourself.

andrzej-stencel commented 1 year ago

Yes, your example confirms the statements from my comment. You cannot always use action: update when using experimental_match_labels.

> I wonder if it should also work with update action?

As described in my comment above - it can, but only if all the data points in the renamed metric match the condition specified in the experimental_match_labels property.

github-actions[bot] commented 1 year ago

This issue has been closed as inactive because it has been stale for 120 days with no activity.