open-telemetry / opentelemetry-collector-contrib

Contrib repository for the OpenTelemetry Collector
https://opentelemetry.io
Apache License 2.0

filter processor for metrics sending non-matching values to exporter #30176

Closed meSATYA closed 10 months ago

meSATYA commented 10 months ago

Component(s)

processor/filter

What happened?

Description

I am trying to filter metrics so that only metrics having a specific label with a non-empty value are kept and sent to the exporters. Unfortunately, the processor also sends series where the label value is null. It detects the right metric names having those labels, but it cannot ignore the null values.

Steps to Reproduce

Filter configurations I have tried is mentioned in the OpenTelemetry Collector configuration section.

Expected Result

The filter processor should only keep metrics that have the label "my.last.name" with a non-null value.

Actual Result

The filter processor also sends metrics where the label value is null.

Collector version

v0.87.0

Environment information

Environment

OS: (e.g., "Ubuntu 20.04")
Compiler (if manually compiled): (e.g., "go 14.2")

OpenTelemetry Collector configuration

Filter configurations I have tried:

1. filter/egress-otel:
     metrics:
       include:
         match_type: expr
         expressions:
           - Label("my.last.name") == "rout"   # didn't work, sending null values as well

2. filter/egress-otel:
     metrics:
       include:
         match_type: expr
         expressions:
           - Label("my.last.name") != nil   # didn't work, sending null values as well

3. filter/egress-otel:
     metrics:
       include:
         metric_names:
           - .*
         match_type: expr
         expressions:
           - Label("service.name") == "passport" && Label("my.last.name") != nil   # didn't work, sending null values as well

4. filter/egress-otel:
     metrics:
       include:
         match_type: regexp
         metric_names:
           - .*
         resource_attributes:
           - key: service.name
             value: passport
           - key: my.last.name
             value: rout   # didn't work, not filtering any metrics

Log output

No response

Additional context

No response

github-actions[bot] commented 10 months ago

Pinging code owners:

See Adding Labels via Comments if you do not have permissions to add labels yourself.

meSATYA commented 10 months ago

Has anyone else faced this issue?

meSATYA commented 10 months ago

@MacroPower @TylerHelmuth Could you please check if you can reproduce this issue at your end?

vaibhhavv commented 10 months ago

Hi @jpkrohling, @austinlparker, can you guide us here, or maybe point to someone who can? We are stuck in the middle of a use case where we need to filter down the data in the OpenTelemetry Collector and forward it, but the filter processor does not seem to be working as per the requirements.

TylerHelmuth commented 10 months ago

I am still on holiday break but will try to get to this issue next week.

meSATYA commented 10 months ago

Thanks @TylerHelmuth for your response. On further testing, I observed that although the filterprocessor matches the correct label, it not only sends the timeseries having that label but also other timeseries where that label is not present. It seems the filterprocessor works on metric names, not on individual timeseries. Is it possible to export only those timeseries that have the matching label?

For example:

      filter/egress-metrics:
        metrics:
          include:
            match_type: expr  
            expressions:
              - Label("my.last.name") != ""

The output is exported to Mimir, and as you can see in the screenshot, 3 timeseries are exported, whereas the expectation from the filterprocessor is that it should only export the 3rd timeseries, where Label("my.last.name") is "chaplin". It would be great to have your opinion on this.

TylerHelmuth commented 10 months ago

I want to clarify first that metrics do not have attributes (what Prometheus calls labels). Metrics have datapoints, and datapoints have attributes.

Based on your description I'm going to assume you want to KEEP data based on a datapoint attribute and DROP (filter) all other data.

I highly encourage you to try using the filterprocessor's preferred configuration option that uses OTTL. You can see some examples here of how OTTL allows checking for the presence or absence of an attribute.
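
For instance, in the datapoint context, presence and absence of an attribute can be expressed roughly like this (an illustrative sketch only; the processor name and attribute key are placeholders, and a condition that evaluates to true drops the matching data):

processors:
  filter/example:                                # hypothetical name, for illustration only
    error_mode: ignore
    metrics:
      datapoint:
        - attributes["some.attribute"] == nil    # drops datapoints where the attribute is absent
        # - attributes["some.attribute"] != nil  # would instead drop datapoints where it is present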

I'm not totally understanding your exact use case, but if you want to KEEP a metric where ANY datapoint in that metric contains a specific attribute and DROP all other metrics you can do:

processors:
  filter/ottl:
    error_mode: ignore
    metrics:
      metric:
          - HasAttrKeyOnDatapoint("my.last.name")

This configuration uses the HasAttrKeyOnDatapoint function to check whether any datapoint in the metric has an attribute named "my.last.name".

meSATYA commented 10 months ago

Hi @TylerHelmuth, thanks for clarifying the OTel metrics model. The requirement is to forward only those timeseries where "my.last.name" has some non-null value, and simultaneously ignore all other metrics.

I have tested on the opentelemetry-collector latest tag, version v0.91.0, with the JSON payloads below; the following screenshots show the results of these test cases. The behavior is completely opposite to the requirement.

FYI, across the 3 test cases the JSON payload is mostly the same; only the timestamps were changed each time.

Test case 1: JSON Payload: { "resourceMetrics":[ { "resource":{ "attributes":[ { "key":"service.name", "value":{ "stringValue":"bp1" } } ] }, "scopeMetrics":[ { "scope":{ }, "metrics":[ { "name":"cpuu-int", "unit":"1", "sum":{ "dataPoints":[ { "attributes":[ { "key":"name1", "value":{ "stringValue":"v1" } } ], "startTimeUnixNano":"1704810109000000000", "timeUnixNano":"1704810109000000000", "asInt":"239" }, { "attributes":[ { "key":"name2", "value":{ "stringValue":"v2" } } ], "startTimeUnixNano":"1704810109000000000", "timeUnixNano":"1704810109000000000", "asInt":"569" }, { "attributes":[ { "key":"my.last.name", "value":{ "stringValue":"chaplin" } } ], "startTimeUnixNano":"1704810109000000000", "timeUnixNano":"1704810109000000000", "asInt":"369" } ], "aggregationTemporality":2, "isMonotonic":true } }, { "name":"memory-int", "unit":"1", "sum":{ "dataPoints":[ { "attributes":[ { "key":"name3", "value":{ "stringValue":"v3" } } ], "startTimeUnixNano":"1704810109000000000", "timeUnixNano":"1704810109000000000", "asInt":"659" }, { "attributes":[ { "key":"name4", "value":{ "stringValue":"v4" } } ], "startTimeUnixNano":"1704810109000000000", "timeUnixNano":"1704810109000000000", "asInt":"769" } ], "aggregationTemporality":2, "isMonotonic":true } } ] } ] } ] }

Otel filter configuration: [screenshot]

Data exported to Mimir: [screenshots]

Test Case 2: JSON Payload: { "resourceMetrics":[ { "resource":{ "attributes":[ { "key":"service.name", "value":{ "stringValue":"bp1" } } ] }, "scopeMetrics":[ { "scope":{ }, "metrics":[ { "name":"cpuu-int", "unit":"1", "sum":{ "dataPoints":[ { "attributes":[ { "key":"name1", "value":{ "stringValue":"v1" } } ], "startTimeUnixNano":"1704810972000000000", "timeUnixNano":"1704810972000000000", "asInt":"239" }, { "attributes":[ { "key":"name2", "value":{ "stringValue":"v2" } } ], "startTimeUnixNano":"1704810972000000000", "timeUnixNano":"1704810972000000000", "asInt":"569" }, { "attributes":[ { "key":"my.last.name", "value":{ "stringValue":"chaplin" } } ], "startTimeUnixNano":"1704810972000000000", "timeUnixNano":"1704810972000000000", "asInt":"369" } ], "aggregationTemporality":2, "isMonotonic":true } }, { "name":"memory-int", "unit":"1", "sum":{ "dataPoints":[ { "attributes":[ { "key":"name3", "value":{ "stringValue":"v3" } } ], "startTimeUnixNano":"1704810972000000000", "timeUnixNano":"1704810972000000000", "asInt":"659" }, { "attributes":[ { "key":"name4", "value":{ "stringValue":"v4" } } ], "startTimeUnixNano":"1704810972000000000", "timeUnixNano":"1704810972000000000", "asInt":"769" } ], "aggregationTemporality":2, "isMonotonic":true } } ] } ] } ] }

Otel filter configuration: [screenshot]

Data exported to Mimir: [screenshots]

Test Case 3: JSON Payload: { "resourceMetrics":[ { "resource":{ "attributes":[ { "key":"service.name", "value":{ "stringValue":"bp1" } } ] }, "scopeMetrics":[ { "scope":{ }, "metrics":[ { "name":"cpuu-int", "unit":"1", "sum":{ "dataPoints":[ { "attributes":[ { "key":"name1", "value":{ "stringValue":"v1" } } ], "startTimeUnixNano":"1704830258000000000", "timeUnixNano":"1704830258000000000", "asInt":"239" }, { "attributes":[ { "key":"name2", "value":{ "stringValue":"v2" } } ], "startTimeUnixNano":"1704830258000000000", "timeUnixNano":"1704830258000000000", "asInt":"569" }, { "attributes":[ { "key":"my.last.name", "value":{ "stringValue":"chaplin" } } ], "startTimeUnixNano":"1704830258000000000", "timeUnixNano":"1704830258000000000", "asInt":"369" } ], "aggregationTemporality":2, "isMonotonic":true } }, { "name":"memory-int", "unit":"1", "sum":{ "dataPoints":[ { "attributes":[ { "key":"name3", "value":{ "stringValue":"v3" } } ], "startTimeUnixNano":"1704830258000000000", "timeUnixNano":"1704830258000000000", "asInt":"659" }, { "attributes":[ { "key":"name4", "value":{ "stringValue":"v4" } } ], "startTimeUnixNano":"1704830258000000000", "timeUnixNano":"1704830258000000000", "asInt":"769" } ], "aggregationTemporality":2, "isMonotonic":true } } ] } ] } ] }

Otel filter configuration: [screenshot]

Data exported to Mimir: [screenshots]

The expectation is that otel should DROP the "memory-int" metric and forward the "cpuu-int" metric, where the label is not null, but otel is not behaving that way.

Could you please check if my understanding is correct and otel is behaving as expected?

TylerHelmuth commented 10 months ago

and simultaneously ignore all other metrics.

Do you mean it should drop all other metrics? The processor cannot ignore situations - it either keeps the telemetry, forwarding it on to the next component, or drops it.
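
In other words, with the OTTL-based configuration any condition that evaluates to true causes the matching telemetry to be dropped, and everything else is forwarded unchanged to the next component. A minimal illustration (hypothetical metric name, not your config):

processors:
  filter/drop-example:               # hypothetical name, for illustration only
    error_mode: ignore
    metrics:
      metric:
        - name == "metric.to.drop"   # metrics matching this condition are dropped; all others pass through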

meSATYA commented 10 months ago

I mean that the processor should only forward the metrics as per the filter and drop all other metrics.

TylerHelmuth commented 10 months ago

The expectation is that otel should DROP the "memory-int" metric and forward the "cpuu-int" metric, where the label is not null, but otel is not behaving that way.

Looking at the payload I agree with this statement. Can you share the full collector config you're testing with?

meSATYA commented 10 months ago

Sure, no problem. The config below is for test case 3; as you would expect, I changed the filter section accordingly for test cases 1 and 2 mentioned above.

config:
  exporters:
    debug: {}
    logging: {}
    prometheusremotewrite/postman:
        endpoint: http://mimir-nginx.mimir.svc:80/api/v1/push
        headers:
          x-scope-orgid: postman-metrics-testing
        resource_to_telemetry_conversion:
          enabled: true
        timeout: 30s
        tls:
          insecure: true  

  extensions:
    health_check: {}
    memory_ballast: {}

  receivers:
    otlp:
      protocols:
        http:
          endpoint: 0.0.0.0:4318
          cors:
              allowed_origins:
                - "http://*"
                - "https://*"
  processors:
    batch: {}
    filter/ottl:
        error_mode: ignore
        metrics:
          metric:
            - HasAttrOnDatapoint("my.last.name", "")
    memory_limiter: null

  service:
    telemetry:
      metrics:
        address: 0.0.0.0:8888
    extensions:
      - health_check
      - memory_ballast
    pipelines:
      metrics:
        receivers: 
        - otlp
        processors: 
        - memory_limiter
        - batch
        - filter/ottl
        exporters: 
        - debug
        - prometheusremotewrite/postman
      logs:
        receivers: 
        - otlp
        processors: 
        - memory_limiter
        - batch
        exporters: 
        - logging
      traces:
        receivers: 
        - otlp
        processors: 
        - memory_limiter
        - batch
        exporters: 
        - logging

TylerHelmuth commented 10 months ago

Can you try setting verbosity: detailed for the debug exporter and report what metrics you see?
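
For reference, only the debug exporter block needs to change for that; something like:

exporters:
  debug:
    verbosity: detailed   # logs every exported metric with its datapoints and attributes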

TylerHelmuth commented 10 months ago

Oh I see the issue. HasAttrOnDatapoint("my.last.name", "") is checking that the metric has a datapoint with a key called my.last.name with a value of "". That is never true, so all the data is kept.

You want: not HasAttrKeyOnDatapoint("my.last.name"). This will cause metrics without at least one datapoint carrying an attribute key called my.last.name to be dropped. If a metric has a datapoint with an attribute key called my.last.name, it will be kept. The OTel specification doesn't allow attributes to have nil values, so checking for the presence of the key is enough: if the key exists, it has a non-nil value.
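
Dropped into your existing filter/ottl processor, a minimal sketch of that change would be:

processors:
  filter/ottl:
    error_mode: ignore
    metrics:
      metric:
        - not HasAttrKeyOnDatapoint("my.last.name")   # drop metrics with no datapoint carrying this attribute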

meSATYA commented 10 months ago

Thanks for sharing the negation option. I tried not HasAttrKeyOnDatapoint("my.last.name") and it worked: it selected the right metric to forward. But here the challenge is that, out of the 3 time series in the metric "cpuu-int", we need only the time series where the attribute my.last.name is present. Consider a use case where millions of time series are coming in for a single metric and we want to forward only the time series where this attribute key is present. Could you please tell me if this capability is available in OpenTelemetry?

[screenshot]

meSATYA commented 10 months ago

Also, I believe the not HasAttrKeyOnDatapoint("my.last.name") condition is similar to the expression condition Label("my.last.name") != "". Using the latter expression, otel also picks up the "cpuu-int" metric and forwards all 3 time series.

filter/filterwithlabel:
  metrics:
    include:
      match_type: expr
      expressions:
        - Label("my.last.name") != ""

TylerHelmuth commented 10 months ago

But here the challenge is that, out of the 3 time series in the metric "cpuu-int", we need only the time series where the attribute my.last.name is present.

This is actually easier. If you only want to keep datapoints that contain the attribute "my.last.name", you'd do:

processors:
  filter/ottl:
    error_mode: ignore
    metrics:
      datapoint:
          - attributes["my.last.name"] == nil

That will drop any datapoint that does not have the required attribute. If all datapoints in a metric are dropped, the entire metric also gets dropped (since metrics with no datapoints are not allowed).

meSATYA commented 10 months ago

Many thanks @TylerHelmuth. This filtering option now filters at the datapoint level, forwarding only the single matching timeseries of the metric and dropping the rest.

[screenshot]