open-telemetry / opentelemetry-collector-contrib

Contrib repository for the OpenTelemetry Collector
https://opentelemetry.io
Apache License 2.0

How to aggregate gauge data points into a single value in a processor? #23052

Closed kostyaLem closed 1 year ago

kostyaLem commented 1 year ago

Component(s)

processor/metricstransform, processor/transform

Describe the issue you're reporting

Description:

I have a set of data points for each metric. I looked at all the processors and did not find one that would aggregate each metric. How can I collapse each metric in scopeMetrics.metrics to a single value? Thanks for reading (I don't have Slack so I'm posting here) :D

Config.yaml:

receivers:
  otlp:
    protocols:
      http:
        endpoint: 0.0.0.0:4319

processors:
  batch:
    timeout: 25s
  groupbyattrs:
    keys:
      - service.name
      - service.version
      - service.instance.id
  metricstransform:
    transforms:
      - include: process.runtime.dotnet.gc.heap
        action: update
        operations:
          - action: toggle_scalar_data_type

exporters:
  file:
    path: /var/lib/data/collected.json

service:
  pipelines:
    metrics:
      receivers: [ otlp ]
      processors: [ batch, groupbyattrs, metricstransform ]
      exporters: [ file ]

Actual result:

{
    "resourceMetrics": [
        {
            "resource": {
                "attributes": [
                    {
                        "key": "service.name",
                        "value": {
                            "stringValue": "DummyMicroservice"
                        }
                    },
                    {
                        "key": "service.version",
                        "value": {
                            "stringValue": "1.0.0"
                        }
                    },
                    {
                        "key": "service.instance.id",
                        "value": {
                            "stringValue": "b5e7e9b1-e297-45fd-a101-1becc1118ba6"
                        }
                    }
                ]
            },
            "scopeMetrics": [
                {
                    "metrics": [
                        {
                            "description": "GC Heap Size",
                            "gauge": {
                                "dataPoints": [
                                    {
                                        "asDouble": 5569712,
                                        "startTimeUnixNano": "1685741600647452000",
                                        "timeUnixNano": "1685741605618902700"
                                    },
                                    {
                                        "asDouble": 5610672,
                                        "startTimeUnixNano": "1685741600647452000",
                                        "timeUnixNano": "1685741606620596400"
                                    },
                                    {
                                        "asDouble": 5651632,
                                        "startTimeUnixNano": "1685741600647452000",
                                        "timeUnixNano": "1685741607627150400"
                                    },
                                    {
                                        "asDouble": 5774328,
                                        "startTimeUnixNano": "1685741600647452000",
                                        "timeUnixNano": "1685741610623289700"
                                    },
                                    {
                                        "asDouble": 5897208,
                                        "startTimeUnixNano": "1685741600647452000",
                                        "timeUnixNano": "1685741613618195300"
                                    },
                                    {
                                        "asDouble": 5165872,
                                        "startTimeUnixNano": "1685741600647452000",
                                        "timeUnixNano": "1685741601645867100"
                                    },
                                    {
                                        "asDouble": 5536944,
                                        "startTimeUnixNano": "1685741600647452000",
                                        "timeUnixNano": "1685741604627212100"
                                    }
                                ]
                            },
                            "name": "process.runtime.dotnet.gc.heap",
                            "unit": "By"
                        }
                    ],
                    "scope": {
                        "name": "OpenTelemetry.Instrumentation.Runtime",
                        "version": "0.1.0.1"
                    }
                }
            ]
        }
    ]
}

Want to get:

A single value, computed with an average, sum, min, or max function, for each metric in scopeMetrics.

TylerHelmuth commented 1 year ago

Potentially the combine or group actions in the metricstransformprocessor

kostyaLem commented 1 year ago

This doesn't look right. These actions do not support avg, min, max etc =(

TylerHelmuth commented 1 year ago

In that case it is likely that no processor in Contrib currently fits your needs. Either the transformprocessor or the metricstransformprocessor would need updating. Can you provide more in-depth details about how you'd like to collapse scope metrics?

dmitryax commented 1 year ago

@kostyaLem, how did you get this data in the first place? It seems pretty unusual. Metric data points usually have the same timestamp with different attributes. That's what the metrics transform processor expects. But you have different timestamps, so you need time aggregation, which is more of a backend responsibility. A collector component doing time aggregation would need to keep state, with a significant amount of memory allocated to it. We might introduce such a component someday, but it's not a priority at this point.

kostyaLem commented 1 year ago

@TylerHelmuth @dmitryax I prepared a small diagram that describes how I imagine it :D [image]

Thx

dmitryax commented 1 year ago

The diagram doesn't say anything about the aggregation type. It can be read as spatial aggregation, which is supported by the metrics transform processor. You need time aggregation over a predefined time window. It won't be implemented any time soon because it's a backend concern, not a data collection one. We cannot just support metric objects with a set of data points at different timestamps because that solves a very narrow use case. Again, can you let us know which service is sending this kind of data?
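
For reference, spatial aggregation in the metrics transform processor could be configured roughly as below. This is a minimal sketch, not a fix for this issue: the metric name `system.cpu.usage` and the `state` attribute are illustrative placeholders that do not appear in the data above. The `aggregate_labels` operation merges data points across attribute values and keeps only the attributes listed in `label_set`. Per the discussion here, it expects points that share a timestamp, so it is not meant to collapse a series of points taken at different times.

```yaml
processors:
  metricstransform:
    transforms:
      # Placeholder metric name, chosen only for illustration.
      - include: system.cpu.usage
        action: update
        operations:
          # Aggregate away every attribute except those in label_set.
          - action: aggregate_labels
            label_set: [ state ]      # placeholder attribute to keep
            aggregation_type: sum     # mean, min and max are also available
```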

kostyaLem commented 1 year ago

The .NET service sends application performance metrics every minute (threads, CPU usage, memory, etc.). OK. How can I "reduce" the telemetry input stream using the metrics transform processor?

dmitryax commented 1 year ago

> The .NET service sends application performance metrics every minute (threads, CPU usage, memory, etc.).

Do you use OTel .NET instrumentation?

> How can I "reduce" the telemetry input stream using the metrics transform processor?

I don't think the collector has anything to offer to "reduce" the size of such metrics other than dropping some of them by name.
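
Dropping metrics by name is typically done with the filter processor. The snippet below is a sketch under assumptions: it uses the `exclude` / `match_type` / `metric_names` form of the filter configuration, which may differ between collector versions, and it drops the GC heap metric from this issue purely as an example. The processor would also need to be added to the `processors` list of the metrics pipeline.

```yaml
processors:
  filter:
    metrics:
      exclude:
        # Exact-name matching; the metric listed here is only an example.
        match_type: strict
        metric_names:
          - process.runtime.dotnet.gc.heap
```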

kostyaLem commented 1 year ago

> Do you use OTel .NET instrumentation?

Yes.

> I don't think the collector has anything to offer to "reduce" the size of such metrics other than drop some of them by name

Okay. Thanks for taking the time!

TylerHelmuth commented 1 year ago

Producing fewer metrics at the source is also an option. The .NET auto-instrumentation supports an environment variable to disable instrumentation by library.

dmitryax commented 1 year ago

The instrumentation should allow you to increase the measurement interval to produce fewer data points.