open-telemetry / opentelemetry-go

OpenTelemetry Go API and SDK
https://opentelemetry.io/docs/languages/go
Apache License 2.0
5.28k stars 1.08k forks source link

Buckets should be defined at instrument level #689

Closed moorara closed 2 years ago

moorara commented 4 years ago

Context

Currently, when using Prometheus exporter, buckets are set when creating and initializing a Meter and we have to know ahead of time if we want buckets to use Int64 boundaries or Float64 boundaries.

Problem

The boundaries we define is a function of the instrument we measure. Depending on the type of instrument and what the instrument measures, we need to define different buckets. So, we should be able to define buckets per instrument basis.

When defining buckets at Meter level, it means:

And since Meter is supposed to be singleton based on OpenTelemetry specifications, I can make a similar argument that quantiles should also be defined at instrument level. We may not necessary want the same quantiles for every instrument that uses summary aggregation.

jmacd commented 4 years ago

There are several questions and answers here.


OpenTelemetry Metrics is explicitly moving away from instruments named for the action they perform, and moving toward instruments named for how they are used. There is no Histogram instrument, which is part of the reason buckets can't be configured at the instrument level. In the forthcoming 0.4 metrics specification, there will be two instruments that will translate into Prometheus histograms by default, those are going to be named ValueRecorder and ValueObserver.


I'm not sure I understand what you mean about Int64 vs Float64. The choice here is for what kind of numbers you want to put in, there is not a semantic difference, only the domains are different in a way that does not meaningfully affect the default aggregation. If you have both int64 and float64 measurements, it would be appropriate to use a float64 instrument. It looks like Prometheus supports only float64 bucket boundaries. https://godoc.org/github.com/prometheus/client_golang/prometheus#HistogramOpts


About defining bucket boundaries, our intention is to support a Views API for configuring specific outputs based on the metric name and exporter, such as specific label dimensions, histogram boundaries or other aggregation options. See here: https://github.com/open-telemetry/oteps/pull/89. I'm personally expecting to prototype a Views API in the Go SDK in May, after OpenTelemetry releases the 0.4 specification.

To get us to OpenMetrics compatibility, we should have support for linear, log-linear, and arbitrary bucket configurations for histogram aggregators. We've discussed this in the past in the context of the Prometheus exporter: https://github.com/open-telemetry/opentelemetry-go/issues/534

That being said, OpenTelemetry is also pushing away from the use of fixed histogram boundaries. We have better options available today--DDSketch, T-digest, and other mergeable summarization techniques--which work without explicit bucket configuration. These are better options than explicit percentiles as well.


So I think the Views API proposed above is the answer, but just a technical note:

Meter is supposed to be singleton based on OpenTelemetry specifications

It is true that there is a singleton global instance, it is provided as a convenience which allows metric instruments to be configured statically. (Note that Prometheus clients use a global registry, with the same benefit.) The default SDK can be used without being installed as the global. Multiple can co-exist in a process.

tony612 commented 2 years ago

Is there any way to define custom buckets for different instruments now? I still didn't find a way to do this, and https://github.com/open-telemetry/opentelemetry-go/issues/1280 is not resolved.

MrAlias commented 2 years ago

Is there any way to define custom buckets for different instruments now? I still didn't find a way to do this, and #1280 is not resolved.

The revised metric SDK being merged in https://github.com/open-telemetry/opentelemetry-go/pull/3175 contains views. They be used with the new SDK to override histogram bounds per matching criteria (including by instrument name).

Please give them a look. They will be included in the next metric SDK release (expected in a few days).

tony612 commented 2 years ago

@MrAlias I didn't find the method to apply view to the instrument. I can create a view and call TransformInstrument successfully but don't know how to change meter.SyncFloat64().Histogram. Maybe an example can be provided in https://github.com/open-telemetry/opentelemetry-go/blob/main/example/prometheus/main.go ?

My view is like:

    v, err := view.New(
        view.MatchInstrumentName("my_data_bytes"),
        view.MatchInstrumentKind(view.SyncHistogram),
        view.MatchInstrumentationScope(instrumentation.Scope{
            Name: meterName,
        }))

    newInst, match := v.TransformInstrument(view.Instrument{
        Name:        "my_data_bytes",
        Scope:       instrumentation.Scope{Name: meterName},
        Kind:        view.SyncHistogram,
        Aggregation: aggregation.ExplicitBucketHistogram{Boundaries: []float64{128, 512, 2048, 8192, 32768, 65536, 262144, 1048576}},
    })

And this is my instrument:

    i, err  = metric.Meter.SyncInt64().Histogram("my_data_bytes",
        instrument.WithUnit(unit.Bytes),
    )
tony612 commented 2 years ago

Ah, I make it work. Thanks!

btw, I noticed two breaking changes:

  1. Prometheus exporter needs go >= 1.18
  2. . in names will be changed to _ before, but this changes now. And I got error like
    error registering collector: descriptor Desc{fqName: "process.runtime.go.mem.heap_idle", help: "Bytes in idle (unused)        spans", constLabels: {}, variableLabels: []} is invalid: "process.runtime.go.mem.heap_idle" is not a valid metric name

Are these expected?

tony612 commented 2 years ago

And I added an example in prometheus https://github.com/open-telemetry/opentelemetry-go/pull/3177 But I found view.WithRename will create a new metric and not delete the old one. Is this a bug? The example will output

$ curl -s localhost:2222/metrics | grep histogram | grep "#"
# HELP baz a very nice histogram
# TYPE baz histogram
# HELP custom_histogram a histogram with custom buckets and rename
# TYPE custom_histogram histogram
# HELP qux a histogram with custom buckets and rename
# TYPE qux histogram

Besides, a defaultView is needed, otherwise other metrics will be gone. This is a little inconvenient for me.

MrAlias commented 2 years ago

Ah, I make it work. Thanks!

btw, I noticed two breaking changes:

  1. Prometheus exporter needs go >= 1.18
  2. . in names will be changed to _ before, but this changes now. And I got error like
error registering collector: descriptor Desc{fqName: "process.runtime.go.mem.heap_idle", help: "Bytes in idle (unused)        spans", constLabels: {}, variableLabels: []} is invalid: "process.runtime.go.mem.heap_idle" is not a valid metric name

Are these expected?

This was indeed a bug https://github.com/open-telemetry/opentelemetry-go/issues/3183

It should be resolved in v0.32.1

MrAlias commented 2 years ago

And I added an example in prometheus #3177 But I found view.WithRename will create a new metric and not delete the old one. Is this a bug? The example will output

$ curl -s localhost:2222/metrics | grep histogram | grep "#"
# HELP baz a very nice histogram
# TYPE baz histogram
# HELP custom_histogram a histogram with custom buckets and rename
# TYPE custom_histogram histogram
# HELP qux a histogram with custom buckets and rename
# TYPE qux histogram

Besides, a defaultView is needed, otherwise other metrics will be gone. This is a little inconvenient for me.

This is also a bug. You can track its resolution in https://github.com/open-telemetry/opentelemetry-go/issues/3224

MrAlias commented 2 years ago

Closed by #3177