open-telemetry / opentelemetry-go

OpenTelemetry Go API and SDK
https://opentelemetry.io/docs/languages/go
Apache License 2.0
5.18k stars 1.04k forks source link

Feedback about new Metrics API, propose MustMeter #3632

Open bogdandrutu opened 1 year ago

bogdandrutu commented 1 year ago

The new metrics API is much cleaner and more consistent with other languages, congrats on this new released API.

I would like to propose a "MustMeter" as a helper/contrib:


type MustMeter struct {
  delegate Meter
}

func NewMustMeter(delegate Meter) *MustMeter {
  return &MustMeter{delegate: delegate}
}

// Int64Counter returns a new instrument identified by name and configured
// with options. The instrument is used to synchronously record increasing
// int64 measurements during a computational operation.
func (mm *MustMeter) Int64Counter(name string, options ...instrument.Int64Option) instrument.Int64Counter {
  i, err := mm.delegate(name, options...)
  if err != nil {
    panic(err)
  }
  return i
}
MrAlias commented 1 year ago

Given the API is not expected to validate any of the arguments it receives for instrument creation (https://github.com/open-telemetry/opentelemetry-specification/pull/3171). It might make sense for the Meter's instrument creation methods to be updated to not return an error at all. The SDK or other implementations can use the otel.ErrorHandler to handle errors.

Currently, having the error be returned from these creation method means that instrumentation library authors need to "handle" all errors from API implementations (OTel SDK and others). There is really no way they can handle all these errors appropriately and for everybody. The operator of the fully constructed telemetry system is the only one that will know if they need to log these errors, panic, or something else. The mechanism we have for this is to use the otel.ErrorHandler (or just chose to log them).

MrAlias commented 1 year ago

Given the API is not expected to validate any of the arguments it receives for instrument creation (open-telemetry/opentelemetry-specification#3171). It might make sense for the Meter's instrument creation methods to be updated to not return an error at all. The SDK or other implementations can use the otel.ErrorHandler to handle errors.

Currently, having the error be returned from these creation method means that instrumentation library authors need to "handle" all errors from API implementations (OTel SDK and others). There is really no way they can handle all these errors appropriately and for everybody. The operator of the fully constructed telemetry system is the only one that will know if they need to log these errors, panic, or something else. The mechanism we have for this is to use the otel.ErrorHandler (or just chose to log them).

I'm looking into a prototype to explore this.

MrAlias commented 1 year ago

Notes from SIG meeting on #3737 proposal:

sk- commented 9 months ago

While there's no consensus on how to tackle this, may I suggest that we document in which cases a counter or histogram creation could fail. From what I've read in the code there are various cases when that can fail:

1) metric name is invalid (see function validateInstrumentName) https://github.com/open-telemetry/opentelemetry-go/blob/885210bf33238f8c6cc94c3f3711b0036fbe77b1/sdk/metric/meter.go#L295 2) histogram explicit boundaries are invalid https://github.com/open-telemetry/opentelemetry-go/blob/885210bf33238f8c6cc94c3f3711b0036fbe77b1/sdk/metric/aggregation.go#L125 3) failure to add instrument https://github.com/open-telemetry/opentelemetry-go/blob/885210bf33238f8c6cc94c3f3711b0036fbe77b1/sdk/metric/pipeline.go#L227

The first two are precondition checks that could easily be wrapped in a Must helper function by the caller, specially in cases where the parameters are static values. Assuming we are sure the values are correct (or the code is covered in tests).

On the other hand, the latter case (instrument creation) is much harder to understand in which conditions could fail. THe code only mentions

// If the passed instrument would result in an incompatible aggregate function,
// an error is returned and that aggregate function output is not inserted nor
// is its input returned.

Documenting under which conditions these calls could fail, would not only be useful to callers to decide whether it makes sense to ignore the error and use a Must like function. It would also useful for testing, as one would have clear indications on how to trigger such errors.

pellared commented 9 months ago

@sk- I think documenting here the acceptable and/or invalid arguments of the instrument factories would be helpful.

Do you want to work on it?

MrAlias commented 8 months ago

SIG meeting notes:

MrAlias commented 8 months ago

If you find this to be a valuable feature. Please leave a :+1: on the issue. We do not plan to add this without significant user desire.