open-telemetry / opentelemetry-proto

OpenTelemetry protocol (OTLP) specification and Protobuf definitions
https://opentelemetry.io/docs/specs/otlp/
Apache License 2.0
576 stars 253 forks source link

Add gauge subtype for prometheus unknown type #496

Closed dashpole closed 8 months ago

dashpole commented 1 year ago

Part of https://github.com/open-telemetry/opentelemetry-specification/issues/3058

This an attempt at defining a "subtype" for OTel metric types, as suggested at the 2/7 Spec SIG meeting.

This is the Proposed solution described in the proposal here: https://github.com/open-telemetry/opentelemetry-specification/issues/3058#issuecomment-1622740102.

jsuereth commented 1 year ago

Should we split the discussion between "Unknown", "Info" and "StatefulSet" metrics?

I'd be more comfortable if we evaluated each change to the protocol individually. E.g. I'm not convinced of the encoding of Info/StatefulSet but I do think encoding Unknown is ok.

tigrannajaryan commented 1 year ago

@dashpole this also requires an "interoperability" explanation: how the new senders and old receivers should work such that nothing is broken.

tigrannajaryan commented 1 year ago

@dashpole one more thing that needs to be explored is what are the metric processors in the Collector supposed to do with the payloads that have these new fields set? If the processor is unaware of the new fields is existing processing logic still valid?

When the Collector is updated to the new version of OTLP the fields will become available in the Collector and will be correctly passed through. However if a processor that knows nothing about these fields transforms a metric can it be a problem?

dashpole commented 1 year ago

this also requires an "interoperability" explanation: how the new senders and old receivers should work such that nothing is broken.

If a new sender sends the new field, but the receiver does not know about it, ignoring it at the receiver is safe. This preserves the existing behavior today without the field.

If an old sender sends data without the new field to a new receiver, using the default value of NONE subtypes is safe. It preserves existing behavior today without the field.

what are the metric processors in the Collector supposed to do with the payloads that have these new fields set? If the processor is unaware of the new fields is existing processing logic still valid?

Processors themselves shouldn't need to do anything with this field, other than preserve it. Pdata's Gauge.CopyTo(), and Sum.CopyTo() would need to copy this field. Otherwise, I don't think any changes are required, and all existing processing logic should still be valid.

When the Collector is updated to the new version of OTLP the fields will become available in the Collector and will be correctly passed through. However if a processor that knows nothing about these fields transforms a metric can it be a problem?

It won't be a problem. One of the primary reasons why this subtype proposal is preferable to additional attributes is that it is entirely "opt-in" for existing users of OTLP, since the subtype is always safe to ignore unless you explicitly care about it.

dashpole commented 1 year ago

@jsuereth I updated the wording for info and stateset based on https://github.com/open-telemetry/opentelemetry-specification/issues/3058#issuecomment-1625428292 in the latest commit. LMK what you think. If you still have concerns, I can cut info and stateset from the proposal.

dashpole commented 1 year ago

This now only contains the gauge subtype for unknown. Given uncertainties around the future of OpenMetrics, it probably isn't the best idea to add those types right now.

This should resolve @jsuereth's concerns as well.

dashpole commented 1 year ago

can you add the details from your PR comment into the protocol as comments for interaction?

Done. Added:

  // When receiving from older producers of OTLP, it is safe to assume the
  // zero-value for sub_type, GAUGE_SUBTYPE_NONE. When sending OTLP to older
  // consumers, it is safe for them to ignore the field. In both cases, the
  // previous behavior without sub_type is maintained until producer and
  // consumer are aware of the field.
jack-berg commented 1 year ago

FWIW, I prefer the string approach proposed in this comment. I think the chances of collision are small, and I value the flexibility provided by string (i.e. no need to have lengthly discussions for each new proposal) more than I fear the collision potential of strings.

dashpole commented 1 year ago

@jack-berg I'm happy to make that change if there is consensus. I'll raise it tomorrow at the spec SIG meeting to see if there is consensus on the approach.

pirgeo commented 1 year ago

I understand that OTel will go ahead with adding these types. Keeping in mind the potential addition of a couple more metric types that can/should not be produced by the SDKs (https://github.com/open-telemetry/opentelemetry-specification/issues/3058#issuecomment-1616061089), has the usage of a separate metric type instead of using a subtype been discussed somewhere?

dashpole commented 8 months ago

closing in favor of https://github.com/open-telemetry/opentelemetry-proto/pull/514