open-telemetry / opentelemetry-collector

OpenTelemetry Collector
https://opentelemetry.io
Apache License 2.0
4.46k stars 1.47k forks source link

[pdata] Inconsistent output of enum types String() methods #6251

Closed dmitryax closed 2 years ago

dmitryax commented 2 years ago

The pdata module has a few enum type definitions. Some of them defined in proto (2), while others are defined in pdata only (based on proto oneof message fields names) (1).

1. For enum types that are not defined in proto, we use a short version of the enum identifier for the output of String() method in:

e.g.:

MetricTypeNone -> "None"
MetricTypeGauge -> "Gauge"
MetricTypeSum -> "Sum"
MetricTypeHistogram -> "Histogram"
MetricTypeExponentialHistogram -> "ExponentialHistogram"
MetricTypeSummary -> "Summary"

2. For enums that are defined in proto, we have an inconsistent behavior.

One of the proto-defined enums uses the same approach as in (1):

MetricAggregationTemporalityUnspecified -> "Unspecified"
MetricAggregationTemporalityDelta -> "Delta"
MetricAggregationTemporalityCumulative -> "Cumulative"

While other enums (2) reuse the string value defined in proto which is always a full uppercased name:

StatusCodeUnset -> "STATUS_CODE_UNSET"
StatusCodeOk -> "STATUS_CODE_OK"
StatusCodeError -> "STATUS_CODE_ERROR"

We should bring all of the values returned by enum String() method to a consistent form.

There are seem to be several options how to achieve that:

A. Ignore string values defined in proto and use the short terms everywhere. Requires changes in plog.SeverityNumber.String(), ptrace.SpanKind.String(), and ptrace.StatusCode.String()

B. Use long version written in uppercase with underscores averywhere. Requires changes in pmetric.MetricType.String(), pmetric.NumberDataPointValueType.Sgtring, pmetric.ExemplarValueType.String(), and pmetric.ValueType.String()

C. Use values defined in proto for enums (2), but keep the short version for enums (1). Requires changes in pmetric.MetricAggregationTemporality.String() only

bogdandrutu commented 2 years ago

I do believe that we can have different behavior between proto defined values and the types. Does not mean we should, but it is acceptable if we decide that.

dmitryax commented 2 years ago

I do believe that we can have different behavior between proto defined values and the types. Does not mean we should, but it is acceptable if we decide that.

@bogdandrutu I also like this approach, but it'll be different from what json marshaler produces. So user will not be able to compare the json values with pdata unless we provide additional methods like Code returning the proto strings.

I think I'm more inclined towards sticking with what proto provides now for the category 2. Not sure it worth patching json marshaler either.

WDYT?

bogdandrutu commented 2 years ago

@bogdandrutu I also like this approach, but it'll be different from what json marshaler produces. So user will not be able to compare the json values with pdata unless we provide additional methods like Code returning the proto strings.

JSON marshaler per the specs MUST produce ints for the enums, see https://github.com/open-telemetry/opentelemetry-specification/pull/2758

dmitryax commented 2 years ago

@bogdandrutu I inspected all the components that currently use plog.SeverityNumber.String(), ptrace.SpanKind.String(), or ptrace.StatusCode.String() and got the following list separated by categories where the value is being used as:

Output payload:

Input payload:

Config options:

Given that we have a pretty big list, we need to be very careful if we go this direction. Some of the components potentially can use the updated string, but they have to go through some graceful transition process. In the meantime we would have to update all of them to keep existing behavior similar to how it's done in https://github.com/open-telemetry/opentelemetry-collector-contrib/pull/16019.

Another option would be to expose another method that returns a string representing proto enum constant name, so clients can keep using it instead of String without introducing any breaking changes. WDYT?

bogdandrutu commented 2 years ago

Another option would be to expose another method that returns a string representing proto enum constant name, so clients can keep using it instead of String without introducing any breaking changes. WDYT?

The problem is that current protocol may not guarantee stability on these strings, so I don't want give this stability from this repo if proto does not give that.

dmitryax commented 2 years ago

Sounds good. I submitted a PR in contrib that uses a replicated version of existing methods for now https://github.com/open-telemetry/opentelemetry-collector-contrib/pull/16021