Open jsuereth opened 2 years ago
If we follow the previous work for Prometheus, then we can leverage the fact that OpenTelemetry metrics rely on service Resource attributes for identity:
- `service.name` - currently required for all OTEL resources
- `service.instance.id` - currently only recommended for OTEL resources

These attributes become required and identifying for all resources. All other attributes are considered "descriptive" and may be appended to a resource at any time.
Specifically, this enables the following:
- Updating / Appending attributes to telemetry in the Collector. We can consider downstream refinement "more accurate".
- Allow late-arriving resource attributes. Backends can merge descriptive attributes at any time as long as identifying attributes remain.
- Allowing descriptive resource attributes to be sent via a separate channel (e.g. prometheus _info metrics).
- Clear join semantics for x-signal correlation at the resource level (e.g. if attempting to query across both datasources in a SQL database, what keys would you use?)
we can leverage the fact that OpenTelemetry metrics rely on service Resource attributes for identity:
- `service.name` - currently required for all OTEL resources
- `service.instance.id` - currently only recommended for OTEL resources

These attributes become required and identifying for all resources. All other attributes are considered "descriptive" and may be appended to a resource at any time.
This is only sufficient for Resources which are Services. I think we can say that Resources that are emitted using Otel SDK are Services (augmented with additional useful non-Service attributes).
However, not all Resources that we can imagine in Otel are Services. For example the Collector emits metrics about a Host or about a Process. The Host and the Process are the Resource in this case. There is no Service to associate this data with. There are many other such examples possible.
I think we need to generalize this beyond `service.name` and `service.instance.id`. Given the above, I believe we can say the following: the identity of a metric timeseries must include the minimal set of the identifying attributes of its Resource plus the identifying attributes of the entities in the context of which the identity is defined. So, for example, for a metric of a Service both the (`service.name`, `service.instance.id`) attributes and `service.namespace` (if present) are part of the timeseries identity.
I think the consequence of this is that the definition of a telemetry emitting Resource must include not only its identifying attributes but also reference the containing Resource in the context of which the identity is defined.
As for the cardinality of some identifiers: I think it is the reality of telemetry data that some Resources have naturally high cardinality. It is an important inherent property of theirs. I do not want us to be perpetually locked in a world where the technical limitations of some technologies dictate our understanding of what comprises the identity of the object of interest and force us to drop vital information about the object right at the source.
For example, when emitting metrics about the Process I don't think we should omit the process id. Whether it is high cardinality or not, it is the single attribute that correctly and uniquely identifies the Process (within its Host). If a certain destination has trouble dealing with the cardinality of the attribute, either it is that particular destination's problem to deal with, or, if we believe it is our problem, then we can think about how to record the knowledge about the expected cardinality of attributes and use it for lossy transformation in the exporter targeting that particular destination, while retaining full fidelity when sending data over OTLP to destinations which do not have such cardinality limitations. Perhaps this means that high-cardinality is one of the labels that we associate with a particular attribute in the semantic conventions, record the label in the Schema files, and use the Schema file for automatic cardinality reduction in the exporters.
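As a rough sketch of that last idea, here is what schema-driven cardinality reduction in an exporter could look like. Everything here (the `SCHEMA_CARDINALITY` table, the attribute labels, the function name) is invented for illustration; no such API exists in OTel today:

```python
# Hypothetical schema fragment that labels attributes by expected cardinality.
# In the real proposal this knowledge would live in Schema files, not in code.
SCHEMA_CARDINALITY = {
    "process.pid": "high",
    "host.name": "low",
    "service.name": "low",
}

def reduce_for_destination(attributes, destination_tolerates_high_cardinality):
    """Drop attributes labeled high-cardinality when the destination
    cannot handle them; keep full fidelity otherwise (e.g. over OTLP)."""
    if destination_tolerates_high_cardinality:
        return dict(attributes)
    return {
        k: v for k, v in attributes.items()
        if SCHEMA_CARDINALITY.get(k, "low") != "high"
    }

attrs = {"process.pid": "1234", "host.name": "web-1", "service.name": "checkout"}
reduce_for_destination(attrs, destination_tolerates_high_cardinality=False)
```

The point is that the lossy transformation happens only in the exporter targeting the constrained destination, while the original attributes travel intact over OTLP.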
Specifically, this enables the following:
- Updating / Appending attributes to telemetry in the Collector. We can consider downstream refinement "more accurate".
- Allow late-arriving resource attributes. Backends can merge descriptive attributes at any time as long as identifying attributes remain.
- Allowing descriptive resource attributes to be sent via a separate channel (e.g. prometheus _info metrics).
- Clear join semantics for x-signal correlation at the resource level (e.g. if attempting to query across both datasources in a SQL database, what keys would you use?)
+1 on all of this with one caveat. The non-identifying attributes of the Resource may change over time. This seems to contradict our current definition of a Resource, which says it is immutable. However, I don't think there is a contradiction. We need to revise the definition to the following: the Resource emitted by an Otel SDK is immutable within the lifetime of the SDK runtime. Outside of that, the Resource's non-identifying attributes can certainly mutate, and there is no reasonable mechanism in the Otel SDK that can prevent that, so we may as well embrace it.
An example of a mutating non-identifying attribute is the following: a Kubernetes Pod is uniquely identified by the (k8s.cluster.name, k8s.namespace.name, k8s.pod.name) triplet. The Pod may also have multiple labels associated with it, which can be recorded as non-identifying attributes. In the Kubernetes world these labels can change over time, and so, as a consequence, can the non-identifying attributes that record the labels.
Given that non-identifying attributes of a Resource may mutate over time, I think we need to be careful with saying that "Backends can merge descriptive attributes at any time". This is certainly doable as long as we understand that the values of merged attributes can change over time; they should not be considered immutable, and no assumption of immutability should be made when querying the stored data.
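To make the "clear join semantics" bullet above concrete, here is a minimal sketch of joining two signals on the proposed identifying resource attributes. The record shapes and values are invented for the example:

```python
# Toy "tables" of metric streams and spans, keyed by the proposed
# identifying resource attributes (values are illustrative only).
METRIC_STREAMS = [
    {"service.name": "checkout", "service.instance.id": "a1", "metric": "http.latency"},
]
SPANS = [
    {"service.name": "checkout", "service.instance.id": "a1", "span": "GET /cart"},
]

IDENTIFYING = ("service.name", "service.instance.id")

def join_on_identity(left, right):
    """Pair up records whose identifying resource attributes match."""
    index = {}
    for row in right:
        index.setdefault(tuple(row[k] for k in IDENTIFYING), []).append(row)
    return [
        (row, match)
        for row in left
        for match in index.get(tuple(row[k] for k in IDENTIFYING), [])
    ]

pairs = join_on_identity(METRIC_STREAMS, SPANS)
```

Descriptive attributes never participate in the join key, which is why they can arrive late or mutate without breaking correlation.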
Tigran - You have a lot of good thoughts here so I'm going to respond to things in individual comments (hopefully easier to reference).
As for the cardinality of some identifiers: I think it is the reality of telemetry data that some Resources have naturally high cardinality. It is an important inherent property of theirs. I do not want us to be perpetually locked in a world where the technical limitations of some technologies dictate our understanding of what comprises the identity of the object of interest and force us to drop vital information about the object right at the source.
For example, when emitting metrics about the Process I don't think we should omit the process id. Whether it is high cardinality or not, it is the single attribute that correctly and uniquely identifies the Process (within its Host). If a certain destination has trouble dealing with the cardinality of the attribute, either it is that particular destination's problem to deal with, or, if we believe it is our problem, then we can think about how to record the knowledge about the expected cardinality of attributes and use it for lossy transformation in the exporter targeting that particular destination, while retaining full fidelity when sending data over OTLP to destinations which do not have such cardinality limitations. Perhaps this means that high-cardinality is one of the labels that we associate with a particular attribute in the semantic conventions, record the label in the Schema files, and use the Schema file for automatic cardinality reduction in the exporters.
This is actually where the rub is. Emitting process metrics with an attribute of PID is fine. What I'm objecting to is having PID be an attribute of every metric an SDK produces because that SDK happens to be a process resource. The difference here is nuanced but important.
TODAY, even http latency will have the process command line + PID attached, given the way we do resources. Does that make sense from a metric standpoint? I think this data should be joinable with the actual timeseries, but it doesn't need to be a label on that timeseries itself (e.g. see kube-state-metrics in practice). The key here is we should limit cardinality to what's necessary. In the example of process metrics, PID is necessary and beneficial. I'd argue having PID on HTTP latency metrics, not so much.
The key distinction I want to make here is resource attributes are appended to ALL metrics not just ones where that nuance is important.
I think fundamentally the issue here is that Spans/Logs are inherently an "(sampled) event" model, while Metrics are an "aggregated event" model. I suspect the profiling signal is likely to run into similar concerns/issues for their aggregated events (although profiling will have both aggregated + sampled use cases combined).
However, not all Resources that we can imagine in Otel are Services. For example the Collector emits metrics about a Host or about a Process. The Host and the Process are the Resource in this case. There is no Service to associate this data with. There are many other such examples possible.
This is a great point. However, we see in Prometheus, Jaeger, DataDog, etc. a "simplification" where these Resources are "forced" (or coerced) to belong to a service for simplicity in representation and understanding. You're right that it's not a very accurate model, but is it more useful than something more nuanced? Perhaps. I don't think we should dismiss this simplification just yet though. Remember: all models are lies, but some are useful.
At minimum though, if I read your agreement:
- Resource needs a notion of identifying attributes.
- For any given resource there should be a minimal set of identifying attributes that are clearly delineated.
- We should not allow identifying attributes to be changed. Descriptive attributes being added later is ok (I didn't see any complaints about this).
- There's possibly a notion of "kind" for a Resource that implies its required/identifying attributes (and likely interacts with the merge method).
I'd like to hear your thoughts on keeping a simplified Resource model (I'm challenging my own thought here given what I've seen from e.g. Jaeger, Prometheus + DataDog), but additionally I have some alternative thoughts/designs for Resource around identifying attributes, kinds + how to make progress there.
Given that non-identifying attributes of a Resource may mutate over time, I think we need to be careful with saying that "Backends can merge descriptive attributes at any time". This is certainly doable as long as we understand that the values of merged attributes can change over time; they should not be considered immutable, and no assumption of immutability should be made when querying the stored data.
Yeah. I'm actually thinking of splitting resource into a few components as a way to make non-breaking changes to it in the current form.
More specifically:
```protobuf
// Resource information.
message Resource {
  // Defines the core type of this resource, e.g. k8s_pod, etc.
  // A resource kind defines the minimal set of key attributes that can uniquely identify the resource.
  // If empty, the resource is assumed to be "generic_resource".
  // Note: rather than resource_kind, this could ALSO be a telemetry-schema reference for the resource type.
  string resource_kind = 3;

  // Set of attributes that identify the resource.
  // Attribute keys MUST be unique (it is not allowed to have more than one
  // attribute with the same key).
  repeated opentelemetry.proto.common.v1.KeyValue attributes = 1;

  // Set of attributes that describe the resource.
  // Attribute keys MUST be unique (it is not allowed to have more than one
  // attribute with the same key).
  repeated opentelemetry.proto.common.v1.KeyValue descriptive_attributes = 4;

  // dropped_attributes_count is the number of dropped descriptive attributes. If the value is 0, then
  // no attributes were dropped.
  // Note: No identifying attributes are allowed to be dropped.
  uint32 dropped_attributes_count = 2;
}
```
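One way a consumer could use such a split is to derive timeseries identity from the identifying attributes only, so that descriptive attributes can be appended or mutated later without creating new streams. This is a hypothetical sketch; the hashing scheme and function name are invented, not part of any spec:

```python
import hashlib

def stream_id(metric_name, datapoint_attrs, identifying_attrs):
    """Derive a stable stream key from the metric name, the datapoint
    dimensions, and the resource's *identifying* attributes only.
    Descriptive attributes never enter the key, so changing them
    does not change stream identity."""
    parts = [metric_name]
    parts += [f"{k}={datapoint_attrs[k]}" for k in sorted(datapoint_attrs)]
    parts += [f"{k}={identifying_attrs[k]}" for k in sorted(identifying_attrs)]
    return hashlib.sha256("\x00".join(parts).encode()).hexdigest()

# Same identifying attributes -> same stream, regardless of dict ordering
# or of whatever descriptive attributes ride alongside.
sid = stream_id(
    "http.server.duration",
    {"http.method": "GET"},
    {"service.name": "checkout", "service.instance.id": "a1"},
)
```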
This is actually where the rub is. Emitting process metrics with an attribute of PID is fine. What I'm objecting to is having PID be an attribute of every metric an SDK produces because that SDK happens to be a process resource. The difference here is nuanced but important.
I believe what you say is exactly correct. In my view every metric is emitted for a particular entity. Metrics don't exist in a vacuum. They are attached to some entity that we can pinpoint (this is a simplification that I can argue against as well, but bear with me for now). IMO the identity of a metric timeseries is defined as a combination of (metric datapoint dimensions + minimal globally unique identifier of the emitting entity).
So, when emitting metrics from the Otel SDK the emitting entity is the Service. Thus only the Service's uniquely identifying attributes should be part of the metric's timeseries identity. Yes, the Service is also associated with a Process. This is important for other purposes, but is irrelevant for timeseries identification purposes.
TODAY, even http latency will have the process command line + PID attached, given the way we do resources. Does that make sense from a metric standpoint?
No, I think it doesn't. See my previous comment to why I think so.
I think this data should be joinable with the actual timeseries, but it doesn't need to be a label on that timeseries itself (e.g. see kube-state-metrics in practice).
I assume by "joinable" you mean the following: if we know the Process's full identity (e.g. by knowing its process id, host name, etc) there must be a way to locate the timeseries of all metrics that are emitted by a Service Instance that is associated with that Process. Notice there is more than one level of indirection here - we must find the Service Instance from the Process, then find the timeseries from the Service Instance's attributes. If this is what you mean then I definitely agree.
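That two-step indirection could be sketched like this. The lookup tables and all values are invented purely to illustrate the Process → Service Instance → timeseries chain:

```python
# Step 1: map a Process's full identity to the Service Instance it backs.
PROCESS_TO_INSTANCE = {
    # (host.name, process.pid) -> (service.name, service.instance.id)
    ("host-1", 1234): ("checkout", "instance-a"),
}

# Step 2: map a Service Instance's identifying attributes to its timeseries.
TIMESERIES = {
    ("checkout", "instance-a"): ["http.server.duration", "http.server.active_requests"],
}

def series_for_process(host, pid):
    """Find all metric timeseries emitted by the Service Instance
    associated with the given Process (the double indirection above)."""
    instance = PROCESS_TO_INSTANCE.get((host, pid))
    return TIMESERIES.get(instance, []) if instance else []
```

Note that PID only appears in the first mapping; the timeseries themselves stay keyed by the Service Instance's low-cardinality identity.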
The key here is we should limit cardinality to what's necessary. In the example of process metrics, PID is necessary and beneficial. I'd argue having PID on HTTP latency metrics, not so much.
Agreed.
At minimum though, if I read your agreement:
- Resource needs a notion of identifying attributes
Yes.
- For any given resource there should be a minimal set of identifying attributes that are clearly delineated.
Yes.
- We should not allow identifying attributes to be changed. Descriptive attributes being added later is ok (I didn't see any complaints about this).
Yes.
- There's possibly a notion of "kind" for a Resource that implies its required/identifying attributes (and likely interacts with the merge method).
Possibly. I am not yet sure about this. I have some possible different use-case for the "kind" in my mind, but not yet sure about it. However, I would like to understand the use case that you describe. How do you intend the "kind" to interact with the merging logic?
There are likely a few more nuances that we haven't discussed yet.
The most important unresolved problem for me is: what constitutes the globally unique identifier of a Resource? I can define locally unique identifiers easily. For example the process id is a locally unique identifier for a Process in the context of the Host where the process runs. However, the process id is not sufficient as a global identifier. We need something more there; in particular, we need to additionally know the context within which the local identifier was recorded (i.e. the identity of the Host in my example).
I am not sure how exactly we solve this. The globally unique identifier of a Resource depends on the environment where it runs. There is no single definition of the global identifier for each individual Resource that we are interested in.
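One naive way to make the local/global distinction concrete is to compose a global identifier by chaining local identifiers from the outermost context inward. This is only an illustration of the problem shape, not a proposal; the encoding is invented:

```python
def global_id(chain):
    """Compose a global identifier from (kind, local_id) pairs ordered
    from outermost context inward, e.g. a process id is only unique
    within its host, so the host's identity must prefix it."""
    return "/".join(f"{kind}:{local}" for kind, local in chain)

global_id([("host", "web-1"), ("process", "1234")])
# -> "host:web-1/process:1234"
```

The unresolved part is exactly what the comment above says: which chain of containing entities is the right one depends on the environment, so there is no single universal encoding.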
I will try to allocate some time to this and see if I can put together an OTEP that addresses all these questions.
- Creating a separate location to store descriptive attributes. For example:

```protobuf
// Resource information.
message Resource { ... }
```

For backward compatibility reasons we cannot change the semantics of the current `attributes` field. So, we will probably need to do something slightly different, e.g.:
```protobuf
// Resource information.
message Resource {
  // Set of attributes that describe the resource.
  repeated opentelemetry.proto.common.v1.KeyValue attributes = 1;

  // dropped_attributes_count is the number of dropped attributes. If the value is 0, then
  // no attributes were dropped.
  uint32 dropped_attributes_count = 2;

  // Defines the core type of this resource, e.g. k8s_pod, etc.
  string resource_kind = 3;

  // Set of attributes that identify the resource. Note that the identifying attributes may be
  // also recorded in the "attributes" field.
  repeated opentelemetry.proto.common.v1.KeyValue id = 4;
}
```
This definition is backward compatible, so we won't break OTLP.
However, I am not happy with the definition of the `id` field above. I think it is not sufficient because of the local/global identification problem I described above. I need more time to think about this.
For backward compatibility reasons we cannot change the semantics of the current attributes field.
I actually wasn't changing the semantics. Today, by specification, all attributes in the Resource are descriptive and identifying. What I was proposing was a forward+backward compatible change (older users don't need to change their notion of identity or know to look for new fields).
Specifically, if you look at use cases:
Old Telemetry:
```
Resource { attributes=[host=127.0.0.1, pid=1234, service.name=my_service] }
```
By specification, we treat all of those as identifying when we merge together telemetry by-resource.
New Telemetry:
```
Resource {
  attributes=[host=127.0.0.1, pid=1234, service.name=my_service]
  descriptive_attributes=[my_added_thing=my_value]
}
```
Here, an old client would still join together streams by attributes and be correct about it.
Unfortunately today we allow (and do) record all sorts of junk information in the Resource. Any assumption that current Resource attributes are identifying is incorrect, and any recipient that makes this assumption is, generally speaking, broken.
By specification, we treat all of those as identifying when we merge together telemetry by-resource.
I think this is broken today. It probably works in some cases. The best we can do is to leave it alone so that it continues working the way it does.
My suggestion is that we should continue recording what we record today in the Resource's `attributes` field. We should not change this behavior so that the existing recipients continue working in the same (broken) way.
In the new telemetry we introduce an additional field, `id`, which is strictly defined to only include identifying attributes. So, using your example you will see this:
Old Telemetry:
```
Resource { attributes=[host=127.0.0.1, pid=1234, service.name=my_service, my_added_thing=my_value] }
```
New Telemetry:
```
Resource {
  attributes=[host=127.0.0.1, pid=1234, service.name=my_service, my_added_thing=my_value, my_custom_env_var=blah]
  id=[service.name=my_service]
}
```
Here the old recipients are oblivious to the existence of the `id` field and continue using the same old `attributes` in the same old broken way.
New recipients are now wise enough to know that the `id` field should be used for identification. New metric exporters which see such telemetry know that they should only include the `id` field's attributes in the timeseries.
So I think we both agree in general on the shape of the solution, just not the specifics or what will be more annoying to fix going forward. I've been focusing on the way we bundle-by `Resource` in OTLP; you've been focused on preserving all attributes that exist in Resource today in the same spot. Let's table that discussion for now and make sure this notion of identifying attributes has legs. Specifically, we agree on a split between identifying + descriptive attributes (at a minimum).
Specifically, we still have the major issues of:
- `Resource.merge` logic changes
- `service.name` is required across ALL otel resources due to jaeger/zipkin compatibility

Specifically, we agree on a split between identifying + descriptive attributes (at a minimum).
Agreed. (BTW, @bogdandrutu suggested to consider an alternate where we only have a single identifier string, instead of a set of attributes. We can consider that, but I think in the context of the current discussion the difference is not yet important).
How do we know which set of identifying attributes to preserve during merge?
Yes, and in addition to this: which identifying attributes become descriptive when merging? For example, if I merge a Process with a Host then both `process.id` and `host.id` become identifying attributes of the Process (on the particular Host). However, when I merge a Service Instance with a Process, `service.name` and `service.instance.id` remain the Service Instance's identifying attributes while `process.id` becomes a descriptive attribute of the Service Instance (because we do not want the identity of the Service Instance to change when the Process is restarted).
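Those asymmetric merge rules could be sketched as a rule table keyed by resource kind. The kind strings and the `MERGE_RULES` table are hypothetical, purely to show the shape of the logic being discussed:

```python
# Hypothetical rule table: for a (child kind, parent kind) pair, which of
# the parent's identifying attributes STAY identifying after the merge.
MERGE_RULES = {
    ("process", "host"): {"host.id"},        # host.id joins process.id as identifying
    ("service.instance", "process"): set(),   # process.id becomes descriptive
}

def merge(child_kind, child_id, parent_kind, parent_id):
    """Merge a child resource with its containing resource, splitting the
    parent's attributes into identifying vs descriptive per the rules."""
    keep = MERGE_RULES.get((child_kind, parent_kind), set())
    identifying = dict(child_id)
    descriptive = {}
    for k, v in parent_id.items():
        (identifying if k in keep else descriptive)[k] = v
    return identifying, descriptive

# A Service Instance merged with its Process: process.id goes descriptive,
# so a Process restart does not change the instance's identity.
ident, desc = merge(
    "service.instance",
    {"service.name": "checkout", "service.instance.id": "a1"},
    "process",
    {"process.id": "1234"},
)
```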
Resource schemaUrl - Does this just apply to identifying attributes?
No, I think it applies to all attributes, both identifying and non-identifying.
Exporter rules/conventions around descriptive + identifying attributes.
Yes. In addition - conventions on which identifying attributes can be eliminated/aggregated to reduce cardinality when exporting to low cardinality formats/destinations.
Today service.name is required across ALL otel resources due to jaeger/zipkin compatibility.
If we accept this rule we will need to have guidelines about how to compute `service.name` for things that are really not services. For example a Host may have system metrics. I clearly want to be able to collect such metrics as telemetry and the Host is the Resource. What's `service.name` in this case?
Also, AFAIK, the Collector does not necessarily follow this rule rigorously. We will need to double-check.
Resource schemaUrl - Does this just apply to identifying attributes?
No, I think it applies to all attributes, both identifying and non-identifying.
This implies, then, that ALL resources (that can merge) need to be part of the same schema, which basically means ALL resources must be fully specified by OTEL semantic conventions. Is this what you intend? E.g. where is the room for user-generated descriptive attributes?
Today service.name is required across ALL otel resources due to jaeger/zipkin compatibility.
If we accept this rule we will need to have guidelines about how to compute service.name for things that are really not services. For example a Host may have system metrics. I clearly want to be able to collect such metrics as telemetry and the Host is the Resource. What's service.name in this case?
Also, AFAIK, the Collector does not necessarily follow this rule rigorously. We will need to double-check.
Yeah, I know the collector does not follow this and it's been the source of some friction.
So I think our major disagreements are on the specifics discussed above.
Otherwise, we're both looking to solve the same problem. I think this means we're ready to start exploring the solution space. I'll try to put together a proposal this week and link it here for review.
This implies, then, that ALL resources (that can merge) need to be part of the same schema, which basically means ALL resources must be fully specified by OTEL semantic conventions. Is this what you intend? E.g. where is the room for user-generated descriptive attributes?
No, that is not my intent. The presence of schema_url in the Resource does not mean that all Resource attributes belong to that schema. Only the ones that have a matching name defined by that schema.
User-generated attributes are of course possible. We already have guidelines about application-specific and company-specific attribute naming here. Because these are named differently they naturally are not covered by Otel schema and essentially are exempt from schema-related functionality.
Note that if in the future we introduce Derived Schemas it will be possible to have a schema_url which refers to an application-specific or company-specific Otel-derived Schema, such that even application-specific and company-specific attributes are schema-compliant, potentially allowing all Resource attributes to be subject to schema rules.
However, I do not think it is a mandatory requirement to support Derived Schemas or to ensure that all attributes are covered by Otel Schema.
Do you see any problems with this approach? To me this does not look like a major disagreement area. I think we agree more than disagree :-)
I think my view on schema-url is slightly different than yours, so we probably need to have that conversation elsewhere. Specifically, the notion of Derived Schema or company-supported schema. In practice we don't allow Resource.merge to work for different schema-urls unless one is blank. We need to directly address that verbiage. Perhaps we both agree we need to address it, and you plan to do so through expansion on Schemas themselves, whereas I'd like to better denote how Schema + Resource interact (especially since I don't think schema_url exists on resource but beside resource in OTLP).
@jsuereth @tigrannajaryan are you planning to take this issue forward anytime soon? When you discuss it in a SIG meeting I would like to join the conversation and explore the possibility of introducing one more collection of attributes in `Resource`, say `varying_attributes`, which are a kind of descriptive attributes but are also modifiable. This is to address the requirement in https://github.com/open-telemetry/oteps/pull/208
For reference: OTEP https://github.com/open-telemetry/community/pull/1976 aims to address this.
I've asked a related question on Stack Overflow: "Do the OpenTelemetry spec/docs recommend/require OTLP consumer behavior for attribute names that are not defined in the semantic conventions?".
That question is related to this issue because it refers to the different ways in which OTLP consumers ("telemetry backends") handle "unrecognized" resource attributes and metric data point attributes. Although I don't specifically mention it in that question, a workaround to getting some telemetry backends to ingest unrecognized attributes is to characterize them as metric data point attributes. (Sorry, I don't want to name specific backends. Neither, at this time, do I want to elaborate on the specific nature of these attributes, other than to say they're not in the current registry or semantic conventions.)
With apologies if referring to a question on an external site is poor etiquette (I'd like to get informed eyeballs on that question, but I don't want to annoy anyone), I invite anyone interested in this issue to read, and answer, that question.
The Metric Cardinality Problem
Today, every attribute in an OpenTelemetry Resource, according to the metric datamodel, is used to determine the identity of a metric. Given known issues in metric time-series database implementations around cardinality, this can cause major issues if Resources are allowed to leverage high-cardinality attributes.
Given that many Resource attribute semantic conventions today were defined for tracing instrumentation, we find many high-cardinality definitions; e.g. the Process resource includes `pid` and `parent_pid`, which are known to churn between instances of an application and would lead to higher-cardinality streams.
Many metric backends are simply erasing resource attributes from metrics to work around the issue. Here's an example solution for prometheus, and another proposal for yet another point-fix for prometheus.
However, these workarounds prevent Metrics users from regaining the descriptive attributes (and benefits) of current OTEL Resource detection.
The Resource identity problem
It turns out that the identity problem isn't isolated to metrics alone. It also impacts our ability to work with Resources within OpenTelemetry. For example:
- #1620, #1357, and #1521 highlight issues around merging environment-provided descriptive attributes w/ auto-detected resource attributes and the currently defined merge method. Effectively all three of these boil down to: "How can users manipulate resource attributes for their own labels and use cases?"
- #1298 highlights concerns around attaching new information around identity post-startup. Some resource attributes cannot be determined prior to the actual startup of an application.
- #2500 looks at defining a resource as a single user-session. While this one is likely more indicative of a lack of baggage-related built-in features, it also teases into issues of descriptive vs. identifying features of Resource and the current need for Resource to be fast-to-compute, consistent, and done at startup.

Rather than isolating point-fixes as they arise (e.g. the prometheus exporter case), it's become clear we need a better holistic solution.