Closed by tigrannajaryan 1 year ago
Apologies if this has been covered elsewhere, but can you help me understand the difference between a resource vs an entity, and whether or not these concepts are meant to be mutually exclusive? Is an entity a component that can be described by attributes and state, but doesn't qualify as a resource because it doesn't emit telemetry? Conversely, if an entity can emit telemetry, then why not model it as a resource?
These attributes appear to describe a thing that could be a resource:
otel.entity.id
otel.entity.type
otel.entity.attributes
(flattened into the resource attributes)
We typically put multiple entities in a Resource. Here are some problems we have with the current Resource, and the reasons we want an Entity that is defined differently from the Resource.
The Resource is defined as a representation of the entity.
A Resource is an immutable representation of the entity producing telemetry as Attributes.
Note it speaks about one particular entity. In practice we commingle multiple entities into one Resource. The spec shows a clear example that talks about multiple entities (Process, Container, Pod, etc) in one Resource:
For example, a process producing telemetry that is running in a container on Kubernetes has a Pod name, it is in a namespace and possibly is part of a Deployment which also has a name. All three of these attributes can be included in the Resource.
The problem with such usage is that by looking at the Resource attributes it is impossible to tell which of the represented entities is the entity, i.e. the entity which produced the telemetry.
The Resource is one set of attributes, which contains all attributes of all entities that the Resource represents. It is impossible to tell which of these attributes identify the entity (or entities) and which are non-identifying, i.e. purely descriptive.
This lack of precise identity makes it difficult or impossible to identify the same entities reported in different Resources.
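To make the identity problem concrete, here is a minimal sketch in Python. The `Entity` class and `make_entity` helper are hypothetical names used only for illustration, not part of any OpenTelemetry API, and the attribute values are made up. It assumes that an entity carries an explicit type and a separate set of identifying key/value pairs.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Entity:
    """Hypothetical entity: a type, identifying pairs, and descriptive pairs."""
    type: str
    id: tuple               # sorted (key, value) pairs that identify the entity
    attributes: tuple = ()  # sorted non-identifying (key, value) pairs

    def identity(self):
        # Only the type and the identifying pairs take part in identity.
        return (self.type, self.id)


def make_entity(type_, id_, attributes=None):
    """Build an Entity from plain dicts (illustrative helper)."""
    return Entity(
        type_,
        tuple(sorted(id_.items())),
        tuple(sorted((attributes or {}).items())),
    )


# Two flat Resources that mention the same Pod, observed at different times.
resource_a = {"k8s.pod.uid": "abc-123", "k8s.pod.label.version": "v1", "process.pid": 4012}
resource_b = {"k8s.pod.uid": "abc-123", "k8s.pod.label.version": "v2", "process.pid": 977}

# A flat attribute set gives no way to know which keys identify the Pod,
# so a naive comparison treats these as unrelated.
flat_match = resource_a == resource_b

# With an explicit Id, the Pod is recognizable even though its label changed.
pod_a = make_entity("k8s.pod", {"k8s.pod.uid": "abc-123"}, {"k8s.pod.label.version": "v1"})
pod_b = make_entity("k8s.pod", {"k8s.pod.uid": "abc-123"}, {"k8s.pod.label.version": "v2"})
same_pod = pod_a.identity() == pod_b.identity()
```

Here `flat_match` is `False` while `same_pod` is `True`: the two Pod entities differ in their descriptive attributes but share one identity.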
Resource is defined to be immutable in the OpenTelemetry SDK. This does not align well with the fact that non-identifying attributes of entities may change over time. For example, the OpenTelemetry Collector collects data about Pods and adds Pod labels as Resource attributes. Pod labels are mutable in Kubernetes and can change over time, while the Pod's identity remains immutable. Here is another example where mutable Service attributes are desirable.
With the current definition of the Resource we are forced to either leave out any attributes that may ever change over time or violate the spec definition.
Additionally, OpenTelemetry currently lacks the ability to provide resource attributes that require some kind of delayed lookup that may fail (see this issue). This requires, for example, passing environment variables for the k8s container name and various downward-API values so that an OpenTelemetry SDK can report this resource appropriately.
In reality, OpenTelemetry SDKs can also easily violate the definition as soon as we consider mutability from the recipient's perspective. SDKs only guarantee immutability during a single process session. As soon as the process is restarted and the SDK is newly initialized, there is no guarantee that the Resource will have the same set of attributes (e.g. because process.id can be one of the Resource attributes).
It is clear that the strictly "immutable" definition of the Resource is not sufficient for what we are trying to model.
[Copied from Josh Suereth's description] Every attribute in an OpenTelemetry Resource, according to the metric data model, is used to determine the identity of a metric. Given known cardinality issues in metric time-series database implementations, this can cause major problems if Resources are allowed to carry high-cardinality attributes.
Given that many Resource attribute semantic conventions were defined for tracing instrumentation, we find many high-cardinality definitions. For example, the Process resource includes pid and parent_pid, which are known to churn between instances of an application and would lead to higher-cardinality streams.
Many metric backends simply erase resource attributes from metrics to work around the issue. Here's an example solution for Prometheus, and another proposal for yet another point-fix for Prometheus.
However, these workarounds prevent Metrics users from regaining descriptive attributes (and benefits) of current OTEL Resource detection.
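The cardinality effect can be sketched in a few lines of Python. This is an illustration only: `series_identity`, the metric name `requests`, and the attribute values are hypothetical, and real backends compute stream identity in their own ways. The sketch assumes that every resource attribute contributes to the identity of a metric stream, as described above.

```python
def series_identity(metric_name, resource_attrs, drop=()):
    """Return a hashable stream identity built from the metric name and
    every resource attribute that is not listed in `drop`."""
    kept = {k: v for k, v in resource_attrs.items() if k not in drop}
    return (metric_name, tuple(sorted(kept.items())))


# The same service observed across three process restarts (made-up pids).
restarts = [
    {"service.name": "checkout", "process.pid": pid, "process.parent_pid": 1}
    for pid in (4012, 4377, 5120)
]

# Every restart yields a new stream when the pid is part of the identity.
with_pids = {series_identity("requests", r) for r in restarts}

# Erasing the churning attributes collapses the streams into one,
# at the cost of losing those attributes on the metric.
without_pids = {
    series_identity("requests", r, drop=("process.pid", "process.parent_pid"))
    for r in restarts
}
```

In this sketch `with_pids` holds three distinct streams and `without_pids` holds one, which mirrors both the problem and the trade-off of the erase-everything workaround.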
In case you've come across this looking for information about EntityState and EntityDelete events in OpenTelemetry, these are now described in the Entities Data Model, Part 1.
From design document: https://docs.google.com/document/d/1Tg18sIck3Nakxtd3TFFcIjrmRO_0GLMdHXylVqBQmJA/edit#heading=h.v4ilwdkncxe
Entity Events
EntityState
Indicates the entity's current state. Note that the state is cumulative, i.e. this event describes the full state of the entity as it is at a certain moment of time.
Field name: Timestamp
Type: Timestamp, uint64 nanoseconds since Unix epoch
Description: The time from which this event describes the entity's state. When the entity's state changes, the source is expected to emit a new EntityState event with a fresh timestamp and the full list of attribute and relationship values. The time is measured by the origin clock. This field is optional; it may be missing if the time when the change happened is unknown.
Field name: Id
Type: key/value pair list
Description: Entity identifier. MUST NOT change during the lifetime of the entity. Can contain one or more key/value pairs. This field is required. If the list is empty, the event is malformed and should be ignored.
All key/value pairs in the Id are also considered to be attributes of the entity. The key/value pairs respect OpenTelemetry semantic conventions for resources.
Note: in this phase 1 design the entity has only one Id (composed of one or more key/value pairs). We have also discussed in the past the ability for entities to have multiple Ids (each Id itself being a list of key/value pairs). The capability for entities to have multiple Ids is out of scope for phase 1 design.
Field name: Type
Type: string
Description: The type of the entity. MUST NOT change during the lifetime of the entity. This field is required. If the field is missing or empty, the event is malformed and should be ignored. Typically set equal to the prefix used by attributes of the semantic conventions for the particular concept in OpenTelemetry (e.g. "service" for Service, "k8s.pod" for Kubernetes Pod, etc.).
Field name: Attributes
Type: key/value pair list
Description: Entity attributes. MAY change over the lifetime of the entity. The specified attribute values are effective starting from the time specified in the timestamp field. This field is optional. If it is missing or the list is empty then the entity has no attributes other than the ones contained in the id. The key/value pairs respect OpenTelemetry semantic conventions for resources.
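As a reading aid, the EntityState fields above could be modeled roughly as follows in Python. This is a sketch under assumptions, not the actual data model or wire format: the class layout, the method names `is_well_formed` and `all_attributes`, and the example values are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class EntityState:
    """Hypothetical in-memory form of an EntityState event."""
    id: dict                                        # required, non-empty
    type: str                                       # required, non-empty
    attributes: dict = field(default_factory=dict)  # optional, may change
    timestamp: Optional[int] = None                 # optional, ns since Unix epoch

    def is_well_formed(self) -> bool:
        # An empty Id or a missing/empty Type makes the event malformed.
        return bool(self.id) and bool(self.type)

    def all_attributes(self) -> dict:
        # Key/value pairs in the Id are also attributes of the entity.
        return {**self.attributes, **self.id}


pod_state = EntityState(
    id={"k8s.pod.uid": "abc-123"},
    type="k8s.pod",
    attributes={"k8s.pod.label.app": "checkout"},
)

# An event with an empty Id should be ignored by the receiver.
malformed = EntityState(id={}, type="k8s.pod")
```

A receiver would call `is_well_formed()` before using an event and drop it otherwise, matching the "malformed and should be ignored" wording in the field descriptions.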
EntityDelete
Indicates that an entity is deleted.
Field name: Timestamp
Type: Timestamp, uint64 nanoseconds since Unix epoch
Description: Time when the entity was deleted, as measured by the origin clock. This field is optional; it may be missing if the timestamp is unknown.
Field name: Id
Type: key/value pair list
Description: Entity identifier. Can contain one or more key/value pairs. This field is required. If the list is empty, the event is malformed and should be ignored.
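The two event types together suggest how a receiver might maintain a view of known entities. The following Python sketch is one possible interpretation, not a defined algorithm: the dict-based event shape, the `kind` discriminator, and the `apply_event` function are all hypothetical, and it assumes events arrive in order (timestamps are ignored here).

```python
def entity_key(event):
    """Hashable key derived from the Id key/value pairs."""
    return tuple(sorted(event["id"].items()))


def apply_event(store, event):
    """Apply one entity event to `store`; return False if it is ignored."""
    if not event.get("id"):
        return False  # empty Id: malformed for both event types
    key = entity_key(event)
    kind = event.get("kind")
    if kind == "EntityDelete":
        store.pop(key, None)
        return True
    if kind == "EntityState":
        if not event.get("type"):
            return False  # missing or empty Type: malformed
        # State is cumulative, so the new event replaces the old state
        # entirely rather than being merged into it.
        store[key] = {
            "type": event["type"],
            "attributes": dict(event.get("attributes", {})),
        }
        return True
    return False


store = {}
pod_id = {"k8s.pod.uid": "abc-123"}
apply_event(store, {"kind": "EntityState", "id": pod_id, "type": "k8s.pod",
                    "attributes": {"k8s.pod.label.app": "checkout",
                                   "k8s.pod.label.version": "v1"}})
# The second state omits the version label, so the label is gone afterwards.
apply_event(store, {"kind": "EntityState", "id": pod_id, "type": "k8s.pod",
                    "attributes": {"k8s.pod.label.app": "checkout"}})
state_after_update = dict(store[entity_key({"id": pod_id})]["attributes"])
apply_event(store, {"kind": "EntityDelete", "id": pod_id})
```

Replacing rather than merging is the design choice that follows from "the state is cumulative": an attribute absent from the latest EntityState is treated as no longer present.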
Mapping to Log Records
Entity events don't yet have a first-class representation in OpenTelemetry. However, they can be temporarily/experimentally mapped to Log records so that we can work with entity data, run experiments, and research/iterate on the concept of entities. This allows entity events to pass through the log pipeline and makes them available to processors and exporters. This section defines how Entity events can be represented as Log records.
To improve the processing efficiency of received batches, the following Scope attribute must be set for all log records representing an entity event: otel.entity.entity_event=true
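The efficiency argument is that a receiver can check one Scope attribute per batch instead of inspecting every log record. A rough Python sketch of that check follows. The batch layout (`scope_attributes`, `log_records`) is a hypothetical stand-in for records grouped by scope, and it assumes the attribute value is the boolean `true`.

```python
ENTITY_EVENT_SCOPE_ATTR = "otel.entity.entity_event"


def entity_event_records(scope_batches):
    """Yield log records only from batches whose Scope marks entity events."""
    for batch in scope_batches:
        scope_attrs = batch.get("scope_attributes", {})
        # One check per scope batch decides for all records inside it.
        if scope_attrs.get(ENTITY_EVENT_SCOPE_ATTR) is True:
            yield from batch.get("log_records", [])


batches = [
    {"scope_attributes": {ENTITY_EVENT_SCOPE_ATTR: True},
     "log_records": [{"body": "entity event 1"}, {"body": "entity event 2"}]},
    {"scope_attributes": {},
     "log_records": [{"body": "ordinary application log"}]},
]

entity_records = list(entity_event_records(batches))
```

Only the two records from the first batch end up in `entity_records`; the ordinary application log is skipped without being examined.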
EntityState Log record
EntityDelete Log record