open-telemetry / opentelemetry-collector-contrib

Contrib repository for the OpenTelemetry Collector
https://opentelemetry.io
Apache License 2.0
2.98k stars 2.31k forks source link

Add receiver_creator "properties" to enable receiver config definitions in observer.Endpoint metadata #17418

Open rmfitzpatrick opened 1 year ago

rmfitzpatrick commented 1 year ago

Component(s)

receiver/receivercreator

Is your feature request related to a problem? Please describe.

Currently the receiver_creator requires all possible receiver instances to originate from templates defined in collector service configs, with dynamic config options being limited to endpoint-derived values obtained from observer.EndpointEnv content in expr rule evaluation. For example a rule of type == "container" and labels['some.label'] == 'some.value' allows label content evaluation but only from a hardcoded receiver template config entry. This is a strong limitation to end users in that the collector service configuration needs to be updated for all new receiver types (though multiple config files/sections can improve this in practice).

Describe the solution you'd like

To better support multiuser environments like Kubernetes and reduce general configuration maintenance requirements for the Receiver Creator, I'd like to propose a new notion of "properties" in observer endpoint metadata stores (container/pod labels and annotations) that the receiver will incorporate in the evaluation of receiver templates (creating ones not expressly defined in the receiver's config):

docker run \
-l io.opentelemetry.receiver.apache/my-receiver.rule='type == "container"' \
-l io.opentelemetry.receiver.apache/my-receiver.config::metrics::apache.scoreboard::enabled=false \
-l io.opentelemetry.receiver.apache/my-receiver.resource_attributes.some-resource-attribute=attribute.value \
httpd

This way, an apache receiver named my-receiver will be instantiated for just the resulting docker observer's container endpoint using a collector config containing only:

extensions:
  docker_observer:

receivers:
  receiver_creator:
    watch_observers: [docker_observer]

If the creator config already contained an apache/my-receiver, its content would be updated just when evaluating against the above container's env, remaining unchanged for all other env evaluation.

This feature would alleviate Collector config maintenance requirements and allow service owners to "automatically" enable metric collection for their target systems using platform metadata mechanisms alone.

My initial stab at a property format that allows config, rule, and resource attribute declarations/updates takes the form

<1><2>.<3>(.<4>): <5>
  1. Reverse DNS or namespace prefix:
    • io.opentelemetry.receiver.
    • receiver.opentelemetry.io/
  2. Receiver type (and name)
  3. Property type corresponding to entry template or config field (config, rule, or resource_attributes)
  4. Key:
    • confmap usable key/field for config
    • Resource attribute key for resource_attributes
  5. String value (accepting yaml scalar, list, or object for config)

Since the receiver template creation/modifications for a given endpoint will only use literal values (that could contain backticked expr rules), this feature wouldn't satisfy requests like https://github.com/open-telemetry/opentelemetry-collector-contrib/issues/13744 that require more capable config/value loading, but ideally observer enhancements would be able to over time making properties more expressive. Config provider directives scheme:uri would also not be supported since the service config provider isn't made available to individual components.

Describe alternatives you've considered

Using multiple configs in a single receiver creator block, but this still requires general collector service maintenance and access. Any incidental issues could halt the service overall as well instead of being more isolated (logged and ignored during endpoint evaluation alone).

Additional context

prior art: This feature has an analog in the SignalFx Smart Agent, which I believe was a model for the Receiver Creator and observers: https://github.com/signalfx/signalfx-agent/blob/4eec53775bd3cf05fd6def318e81ff3c91963f2c/docs/observers/docker.md#configuration-from-labels

github-actions[bot] commented 1 year ago

Pinging code owners for receiver/receivercreator: @jrcamp. See Adding Labels via Comments if you do not have permissions to add labels yourself.

github-actions[bot] commented 1 year ago

This issue has been inactive for 60 days. It will be closed in 60 days if there is no activity. To ping code owners by adding a component label, see Adding Labels via Comments, or if you are unsure of which component this issue relates to, please ping @open-telemetry/collector-contrib-triagers. If this issue is still relevant, please ping the code owners or leave a comment explaining why it is still relevant. Otherwise, please close it.

Pinging code owners:

See Adding Labels via Comments if you do not have permissions to add labels yourself.

github-actions[bot] commented 1 year ago

This issue has been inactive for 60 days. It will be closed in 60 days if there is no activity. To ping code owners by adding a component label, see Adding Labels via Comments, or if you are unsure of which component this issue relates to, please ping @open-telemetry/collector-contrib-triagers. If this issue is still relevant, please ping the code owners or leave a comment explaining why it is still relevant. Otherwise, please close it.

Pinging code owners:

See Adding Labels via Comments if you do not have permissions to add labels yourself.

rmfitzpatrick commented 1 year ago

I've opened #25866 to provide this functionality as "endpoint properties." The overall aim of the effort is to provide complete receiver creator receiver template configurability to end user deployment practices without requiring any collector service config modification other than opting in with accept_endpoint_properties: true. These properties are to be evaluated only for endpoints that contain them to limit the range of action at a distance (where one endpoint would influence the scraping of another).

While developing it I found it necessary to change the property schema to better support kubernetes qualified names, which are fairly restrictive.

Annotations are key/value pairs. Valid annotation keys have two segments: an optional prefix and name, separated by a slash (/). The name segment is required and must be 63 characters or less, beginning and ending with an alphanumeric character ([a-z0-9A-Z]) with dashes (-), underscores (_), dots (.), and alphanumerics between. The prefix is optional. If specified, the prefix must be a DNS subdomain: a series of DNS labels separated by dots (.), not longer than 253 characters in total, followed by a slash (/).

If the prefix is omitted, the annotation Key is presumed to be private to the user. Automated system components (e.g. kube-scheduler, kube-controller-manager, kube-apiserver, kubectl, or other third-party automation) which add annotations to end-user objects must specify a prefix.

Docker labels also have suggested conventions, though are unenforced: https://docs.docker.com/config/labels-custom-metadata/#key-format-recommendations

A label key is the left-hand side of the key-value pair. Keys are alphanumeric strings which may contain periods (.) and hyphens (-). Most Docker users use images created by other organizations, and the following guidelines help to prevent inadvertent duplication of labels across objects, especially if you plan to use labels as a mechanism for automation.

Authors of third-party tools should prefix each label key with the reverse DNS notation of a domain they own, such as com.example.some-label.

The collector doesn't currently have a containerd observer but their label validator appears to just require the total key+value length to be <4096.

These restrictions and suggestions in mind I propose that we have two supported prefixes to fit with the metadata carriers idioms.

receiver-creator.collector.opentelemetry.io/ for kubernetes annotations and io.opentelemetry.collector.receiver-creator. for container labels.

Endpoint property schema is <Prefix><Property Key>: <PropertyValue>.

The <Property Key> contains at least a receiver id type, and optionally a name distinguished by / for container labels and __ for pod annotations (since / is forbidden in k8s annotation keys). Being able to reference existing service-config receiver template entries and creating unique ones per-endpoint were the aims here:

receiver.type, receiver.type/receiver.name, receiver.type__receiver.name

It can also contain an optional "mapping key" to signify the receiver template field to set/update: config, resource_attributes, or rule. The mapping key-less form (just a receiver template component id) takes a yaml mapping value that can contain any combination of mapping keys.

The ::-separated mapstructure form is also supported in values, and where allowed by the metadata carrier (docker/containerd), the key.

Current examples from the PR:

apiVersion: v1
kind: Pod
metadata:
  name: redis
  annotations:
    receiver-creator.collector.opentelemetry.io/redis: |
      rule: type == "port" and port == 6379
      config:
        collection_interval: 5s
      resource_attributes:
        some_attribute: attribute_value
spec:
  containers:
  - name: redis
    image: redis
    ports:
    - containerPort: 6379
docker run \
  -l io.opentelemetry.collector.receiver-creator.prometheus_simple/bitnami.config.collection_interval=30s \
  -l io.opentelemetry.collector.receiver-creator.prometheus_simple/bitnami.config.labels::a.label=label_value \
  -l io.opentelemetry.collector.receiver-creator.prometheus_simple/bitnami.resource_attributes.some_attribute=attribute_value \
  -l io.opentelemetry.collector.receiver-creator.prometheus_simple/bitnami.rule='type == "container" and port == 9090' \
  bitnami/prometheus

Any feedback/suggestions appreciated.

hughesjj commented 1 year ago

So far no objections to syntax or proposed restrictions on the property schema + prefixing of them

github-actions[bot] commented 11 months ago

This issue has been inactive for 60 days. It will be closed in 60 days if there is no activity. To ping code owners by adding a component label, see Adding Labels via Comments, or if you are unsure of which component this issue relates to, please ping @open-telemetry/collector-contrib-triagers. If this issue is still relevant, please ping the code owners or leave a comment explaining why it is still relevant. Otherwise, please close it.

Pinging code owners:

See Adding Labels via Comments if you do not have permissions to add labels yourself.

github-actions[bot] commented 8 months ago

This issue has been inactive for 60 days. It will be closed in 60 days if there is no activity. To ping code owners by adding a component label, see Adding Labels via Comments, or if you are unsure of which component this issue relates to, please ping @open-telemetry/collector-contrib-triagers. If this issue is still relevant, please ping the code owners or leave a comment explaining why it is still relevant. Otherwise, please close it.

Pinging code owners:

See Adding Labels via Comments if you do not have permissions to add labels yourself.

github-actions[bot] commented 6 months ago

This issue has been inactive for 60 days. It will be closed in 60 days if there is no activity. To ping code owners by adding a component label, see Adding Labels via Comments, or if you are unsure of which component this issue relates to, please ping @open-telemetry/collector-contrib-triagers. If this issue is still relevant, please ping the code owners or leave a comment explaining why it is still relevant. Otherwise, please close it.

Pinging code owners:

See Adding Labels via Comments if you do not have permissions to add labels yourself.

github-actions[bot] commented 4 months ago

This issue has been inactive for 60 days. It will be closed in 60 days if there is no activity. To ping code owners by adding a component label, see Adding Labels via Comments, or if you are unsure of which component this issue relates to, please ping @open-telemetry/collector-contrib-triagers. If this issue is still relevant, please ping the code owners or leave a comment explaining why it is still relevant. Otherwise, please close it.

Pinging code owners:

See Adding Labels via Comments if you do not have permissions to add labels yourself.

ChrsMark commented 2 weeks ago

Hey @rmfitzpatrick , I see the PR that tried to solve this was closed due to inactivity.

I had came with a similar proposal at https://github.com/open-telemetry/opentelemetry-collector-contrib/issues/34427 and only now I realize we already had this issue for this feature. Apologies for the duplication :).

I believe this feature would be quite useful for the Collector as I also explained at https://github.com/open-telemetry/opentelemetry-collector-contrib/issues/34427. It's a feature that is widely used byFilebeat's/Metricbeat's and Fluentbit's (even other agents) users. This was also discussed at https://cloud-native.slack.com/archives/C01N5UCHTEH/p1718893771362289.

Comparing the 2 proposals I understand that you suggest users will provide the full configuration in a single annotation while what I had in mind is to provide support for specifying specific settings as annotations for ease of use. For example:

apiVersion: v1
kind: Pod
metadata:
  name: annotations-hints-demo
  annotations:
    otel.hints/receiver: redis/1
    otel.hints/endpoint: '`endpoint`:6379'
spec:
  containers:
  ...

Of course we can provide additional support for specifying the full configuration with a raw annotation. I have seen something similar in Filebeat's raw annotation option.

So based on this I assume the 2 proposals could get merged into one. Any plans on resuming this work? I would be happy to review or even allocate the time to take this over. FYI @dmitryax since you were involved in the PR's review.