open-telemetry / opentelemetry-collector-contrib

Contrib repository for the OpenTelemetry Collector
https://opentelemetry.io
Apache License 2.0

Rate Limit Processor #35204

Open juergen-kaiser-by opened 1 week ago

juergen-kaiser-by commented 1 week ago

Component(s)

No response

Is your feature request related to a problem? Please describe.

We run an observability backend (Elasticsearch) shared by many teams and services (thousands). The services run in kubernetes clusters and we want to collect the logs of all pods.

Problem: If a service/pod becomes very noisy for some reason, it can burden the backend so much that all other teams feel it. In short: one team can ruin the day for everyone else.

We would like to limit the effect a single instance or service can have on the observability backend.

Describe the solution you'd like

Considering the points above, we think that there should be a processor for this.

We have no requirements regarding the algorithm backing the rate limiting. It seems that a token bucket filter (example blog entry) is a reasonable choice here.

Describe alternatives you've considered

Rate Limiter in Receivers

Rate limiting in receivers is fine if you only need attributes that are available at the receiver. In our case, those are insufficient because we also need pod labels. As a workaround, we could inject the labels into the collector config as environment variables and scope each collector to a single pod by deploying it as a sidecar. However, a sidecar deployment consumes too many resources across all pods because we run large clusters (>= 10K pods).

A benefit of rate limiting in receivers is that it could let collector users choose between dropping incoming telemetry and simply not receiving it, effectively creating backpressure.

Rate limiting in Receivers is discussed in #6908.

Rate Limiter as Extension

We do not know enough about how extensions work internally to say much about this option. Rate limiting as an extension is also discussed in #6908.

Additional context

atoulme commented 5 days ago

Can you explain why sampling doesn't work for you here? What would the configuration of the processor look like?

juergen-kaiser-by commented 5 days ago

The sampling that exists today does not work for us because it is always active. We would like to limit only pods/services that are too noisy. Note that a service can suddenly become noisy, so we do not know the noisy ones beforehand.

Once noisiness is detected, there are different options for how to limit the flow. Whether we need sampling or some other rate limiting algorithm boils down to how useful the output is afterwards. For logs, we think that larger chunks of consecutive log lines are more useful than smaller ones (think of a sampled stack trace vs. a full one), so the limiting algorithm should emit large chunks before it starts dropping log lines. Simple sampling produces small chunks.

That being said, I do not see why it should not be possible to support different algorithms in the long run. For the other telemetry types, other algorithms may be a better fit.

I do not have a good example configuration, yet. Let me think about that.

juergen-kaiser-by commented 5 days ago

As a first draft:

#[...]

processors:
  ratelimit:

    # defines the rate limiting for log signals. The structure is borrowed from the transformprocessor. We could have separate sections for logs, metrics, and traces if we want this processor to be generic across all types.
    log:

      # grouping defines whether and how to group the telemetry by a set of attributes.
      grouping:

        #   mode defines how to deal with logs/metrics/traces that do not fall into one of the defined groups (i.e., are missing the fields)
        #   - strict (default): missing fields are treated as if they were present (with a default value) => always rate limit. Fall back to the last group if one exists.
        #   - relaxed: rate limiting applies only if all fields are present
        mode: strict

        groups:
        - tbf: # rate limiting algorithm to use. Others are possible, but exactly one must be set. (tbf = token bucket filter)
            average_rate_per_sec: 1000
            max_burst_per_sec: 5000

          # defines the list of attributes to use for grouping.
          attributes:
          - context: resource
            attribute: k8s.pod.uid
          - context: resource
            attribute: k8s.pod.tag.my-project-name

#[...]

The structure allows users to:

We could also add conditions to the group attributes to enable users to define more specialized groups.