thanos-io / thanos

Highly available Prometheus setup with long term storage capabilities. A CNCF Incubating project.
https://thanos.io
Apache License 2.0

Enable Receiver to extract Tenant from a label present in incoming timeseries #7081

Open verejoel opened 6 months ago

verejoel commented 6 months ago

Is your proposal related to a problem?

One way to support a multi-tenant ruler is to enable receivers to infer the tenant not from HTTP headers, but from labels present in the incoming time series. See the comment in issue #5133 for more information.

As well as helping move forward with the multi-tenant ruler, this would help enable multi-tenancy in clients that deliver telemetry. For example, to ship telemetry through the OpenTelemetry Collector's Prometheus remote write exporter today, we would need to configure one exporter per tenant (I am aware that the headers setter extension can set headers dynamically, but that only works if the whole request context belongs to the same tenant).

Describe the solution you'd like

Introduce a new CLI flag for the receiver, --incoming-tenant-label-name. Thanos Receive would then search each time series for occurrences of this label, building a map from each unique tenant name discovered in the label values to the slice of time series belonging to that tenant. Each entry of the map can then be distributed according to the hashring config.

The process is summarised in the flow chart attached to the issue.
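A minimal sketch of the grouping step, assuming the prompb types from the Prometheus remote-write protocol; the function name and the fallback-tenant handling are illustrative, not the actual Thanos implementation:

```go
package tenantsplit

import "github.com/prometheus/prometheus/prompb"

// splitByTenantLabel groups incoming time series by the value of tenantLabel.
// Series that do not carry the label fall back to defaultTenant. Each map
// entry could then be routed to the right ingester via the hashring config.
func splitByTenantLabel(ts []prompb.TimeSeries, tenantLabel, defaultTenant string) map[string][]prompb.TimeSeries {
	out := make(map[string][]prompb.TimeSeries)
	for _, s := range ts {
		tenant := defaultTenant
		for _, l := range s.Labels {
			if l.Name == tenantLabel {
				tenant = l.Value
				break
			}
		}
		out[tenant] = append(out[tenant], s)
	}
	return out
}
```

Whether the tenant label is kept or stripped before forwarding would be a separate design decision.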

Notes:

Describe alternatives you've considered

Dynamically manipulating the THANOS-TENANT header upstream (using an OTel collector, for example).

Additional context

We'd need to modify the behaviour of the receiveHTTP handler, in particular where we extract the tenant from the HTTP request.

fpetkovski commented 6 months ago

This would be a really cool feature indeed! We've tried to build a proxy that extracts the tenant from a label and sends one request per tenant with the appropriate header, but it overwhelmed receivers and was not worth the hassle. Having the feature natively built into Thanos is the way to go.

verejoel commented 6 months ago

Thanks Filip - do you think the proposal for how the feature would work makes sense? I'm a little worried about adding too much overhead to the routing receivers and causing requests to get backed up.


MichaHoffmann commented 5 months ago

> Thanks Filip - do you think the proposal for how the feature would work makes sense? I'm a little worried about adding too much overhead to the routing receivers and causing requests to get backed up.

Wouldn't this be evaluated on the ingesting receiver, when deciding which tenant the sample should be written to?

I would imagine that instead of one local write, we would inspect the request, group it by tenant, and issue multiple local writes here: https://github.com/thanos-io/thanos/blob/4a73fc3cdbc94ddce238c695f4723a9c0995f9da/pkg/receive/handler.go#L793 - right? That would happen on the ingester, I think!
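A rough, hypothetical sketch of that idea, reusing the splitByTenantLabel helper from the earlier sketch; writeLocal is only a stand-in for the receiver's local write path, not an actual Thanos function:

```go
package tenantsplit

import (
	"context"

	"github.com/prometheus/prometheus/prompb"
)

// writeLocal is a stand-in for appending one tenant's series to the local TSDB.
func writeLocal(ctx context.Context, tenant string, series []prompb.TimeSeries) error {
	// ... tenant-scoped local write would happen here ...
	return nil
}

// handlePerTenant fans a single remote-write request out into one local write
// per tenant discovered from the configured label.
func handlePerTenant(ctx context.Context, req *prompb.WriteRequest, tenantLabel, defaultTenant string) error {
	for tenant, series := range splitByTenantLabel(req.Timeseries, tenantLabel, defaultTenant) {
		if err := writeLocal(ctx, tenant, series); err != nil {
			return err
		}
	}
	return nil
}
```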

fpetkovski commented 5 months ago

The ingesting receiver does not always have access to the hashring (e.g. in router-ingester split mode). So routers need to know which ingester to send samples to.

verejoel commented 5 months ago

Had some discussions with @MichaHoffmann. We came to the conclusion that the proposed implementation would completely break the current rate-limiting concept. For example, if 1 tenant in a batch of 20 is over the limit, what should Thanos do? A 429 will be retried and result in the whole batch being ingested again. If we drop all metrics in the batch because 1 tenant is over the limit, we have created a noisy-neighbour problem. But if we accept the metrics from the valid 19/20 tenants, then we will have out-of-order issues.

So the current design is incompatible with per-tenant rate limiting as it stands.

sepich commented 5 months ago

Another way to do this is via the existing --receive.relabel-config at the router stage, by exposing the tenant as an internal label (e.g. __meta_tenant_id) and allowing its modification:

- |
    --receive.relabel-config=
    - source_labels: [prometheus]
      target_label: __meta_tenant_id

This is the same way it is done in Mimir: https://github.com/grafana/mimir/pull/4725

Example of changes needed in Thanos: https://github.com/thanos-io/thanos/commit/44a0728f03d9ed1812c4289caa36650c18abff27#diff-42c21b7b04cc61ab0cda17794cc1efff14802e0e89a85503d28601e721c1dd31R849
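A hedged sketch of how the relabel-based variant could look, using the Prometheus relabel package; the __meta_tenant_id name mirrors the comment above, and the two-value Process signature assumes a reasonably recent Prometheus release:

```go
package tenantsplit

import (
	"github.com/prometheus/common/model"
	"github.com/prometheus/prometheus/model/labels"
	"github.com/prometheus/prometheus/model/relabel"
)

// tenantFromRelabel applies the receiver's relabel config to a series' labels
// and reads the tenant from the synthetic __meta_tenant_id label, falling back
// to defaultTenant when the label is absent or the series is dropped.
func tenantFromRelabel(lbls labels.Labels, cfgs []*relabel.Config, defaultTenant string) string {
	out, keep := relabel.Process(lbls, cfgs...)
	if !keep {
		return defaultTenant
	}
	if t := out.Get("__meta_tenant_id"); t != "" {
		return t
	}
	return defaultTenant
}

// Example config equivalent to the YAML above: copy the "prometheus" label
// into __meta_tenant_id.
var exampleCfg = &relabel.Config{
	SourceLabels: model.LabelNames{"prometheus"},
	Separator:    ";",
	Regex:        relabel.MustNewRegexp("(.*)"),
	Replacement:  "$1",
	TargetLabel:  "__meta_tenant_id",
	Action:       relabel.Replace,
}
```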

verejoel commented 5 months ago

@sepich I like that approach. Do you know how Mimir handles per-tenant limits in that situation?

MichaHoffmann commented 5 months ago

@verejoel I think it has the same issues with the rate limit, since all the samples still come from one remote-write request!

GiedriusS commented 5 months ago

Implemented a PoC for this, works really well. A few caveats:

verejoel commented 3 months ago

@GiedriusS any chance you can open a PR for this issue? Would love to try it out.

GiedriusS commented 3 months ago

@verejoel check out https://github.com/thanos-io/thanos/pull/7256

pvlltvk commented 1 month ago

@GiedriusS Hi! I've tried this functionality in v0.35.0, but with no luck. Does it work in router + ingestor mode?

irizzant commented 1 month ago

I've tried --receive.split-tenant-label-name too but it's not working.

Thanos is deployed with the Bitnami Helm chart; here is the receive configuration:

- receive
- --log.level=info
- --log.format=logfmt
- --grpc-address=0.0.0.0:10901
- --http-address=0.0.0.0:10902
- --remote-write.address=0.0.0.0:19291
- --objstore.config=$(OBJSTORE_CONFIG)
- --tsdb.path=/var/thanos/receive
- --label=replica="$(NAME)"
- --label=receive="true"
- --tsdb.retention=15d
- --receive.local-endpoint=127.0.0.1:10901
- --receive.hashrings-file=/var/lib/thanos-receive/hashrings.json
- --receive.replication-factor=1
- --receive.split-tenant-label-name="tenant_id"

I created a ServiceMonitor which adds the label tenant_id=test, but metrics come out with tenant_id=default-tenant.

MichaHoffmann commented 1 month ago

Tenant might be a reserved label iirc:

--receive.tenant-label-name="tenant_id" Label name through which the tenant will be announced.

irizzant commented 1 month ago

So, if I understand correctly, it's impossible to use --receive.split-tenant-label-name="tenant_id" because tenant_id is a reserved label (the default for --receive.tenant-label-name)?

irizzant commented 1 month ago

Also, Receive is adding a tenant label along with tenant_id to the TSDB metrics. Is this expected?