grafana / helm-charts

Apache License 2.0

[Tempo-distributed] - Unclear override configuration for tempo-distributed #3171

Open Nierhoff opened 1 month ago

Nierhoff commented 1 month ago

Describe the bug: I am trying to increase the rate limits in Tempo, deployed with Helm using the tempo-distributed chart, and have run into issues understanding the documentation.

To Reproduce: I deployed the chart using this command:

helm install tempo-distributed grafana/tempo-distributed --namespace tempo --create-namespace --values distributed.yaml

The content of the values file "distributed.yaml":

storage:
  trace:
    backend: s3
    s3:
      bucket: "tempo-traces"
      endpoint: minio.minio.svc.cluster.local:80
      access_key: tempo
      secret_key: REPLACE
      forcepathstyle: true
      insecure: true

global_overrides:
  per_tenant_override_config: /runtime-config/overrides.yaml
overrides: |-
  overrides:
    defaults:
      ingestion:
        burst_size_bytes: 20000000 # 20MB
        rate_limit_bytes: 20000000 # 20MB
        max_traces_per_user: 10000
        ingestion_burst_size_bytes: 20000000 # 20MB
        ingestion_rate_limit_bytes: 20000000 # 20MB
      global:
        max_bytes_per_trace: 20000000 # 20MB
    "*":
      ingestion:
        burst_size_bytes: 20000000 # 20MB
        rate_limit_bytes: 20000000 # 20MB
        max_traces_per_user: 10000
        ingestion_burst_size_bytes: 20000000 # 20MB
        ingestion_rate_limit_bytes: 20000000 # 20MB
      global:
        max_bytes_per_trace: 20000000 # 20MB

ingester:
  replicas: 1

traces:
  otlp:
    http:
      enabled: true
    grpc:
      enabled: true

I have seen multiple errors; the current one is: "field ingestion not found in type overrides.LegacyOverrides"

My questions are:

  1. How should "global_overrides" be used?
  2. What is the correct format for "overrides:"?

I have been trying to use this as a guide: https://grafana.com/docs/tempo/latest/configuration/#runtime-overrides

Expected behavior: I am sure this is me not fully understanding the documentation. A single example of a values file for my chart version (1.9.9) that sets typical rate limits would be much appreciated. This might be related to https://github.com/grafana/helm-charts/issues/2802

Environment:

Link to issue in helm-chart repo: https://github.com/grafana/helm-charts/issues/3134

pantuza commented 3 weeks ago

There is a bug in the tempo-distributed Helm chart when it parses this new overrides block. I have added a comment on an issue describing how to work around this problem.

Tempo itself is working as expected. If the final YAML file that Helm renders ends up with an overrides block with proper content, Tempo will run correctly. The issue is that the Helm chart isn't generating the overrides block properly.

Also, regarding the specific message in your case (field ingestion not found in type overrides.LegacyOverrides): Tempo is saying that it could not parse your overrides configuration, so it fell back to the legacy overrides format, and the error you see comes from that second attempt. It happens right here in the code. Maybe that helps with troubleshooting.
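To illustrate why the error names LegacyOverrides even though the file uses the new nested keys, here is a minimal Python sketch of a "try the new format, fall back to legacy" parser. It is a simplified model and not Tempo's actual code: the two schemas list only the fields that appear in this issue, and the function names, the ParseError class, and the wording of the new-format error are assumptions made up for illustration. Only the legacy error string is taken from the message quoted above.

```python
# Toy schemas: only the fields mentioned in this issue (an assumption,
# not Tempo's full list of overrides).
NEW_SCHEMA = {
    "ingestion": {"burst_size_bytes", "rate_limit_bytes", "max_traces_per_user"},
    "global": {"max_bytes_per_trace"},
}
LEGACY_FIELDS = {
    "ingestion_burst_size_bytes",
    "ingestion_rate_limit_bytes",
    "max_traces_per_user",
    "max_bytes_per_trace",
}


class ParseError(Exception):
    """Raised when a limits block contains a field the schema does not know."""


def parse_new(limits):
    """Strictly validate a nested (new-format) limits block."""
    for section, fields in limits.items():
        if section not in NEW_SCHEMA or not isinstance(fields, dict):
            # Hypothetical error wording for the new format.
            raise ParseError(f"new format: unknown section {section}")
        for name in fields:
            if name not in NEW_SCHEMA[section]:
                raise ParseError(f"new format: unknown field {section}.{name}")
    return limits


def parse_legacy(limits):
    """Strictly validate a flat (legacy-format) limits block."""
    for name in limits:
        if name not in LEGACY_FIELDS:
            raise ParseError(
                f"field {name} not found in type overrides.LegacyOverrides"
            )
    return limits


def parse_overrides(limits):
    """Try the new format first; on any failure, fall back to legacy.

    The new-format error is discarded, so the caller only ever sees
    the error from the legacy attempt.
    """
    try:
        return "new", parse_new(limits)
    except ParseError:
        return "legacy", parse_legacy(limits)


if __name__ == "__main__":
    # Same shape as the "*" block in distributed.yaml: legacy key names
    # nested inside the new-format "ingestion" section.
    mixed = {
        "ingestion": {
            "burst_size_bytes": 20000000,
            "ingestion_burst_size_bytes": 20000000,
        },
        "global": {"max_bytes_per_trace": 20000000},
    }
    try:
        parse_overrides(mixed)
    except ParseError as err:
        print(err)
```

In this model, the mixed block fails the new-format check because of the legacy key nested under ingestion, and the fallback then rejects ingestion itself. The reported error therefore points at the legacy type rather than at the key that caused the first failure.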

joe-elliott commented 2 weeks ago

Thank you for this issue!

In the spirit of openness I will share that we are considering moving the helm chart into the github.com/grafana/tempo repo, but for now it belongs to github.com/grafana/helm-charts.

Moving this over.