grafana / helm-charts


[Loki] Issues with Implementing Retention Policy and Excluding Specific Namespaces/Pods from Logging #2999

Open · akramsadab opened 7 months ago

akramsadab commented 7 months ago

Kindly help! I am encountering issues with configuring the retention policy in Loki: after applying the change, the pods enter a CrashLoopBackOff state. Additionally, I am seeking guidance on how to exclude certain namespaces and pods from having their logs collected and stored in an S3 bucket. Loki version: 2.9.4

failed parsing config: /etc/loki/config/config.yaml: yaml: unmarshal errors:
  line 17: field retention_deletes_enabled not found in type compactor.Config
  line 19: field retention_period not found in type compactor.Config
Use -config.expand-env=true flag if you want to expand environment variables in your config file.

Please find my configuration.

storage:
  bucketNames:
    chunks: xxxxx-loki-chunks
    ruler: xxxxx-loki-ruler
    admin: xxxxx-admin
  type: s3
  s3:
    region: af-south-1
    s3ForcePathStyle: false
    insecure: false

compactor:
  apply_retention_interval: 1h
  working_directory: /data/loki/compactor  # <-- after commenting out retention_deletes_enabled: true I face issues creating this directory structure, and I'm not sure where exactly to create it
  shared_store: s3
  compaction_interval: 1h
  retention_enabled: true
  retention_delete_delay: 2h
  retention_delete_worker_count: 150
  #retention_deletes_enabled: true
  retention_period: 72h # 3 days
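
For reference, a minimal sketch of a retention config that should parse on Loki 2.9.x, assuming compactor-based retention is what you want: retention_deletes_enabled belongs to the legacy table_manager retention mechanism rather than the compactor, and retention_period belongs under limits_config:

# sketch: retention_deletes_enabled removed, retention_period moved to limits_config
compactor:
  working_directory: /data/loki/compactor
  shared_store: s3
  compaction_interval: 1h
  apply_retention_interval: 1h
  retention_enabled: true
  retention_delete_delay: 2h
  retention_delete_worker_count: 150

limits_config:
  retention_period: 72h  # 3 days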

schema_config:
  configs:

I am looking to optimize our logging by excluding certain namespaces and pods from being logged to reduce storage and processing overhead. I am unsure whether this configuration should be applied in Loki or Promtail, and how to properly configure it.

Any guidance or examples on resolving these issues would be highly appreciated. Thank you in advance for your help!

LeandroSaldivar commented 7 months ago

Hey! Have you checked out this page? https://grafana.com/docs/loki/latest/operations/storage/retention/

For the retention period, make sure to configure it within limits_config:

limits_config:
  retention_period: 744h
  retention_stream:
  - selector: '{namespace="dev"}'
    priority: 1
    period: 24h
  per_tenant_override_config: /etc/overrides.yaml
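
If you use per_tenant_override_config, the referenced file holds per-tenant limits keyed by tenant ID. A minimal sketch of what /etc/overrides.yaml might contain (the tenant name here is a placeholder):

# /etc/overrides.yaml -- per-tenant limit overrides; "tenant-a" is a placeholder ID
overrides:
  tenant-a:
    retention_period: 168h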

Regarding the namespace scrape config, it seems to be a Promtail issue. I've experimented with various configs, but none seem to work. No matter what, all the namespaces' logs are still being scraped. If you come across a solution, I would appreciate it if you could share it.

This is the last config I tried:

  scrape_configs: |
    - job_name: kubernetes-pods
      kubernetes_sd_configs:
        - role: pod
      pipeline_stages:
        - cri: {}
      relabel_configs:
        # keep only targets in the bps or redis namespaces
        - source_labels:
            - __meta_kubernetes_namespace
          regex: ^(bps|redis)$
          action: keep
        # note: this second rule matches every namespace, so it drops
        # the targets the keep rule just retained
        - source_labels:
            - __meta_kubernetes_namespace
          regex: .+
          action: drop
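
If the intent is to keep only the bps and redis namespaces, the keep rule alone should suffice, since keep already discards every non-matching target; a sketch of the same snippet without the catch-all drop rule:

  scrape_configs: |
    - job_name: kubernetes-pods
      kubernetes_sd_configs:
        - role: pod
      pipeline_stages:
        - cri: {}
      relabel_configs:
        # keep retains matching targets and drops everything else
        - source_labels:
            - __meta_kubernetes_namespace
          regex: ^(bps|redis)$
          action: keep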
LeandroSaldivar commented 7 months ago

Finally got the solution to the namespace filtering problem. You need to look for extraRelabelConfigs inside config.snippets in the Promtail values YAML:

extraRelabelConfigs:
  - action: keep
    source_labels:
      - namespace
    regex: ^(namespace1|namespace2)$

And you should be good with that
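
Since the original question also asked about excluding specific pods, the same mechanism should extend to pod names; a sketch assuming the chart's default relabeling has already populated the pod label, with a placeholder pod-name pattern:

extraRelabelConfigs:
  - action: keep
    source_labels:
      - namespace
    regex: ^(namespace1|namespace2)$
  # additionally drop individual pods by name; the pattern is a placeholder
  - action: drop
    source_labels:
      - pod
    regex: ^noisy-pod-.*$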