grafana / helm-charts

[loki] Ruler won't work even though rules are there and it's up and running - using Helm to bring up in k8s #775

Open UrosCvijan opened 3 years ago

UrosCvijan commented 3 years ago

We have a k8s cluster in which we used Helm to deploy Loki as a single binary. It is working fine with Azure Blob as storage, and we wanted to add a ruler so that alerts are sent to our Alertmanager. We followed all the instructions and created the rules; the chart created a configMap for them and mounted it on the pod. The ruler starts, but for some reason it is neither evaluating the rules nor sending alerts. Below is the config from values.yaml, and below that the info/error output from Loki's startup. I noticed that the configMap for alerts is mounted inside the /rules directory, which is the default location according to the docs anyway. Is there anything we are missing?

Loki-stack chart version: loki-stack-2.4.1
App version: v2.1.0

loki:
  enabled: true
  config:
    schema_config:
      configs:
      - from: "2021-03-03"
        store: boltdb-shipper
        object_store: azure
        schema: v11
        index:
          prefix: index_
          period: 24h
    storage_config:
      azure:
        account_key: ${STORAGE_ACCOUNT_KEY}
        account_name: ${STORAGE_ACCOUNT}
        container_name: ${STORAGE_CONTAINER}
        request_timeout: 0  
      boltdb_shipper:
        active_index_directory: /data/loki/boltdb-shipper-active
        cache_location: /data/loki/boltdb-shipper-cache
        cache_ttl: 24h
        shared_store: azure
    compactor:
      working_directory: /data/loki/boltdb-shipper-compactor
      shared_store: azure
    ruler:
      storage:
        type: azure
        azure:
          environment: AzureGlobal
          container_name: ruler
          account_key: ${STORAGE_ACCOUNT_KEY}
          account_name: ${STORAGE_ACCOUNT}
      rule_path: /rules
      alertmanager_url: http://kube-prometheus-stack-alertmanager:9093
      ring:
        kvstore:
          store: inmemory
      enable_api: true
      enable_alertmanager_v2: true

  alerting_groups:
    - name: example
      rules:
      - alert: HighThroughputLogStreams
        expr: sum by(container) (rate({job=~"observability/loki"}[1m])) > 1000
        for: 2m  

At startup there is a warning that it can't remove a user directory (which it created itself during startup):

level=info ts=2021-11-05T09:01:49.752708389Z caller=mapper.go:38 msg="cleaning up mapped rules directory" path=/rules
level=warn ts=2021-11-05T09:01:49.75279869Z caller=mapper.go:51 msg="unable to remove user directory" path=/rules/..2021_11_05_09_01_48.652771454 err="unlinkat /rules/..2021_11_05_09_01_48.652771454: read-only file system"
level=info ts=2021-11-05T09:01:49.756396748Z caller=module_service.go:59 msg=initialising module=memberlist-kv
level=info ts=2021-11-05T09:01:49.75649075Z caller=module_service.go:59 msg=initialising module=store
level=info ts=2021-11-05T09:01:49.75652265Z caller=module_service.go:59 msg=initialising module=server
level=info ts=2021-11-05T09:01:49.758530382Z caller=module_service.go:59 msg=initialising module=ring
level=info ts=2021-11-05T09:01:49.758643784Z caller=module_service.go:59 msg=initialising module=ingester
level=info ts=2021-11-05T09:01:49.758739086Z caller=lifecycler.go:521 msg="not loading tokens from file, tokens file path is empty"
level=info ts=2021-11-05T09:01:49.758827187Z caller=client.go:247 msg="value is nil" key=collectors/ring index=1
level=info ts=2021-11-05T09:01:49.758926189Z caller=lifecycler.go:550 msg="instance not found in ring, adding with no tokens" ring=ingester
level=info ts=2021-11-05T09:01:49.759301695Z caller=module_service.go:59 msg=initialising module=compactor
level=info ts=2021-11-05T09:01:49.759815103Z caller=lifecycler.go:397 msg="auto-joining cluster after timeout" ring=ingester
level=info ts=2021-11-05T09:01:49.760100108Z caller=module_service.go:59 msg=initialising module=ingester-querier
level=info ts=2021-11-05T09:01:49.760163609Z caller=module_service.go:59 msg=initialising module=distributor
level=info ts=2021-11-05T09:01:49.760314211Z caller=module_service.go:59 msg=initialising module=ruler
level=info ts=2021-11-05T09:01:49.760390912Z caller=ruler.go:403 msg="ruler up and running"
level=info ts=2021-11-05T09:01:49.760570015Z caller=module_service.go:59 msg=initialising module=table-manager
level=info ts=2021-11-05T09:01:49.760919921Z caller=loki.go:248 msg="Loki started"
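
One thing that may be worth checking, offered as an unconfirmed sketch rather than a verified fix: the config above points the ruler's `storage` at an Azure container named `ruler`, while the rules themselves live in a configMap mounted on the local filesystem, and `rule_path` (the ruler's scratch directory for temporary rule files) points at that same read-only mount, which would be consistent with the "read-only file system" warning. The fragment below assumes the configMap is the intended rule source and switches to the ruler's `local` storage type. The scratch path `/data/loki/rules-temp` is a hypothetical name, chosen only because `/data/loki` is already used for writable data in this config.

```yaml
# Sketch only, not a confirmed fix for this issue.
loki:
  config:
    ruler:
      storage:
        type: local              # read rule files from disk instead of Azure
        local:
          directory: /rules      # assumed: where the rules configMap is mounted
      # Hypothetical writable scratch directory; must not be the
      # read-only configMap mount.
      rule_path: /data/loki/rules-temp
      alertmanager_url: http://kube-prometheus-stack-alertmanager:9093
      ring:
        kvstore:
          store: inmemory
      enable_api: true
      enable_alertmanager_v2: true
```

With local storage the ruler expects one subdirectory per tenant under `directory`, for example `/rules/<tenant>/rules.yaml`; when multi-tenancy is disabled the tenant ID is `fake`. Whether the chart version used here mounts the configMap at that depth is not established in this thread and would need to be checked on the running pod.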

UrosCvijan commented 2 years ago

Any update on this? Am I doing anything wrong here?