cortexproject / cortex

A horizontally scalable, highly available, multi-tenant, long term Prometheus.
https://cortexmetrics.io/
Apache License 2.0

How to integrate ruler with cortex #2631

Closed: ajayanaj closed this issue 4 years ago

ajayanaj commented 4 years ago

Hello,

I'm looking for documentation on configuring the ruler with Cortex, but I couldn't find much. Is there any documentation for the ruler configuration?

Also, I would like to know how the ruler works. Is it the same way Prometheus evaluates rules from a particular directory? (In Prometheus we can configure the alert rules to be read from a particular directory and push the alerts to Alertmanager, e.g. `rule_files: some-path/*.yaml`.)

Thank you in advance.

gotjosh commented 4 years ago

Hi @ajayanaj!

Thank you for expressing your interest in the ruler.

> Is there any documentation for the ruler configuration?

Some documentation exists on the official Cortex website, but I can imagine there are missing pieces. After going through it, can you let me know what's missing?

> Also, I would like to know how the ruler works. Is it the same way Prometheus evaluates rules from a particular directory? (In Prometheus we can configure the alert rules to be read from a particular directory and push the alerts to Alertmanager.)

It's fairly similar. You'll see in the documentation that instead of providing the file directly from the filesystem, there's an API endpoint for sending the file to the Ruler.
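To make the API-based flow concrete, here is a minimal sketch of building the request that uploads one rule group. It assumes the ruler API is enabled, that the endpoint is `POST /api/v1/rules/{namespace}` (older releases may expose it under a different prefix such as `/api/prom/rules`), and that the tenant is passed in the `X-Scope-OrgID` header. The base URL, tenant ID, namespace, and the rule itself are placeholders, and `build_rule_upload` and `send` are hypothetical helper names. The namespace is simply the name the group is filed under for a tenant, while the group body has the same `name` and `rules` shape as one entry under `groups:` in a Prometheus rule file.

```python
import urllib.parse
import urllib.request

# Placeholder rule group: same shape as one entry under "groups:" in a
# Prometheus rule file. The alert name, expression and labels are examples.
RULE_GROUP_YAML = """\
name: example-group
rules:
  - alert: InstanceDown
    expr: up == 0
    for: 5m
    labels:
      severity: page
    annotations:
      summary: "Instance {{ $labels.instance }} is down"
"""


def build_rule_upload(base_url, tenant_id, namespace, group_yaml):
    """Build (but do not send) a POST that stores one rule group in a namespace."""
    # Assumed path; check the API reference for the Cortex version in use.
    quoted_ns = urllib.parse.quote(namespace, safe="")
    url = f"{base_url.rstrip('/')}/api/v1/rules/{quoted_ns}"
    return urllib.request.Request(
        url,
        data=group_yaml.encode("utf-8"),
        method="POST",
        headers={
            "Content-Type": "application/yaml",
            "X-Scope-OrgID": tenant_id,  # tenant the rules belong to
        },
    )


def send(request, timeout=10):
    """Send the prepared request and return the HTTP status code."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.status
```

For example, `send(build_rule_upload("http://localhost:9009", "tenant-a", "my-namespace", RULE_GROUP_YAML))` would attempt the upload against a local instance. Keeping request construction separate from sending makes it easy to inspect the URL and headers before anything goes over the network.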

Let me know here or in the CNCF Slack if you encounter any other issues while setting it up.

ajayanaj commented 4 years ago

Hey @gotjosh, thanks for your reply! I went through the shared documentation and learned that alerts are managed via the API only. I just tried to set up a single-instance Cortex (using https://cortexmetrics.io/docs/getting-started/getting-started-chunks-storage/) to create some alerts, but I'm not sure how to push the alerts. I'm stuck at "namespace" and "groupname". I believe this groupname is the same as in Prometheus, eg: groups:

It would be helpful if the documentation had more details, such as what a namespace is, how to create alerts (one small example would be enough), and how these alerts are routed to Alertmanager (same as label service?).

Thank you in advance.

ajayanaj commented 4 years ago

@gotjosh I just set up a single-instance Cortex using https://cortexmetrics.io/docs/getting-started/getting-started-blocks-storage/ but I'm unable to use the ruler/Alertmanager on top of it. When I access localhost:9009/prometheus or http://localhost:9009/multitenant_alertmanager/status, it returns a 404. The services list doesn't show the ruler or Alertmanager:

    curl -XGET localhost:9009/services
    table-manager => Running
    querier => Running
    memberlist-kv => Running
    server => Running
    runtime-config => Running
    store => Running
    ring => Running
    query-frontend => Running
    ingester => Running
    distributor => Running
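As a quick diagnostic, the `/services` output can be checked for the modules you expect. The sketch below assumes the endpoint prints one `name => State` pair per line; `parse_services` and `missing_modules` are hypothetical helper names. If `ruler` and `alertmanager` are absent, the process is not running those modules, which would be consistent with their HTTP routes returning 404.

```python
# Sample taken from the output above, one "name => State" pair per line.
SAMPLE = """\
table-manager => Running
querier => Running
memberlist-kv => Running
server => Running
runtime-config => Running
store => Running
ring => Running
query-frontend => Running
ingester => Running
distributor => Running
"""


def parse_services(text):
    """Parse 'name => State' lines into a {name: state} dict."""
    states = {}
    for line in text.splitlines():
        name, sep, state = line.partition("=>")
        if sep:  # skip lines that do not contain the separator
            states[name.strip()] = state.strip()
    return states


def missing_modules(states, wanted=("ruler", "alertmanager")):
    """Return the wanted modules that are not reported as Running."""
    return [module for module in wanted if states.get(module) != "Running"]


print(missing_modules(parse_services(SAMPLE)))
```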

Here is my config:

    ruler:
      evaluation_interval: 30s
      storage:
        configdb:
          configs_api_url: ""
          client_timeout: 5s
      alertmanager_url: http://localhost:9093
      rule_path: /rules
      enable_alertmanager_discovery: false
      alertmanager_refresh_interval: 1m0s
      enable_alertmanager_v2: false
      notification_queue_capacity: 10000
      notification_timeout: 10s
      enable_sharding: false
      search_pending_for: 5m0s

    configs:
      database:
        uri: postgres://psql@localhost/cortex?sslmode=disable
        migrations_dir: ""
        password_file: ""
      api:
        notifications:
          disable_email: false
          disable_webhook: false

    alertmanager:
      data_dir: /home/ubuntu/data
      retention: 120h0m0s
      external_url: "http://xx.xx.xx.xx:9093/"
      poll_interval: 15s
      storage:
        type: configdb
        configdb:
          configs_api_url: ""
          client_timeout: 5s

It would be helpful if you can share how to set up ruler and alertmanager with cortex. Thanks in advance

mvkrishna86 commented 3 years ago

Hi @ajayanaj, I am in the same place you were a year ago. Were you able to make progress on this? If so, can you please give me an example of how to configure the alerting rules? And what is a namespace here?

ajayanaj commented 3 years ago

@mvkrishna86 Yes, I'm able to run the ruler with Alertmanager (not the Cortex Alertmanager). Here is the example I used:

containers:
        - name: {{ .Chart.Name }}
          image: "{{ .Values.image.repository }}:{{ .Values.image.tag }}"
          imagePullPolicy: {{ .Values.image.pullPolicy }}
          args:
          - -target=ruler
          - -log.level=debug
          - -ruler.rule-path=/rules
          - -server.http-listen-port=80
          - -ruler.poll-interval=1m
          - -ruler.notification-queue-capacity=10000
          - -ruler.notification-timeout=30s
          - -ruler.enable-sharding=true
          - -ruler.ring.consul.hostname={{ .Values.consul }}
          - -ruler.storage.type=s3
          - -ruler.storage.s3.url={{ .Values.s3storageurl }}
          - -ruler.storage.s3.buckets={{ .Values.s3bucket }}
          - -ruler.storage.s3.force-path-style=true
          - -ruler.alertmanager-url={{ .Values.alertmanager }}
          - -consul.hostname={{ .Values.consul }}
          - -experimental.ruler.enable-api=true
          - -schema-config-file=/etc/cortex/schema.yaml
          - -dynamodb.url={{ .Values.dynamodb }}
          - -s3.url={{ .Values.s3url }}
          -

When you create a rule in the ruler, it is created under a path like this: /rules/tenant_id/alert-name
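To confirm that rules have landed on disk, a small sketch like the following lists what sits under the rule path, grouped by tenant. It assumes the layout described above (one directory per tenant directly under the `-ruler.rule-path` directory); `rules_by_tenant` is a hypothetical helper name.

```python
from pathlib import Path


def rules_by_tenant(rule_path):
    """Map each tenant directory under rule_path to the sorted file names in it."""
    root = Path(rule_path)
    result = {}
    for tenant_dir in sorted(p for p in root.iterdir() if p.is_dir()):
        # Only regular files are listed; nested directories are ignored.
        result[tenant_dir.name] = sorted(
            f.name for f in tenant_dir.iterdir() if f.is_file()
        )
    return result
```

Calling `rules_by_tenant("/rules")` inside the ruler container would return a dict keyed by tenant ID, which makes it easy to spot a tenant whose rules were never synced.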