heidsoft / cloud-bigdata-book


Prometheus+Grafana #66

Open heidsoft opened 5 years ago

heidsoft commented 5 years ago

Usage

How to Write Rules for Prometheus
How To Monitor Linux Servers Using Prometheus Node Exporter
Monitoring your Linux Servers with Prometheus and Grafana in 7 Minutes
How to Monitor Linux Server Performance with Prometheus and Grafana in 5 minutes
Install Prometheus Server on CentOS 7 and Ubuntu 18.04
Monitoring your own Linux machine with Prometheus and Grafana

Reference links

https://yunlzheng.gitbook.io/prometheus-book/parti-prometheus-ji-chu/alert/prometheus-alert-rule
https://awesome-prometheus-alerts.grep.to/rules.html
https://prometheus.io/docs/prometheus/latest/configuration/alerting_rules/
https://alex.dzyoba.com/blog/prometheus-alerts/
https://gist.github.com/devops-school/98d7eed1a9df6c372c45452730791f7a
https://www.metricfire.com/blog/top-5-prometheus-alertmanager-gotchas/
https://www.weave.works/blog/labels-in-prometheus-alerts-think-twice-before-using-them
https://softwareadept.xyz/2018/01/how-to-write-rules-for-prometheus/
https://blog.networktocode.com/post/prometheus_alerting/
https://www.devopsschool.com/blog/recording-rules-and-alerting-rules-exmplained-in-prometheus/
https://blog.csdn.net/shida_csdn/article/details/81980021
https://gitlab.cern.ch/paas-tools/monitoring/prometheus-webhook-receiver/-/tree/master
https://superuser.com/questions/443406/how-can-i-produce-high-cpu-load-on-a-linux-server
https://yunlzheng.gitbook.io/prometheus-book/parti-prometheus-ji-chu/alert/alert-manager-config
https://www.jianshu.com/p/fd0b018539cd
https://help.aliyun.com/document_detail/123117.html?utm_content=g_1000230851&spm=5176.20966629.toubu.3.f2991ddcpxxvD1#h2-alertmanagers88
https://github.com/prometheus/alertmanager/blob/master/api/v2/openapi.yaml
https://github.com/gin-gonic/gin#quick-start
https://www.programmersought.com/article/50413971111/
https://songjiayang.gitbooks.io/prometheus/content/configuration/rule_files.html

Reload the configuration

[root@localhost prometheus-2.19.1.linux-amd64]# curl -v -X POST http://172.16.59.100:9090/-/reload
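For scripted reloads, the same POST can be issued from Python's standard library. This is a minimal sketch, not an official client: the helper names build_reload_request and reload_prometheus are made up for illustration, and the server address is the one from the curl command above. Prometheus only serves /-/reload when it was started with --web.enable-lifecycle.

```python
import urllib.request


def build_reload_request(base_url: str) -> urllib.request.Request:
    """Build the POST request for the Prometheus /-/reload endpoint."""
    url = base_url.rstrip("/") + "/-/reload"
    # An empty body with method="POST" mirrors `curl -X POST`.
    return urllib.request.Request(url, data=b"", method="POST")


def reload_prometheus(base_url: str, timeout: float = 5.0) -> int:
    """Send the reload request and return the HTTP status code."""
    request = build_reload_request(base_url)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.status


# Usage (needs a reachable server, so it is left commented out):
# print(reload_prometheus("http://172.16.59.100:9090"))
```

A failed reload (for example, an invalid configuration file) comes back as a non-2xx response, which urlopen raises as urllib.error.HTTPError.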

heidsoft commented 5 years ago

How to Use Prometheus to Monitor Your CentOS 7 Server
Linux Track NFS Directory / Disk I/O Stats

heidsoft commented 3 years ago

Understanding Prometheus relabel_configs

Understanding Prometheus relabel_configs
Service discovery under Kubernetes
Prometheus service discovery mechanisms

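By default, once Prometheus has loaded its Target instances, each Target carries a set of default labels:

__address__: the address of the Target instance, in the form <host>:<port>

__scheme__: the HTTP scheme used to scrape the Target, either HTTP or HTTPS

__metrics_path__: the path on the Target from which metrics are scraped

__param_<name>: a request parameter sent to the Target when it is scraped

These labels tell Prometheus how to fetch monitoring data from the Target. Labels prefixed with __ are generally reserved for internal use, so they are not written to the sample data. There are exceptions, however: every sample collected by Prometheus carries a label named instance, whose value is the Target's __address__. What happens here is in fact a label rewrite.

This mechanism, which rewrites a Target's labels before its samples are scraped, is called Relabeling in Prometheus.

[Figure: when Relabeling takes effect]

Prometheus lets users add custom Relabeling steps to a scrape job through relabel_configs.

Managing labels with replace/labelmap/labelkeep/labeldrop
The complete relabel_config configuration is as follows: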

# The source labels select values from existing labels. Their content is concatenated
# using the configured separator and matched against the configured regular expression
# for the replace, keep, and drop actions.
[ source_labels: '[' <labelname> [, ...] ']' ]

# Separator placed between concatenated source label values.
[ separator: <string> | default = ; ]

# Label to which the resulting value is written in a replace action.
# It is mandatory for replace actions. Regex capture groups are available.
[ target_label: <labelname> ]

# Regular expression against which the extracted value is matched.
[ regex: <regex> | default = (.*) ]

# Modulus to take of the hash of the source label values.
[ modulus: <uint64> ]

# Replacement value against which a regex replace is performed if the
# regular expression matches. Regex capture groups are available.
[ replacement: <string> | default = $1 ]

# Action to perform based on regex matching.
[ action: <relabel_action> | default = replace ]
Here action defines how the current relabel_config processes the Target's metadata labels; the default action is replace.
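The defaults in the schema above can be summarised in a small Python dataclass. This is only an illustrative mirror of the fields listed there; the class name RelabelConfig and the helper joined_source are hypothetical and are not part of Prometheus.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class RelabelConfig:
    """Hypothetical mirror of one relabel_config entry and its defaults."""
    source_labels: List[str] = field(default_factory=list)
    separator: str = ";"
    target_label: Optional[str] = None
    regex: str = "(.*)"
    modulus: Optional[int] = None
    replacement: str = "$1"
    action: str = "replace"

    def joined_source(self, labels: Dict[str, str]) -> str:
        """Concatenate the source label values; a missing label counts as empty."""
        return self.separator.join(labels.get(name, "") for name in self.source_labels)


rule = RelabelConfig(source_labels=["__scheme__", "__address__"])
print(rule.action)
print(rule.joined_source({"__scheme__": "https", "__address__": "10.0.0.1:9100"}))
```

With no fields set, the entry behaves as a replace step that matches anything with (.*) and writes the first capture group; the joined value is what the regex is matched against.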

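replace matches regex against the values of source_labels (the values of multiple source labels are concatenated with separator) and writes replacement, with its capture-group references expanded, to target_label. When the regex has several capture groups, ${1}, ${2} and so on select what is written. If nothing matches, target_label is not rewritten. For example: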

  - job_name: 'kubernetes-kubelet'
    scheme: https
    tls_config:
      ca_file: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
    bearer_token_file: /var/run/secrets/kubernetes.io/serviceaccount/token
    kubernetes_sd_configs:
      - role: node
    relabel_configs:
      - target_label: __address__
        replacement: kubernetes.default.svc:443
      - source_labels: [__meta_kubernetes_node_name]
        regex: (.+)
        target_label: __metrics_path__
        replacement: /api/v1/nodes/${1}/proxy/metrics
The target label __metrics_path__ is set to /api/v1/nodes/${1}/proxy/metrics, where ${1} is the content that the regular expression (.+) captures from the value of __meta_kubernetes_node_name.
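The replace step can be simulated in a few lines of Python to check what a rule will produce. This is a rough sketch under stated assumptions, not Prometheus's implementation: it uses Python's re module instead of Go's RE2 syntax, supports only numeric $1 / ${1} references, and the function names and the sample node name node-1 are invented for illustration.

```python
import re


def expand(template: str, match: re.Match) -> str:
    """Expand $1 / ${1} references (numeric groups only, a simplification)."""
    def ref(m: re.Match) -> str:
        index = int(m.group(1) or m.group(2))
        if index > match.re.groups:
            return ""  # reference to a group that does not exist
        return match.group(index) or ""
    return re.sub(r"\$\{(\d+)\}|\$(\d+)", ref, template)


def relabel_replace(labels, source_labels, target_label,
                    regex="(.*)", replacement="$1", separator=";"):
    """Return a copy of labels after one replace step."""
    value = separator.join(labels.get(name, "") for name in source_labels)
    match = re.fullmatch(regex, value)  # the regex is anchored at both ends
    if match is None:
        return dict(labels)  # no match: target_label is left unchanged
    result = dict(labels)
    result[target_label] = expand(replacement, match)
    return result


# The two rules from the kubernetes-kubelet job above, applied in order.
target = {"__address__": "10.0.0.5:10250", "__meta_kubernetes_node_name": "node-1"}
target = relabel_replace(target, [], "__address__",
                         replacement="kubernetes.default.svc:443")
target = relabel_replace(target, ["__meta_kubernetes_node_name"], "__metrics_path__",
                         regex="(.+)", replacement="/api/v1/nodes/${1}/proxy/metrics")
print(target["__address__"])
print(target["__metrics_path__"])
```

The first rule has no source_labels, so the default (.*) matches the empty string and the literal replacement is written unconditionally; the second substitutes the node name into the proxy path.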

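labelmap, by contrast, matches regex against the names (note: the names, not the values) of all labels on the Target, uses the captured content as the name of a new label, and copies the value of the matched label to it. For example: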

- job_name: 'kubernetes-nodes'
  kubernetes_sd_configs:
  - role: node
  relabel_configs:
  - action: labelmap
    regex: __meta_kubernetes_node_label_(.+)
Given the original label: __meta_kubernetes_node_label_test=tttt

the resulting label is: test=tttt
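The mapping from __meta_kubernetes_node_label_test to test can be reproduced with a short Python sketch. It assumes the default replacement of $1 and uses Python's re module in place of Go's regex engine; relabel_labelmap is a hypothetical helper name.

```python
import re


def relabel_labelmap(labels: dict, regex: str) -> dict:
    """Copy every label whose name fully matches regex to the captured name."""
    result = dict(labels)
    for name, value in labels.items():
        match = re.fullmatch(regex, name)  # matched against names, not values
        if match:
            result[match.group(1)] = value  # default replacement is $1
    return result


target = {
    "__address__": "10.0.0.5:10250",
    "__meta_kubernetes_node_label_test": "tttt",
}
mapped = relabel_labelmap(target, r"__meta_kubernetes_node_label_(.+)")
print(mapped["test"])
```

labelmap copies rather than renames, so the original __meta_ label is still present in the result; Prometheus discards labels that begin with __ only after all relabeling steps have run.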

labelkeep and labeldrop filter the labels of a Target, keeping only the labels that satisfy the filter condition. For example:

relabel_configs:
  - regex: label_should_drop_(.+)
    action: labeldrop
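This configuration matches regex against the names of all labels on the current Target and removes the labels whose names match. labelkeep does the opposite: it removes every label whose name does not match regex.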
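As a sketch of the difference between the two actions, the following Python functions filter a label set by name. The function names and the sample labels are made up for illustration, and Python's re module stands in for Go's regex engine.

```python
import re


def relabel_labeldrop(labels: dict, regex: str) -> dict:
    """Remove every label whose name fully matches regex."""
    return {n: v for n, v in labels.items() if not re.fullmatch(regex, n)}


def relabel_labelkeep(labels: dict, regex: str) -> dict:
    """Keep only the labels whose name fully matches regex."""
    return {n: v for n, v in labels.items() if re.fullmatch(regex, n)}


target = {"job": "node", "label_should_drop_tmp": "1", "instance": "10.0.0.5:9100"}
print(relabel_labeldrop(target, r"label_should_drop_(.+)"))
print(relabel_labelkeep(target, r"job|instance"))
```

Both calls leave only job and instance here. With labelkeep it is easy to remove more than intended, because every label not named by the regex is dropped.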

Filtering Target instances with keep/drop

scrape_configs:
  - job_name: node_exporter
    consul_sd_configs:
      - server: localhost:8500
        services:
          - node_exporter
    relabel_configs:
    - source_labels:  ["__meta_consul_dc"]
      regex: "dc1"
      action: keep
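The configuration above keeps a Target only if the value of its "__meta_consul_dc" label matches "dc1". Note that this filters Targets, not individual metrics, and that Prometheus anchors the regex at both ends, so the value must be exactly dc1 rather than merely contain it.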

When action is set to keep, Prometheus discards the Target instances whose source_labels value does not match the regex; when action is set to drop, it discards the Target instances whose source_labels value does match the regex.
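The keep/drop behaviour, including the fully anchored match, can be checked with a small simulation. This is an illustrative sketch: filter_targets and the three sample targets are hypothetical, and Python's re.fullmatch is used to mimic the anchoring.

```python
import re


def filter_targets(targets, source_labels, regex, action, separator=";"):
    """Apply one keep or drop step to a list of target label dicts."""
    if action not in ("keep", "drop"):
        raise ValueError("action must be 'keep' or 'drop'")
    remaining = []
    for labels in targets:
        value = separator.join(labels.get(name, "") for name in source_labels)
        matched = re.fullmatch(regex, value) is not None
        # keep retains matching targets; drop retains non-matching ones.
        if matched == (action == "keep"):
            remaining.append(labels)
    return remaining


targets = [
    {"__address__": "10.0.0.1:9100", "__meta_consul_dc": "dc1"},
    {"__address__": "10.0.0.2:9100", "__meta_consul_dc": "dc2"},
    {"__address__": "10.0.0.3:9100", "__meta_consul_dc": "dc10"},
]
kept = filter_targets(targets, ["__meta_consul_dc"], "dc1", "keep")
dropped = filter_targets(targets, ["__meta_consul_dc"], "dc1", "drop")
print([t["__address__"] for t in kept])
print([t["__address__"] for t in dropped])
```

Only the dc1 target survives keep; the dc10 target is not retained even though its value contains dc1, because the whole value has to match.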

heidsoft commented 3 years ago

https://opensource.actionsky.com/20200622-prometheus/
https://my.oschina.net/u/4383725/blog/4314559
https://www.cnblogs.com/zhaojiedi1992/p/zhaojiedi_liunx_61_prometheus_relabel.html
https://yunlzheng.gitbook.io/prometheus-book/part-ii-prometheus-jin-jie/sd/service-discovery-with-relabel
https://www.jianshu.com/p/cef0e145d3e0
https://www.iloxp.com/archive/11/