voxpupuli / puppet-prometheus

Puppet module for prometheus
https://forge.puppet.com/puppet/prometheus
Apache License 2.0

"job_name: prometheus" duplication #730

Closed sahaqaa closed 1 month ago

sahaqaa commented 1 month ago

Affected Puppet, Ruby, OS and module versions/distributions

How to reproduce (e.g Puppet code you use)

content of Puppetfile

mod 'puppet-prometheus', '14.0.0'

content of "data/roles/prometheus.yaml"

classes:
  - role::prometheus

profile::prometheus::alertmanager_url:
  - 'localhost:9093'

profile::prometheus::prometheus_version: '2.36.2'
profile::prometheus::alertmanager_version: '0.24.0'
profile::prometheus::prometheus_user: 'prometheus'

profile::prometheus::scrape_configs:
  - job_name: 'prometheus'
    scrape_interval: '30s'
    scrape_timeout: '30s'
    static_configs:
    - targets:
        - 'localhost:9090'
      labels:
        alias: 'prometheus'
    relabel_configs:
      - source_labels: [group]
        replacement: '$1'
        target_label: instance
  - job_name: 'push_gateway'
    scrape_interval: '30s'
    scrape_timeout: '30s'
    honor_labels: true
    static_configs:
    - targets:
      - 'localhost:9091'
      labels:
        alias: 'pushgateway'

content of "site/profile/manifests/prometheus.pp"

  $prometheus_version  = lookup('profile::prometheus::prometheus_version', String)
  $prometheus_user     = lookup('profile::prometheus::prometheus_user', String)
  $scrape_configs      = lookup('profile::prometheus::scrape_configs', Array[Hash], 'deep')

  $alertmanager_url    = lookup('profile::prometheus::alertmanager_url', Tuple)
  $route_prefix        = lookup('profile::prometheus::prometheus_route_prefix', String)
  $alerts              = lookup('profile::prometheus::alerts')
  $hostname            = $trusted['certname']

  $external_url        = "https://${trusted['certname']}/prometheus"
  $monitor             = regsubst($trusted['certname'], '\..*$', '')

  class { 'prometheus':
    manage_prometheus_server => true,
    max_open_files           => 49152,
    extra_groups             => ['exporterauth'],
    global_config            => {
      'scrape_interval'     => '1m',
      'scrape_timeout'      => '10s',
      'evaluation_interval' => '30s',
      'external_labels'     => { 'monitor' => $monitor },
    },
    external_url             => $external_url,
    web_route_prefix         => $route_prefix,
    version                  => $prometheus_version,
    web_listen_address       => '127.0.0.1:9090',
    alerts                   => $alerts,
    scrape_configs           => $scrape_configs,
    storage_retention        => '1440h',

    alertmanagers_config     => [
      {
        'static_configs' => [{ 'targets' => $alertmanager_url }],
      },
    ],
  }

What are you seeing

Previously we ran Puppet Server 5.3.11 with version 10.2.0 of the "puppet-prometheus" module. We are upgrading to Puppet 8 and switched to the latest version of the module, 14.0.0.

After the puppet agent applied the changes (against Puppet 8), the Prometheus systemd service was not running and reported this error:

level=error msg="Error loading config (--config.file=/etc/prometheus/prometheus.yaml)" file=/etc/prometheus/prometheus.yaml err="parsing YAML file /etc/prometheus/prometheus.yaml: found multiple scrape configs with job name \"prometheus\""

We searched our puppet-control-repo and found only one occurrence. But when we opened /etc/prometheus/prometheus.yaml, we saw that there were actually two "job_name: prometheus" entries: the one we define, and a second one of unknown origin.

When we commented out the unknown "job_name: prometheus" entry and started the prometheus systemd service, everything worked with no errors.

So for now we have set "chattr +i /etc/prometheus/prometheus.yaml" as a temporary workaround, so that the agent cannot overwrite the file.

But I'm curious why this unknown "job_name: prometheus" entry appeared. When we run the puppet agent, it wants to make the following change:

Notice: /Stage[main]/Prometheus::Config/File[prometheus.yaml]/content: 
--- /etc/prometheus/prometheus.yaml 2024-05-30 08:35:25.892338099 +0000
+++ /tmp/puppet-file20240530-3798265-j03a9g 2024-05-30 08:46:09.698061743 +0000
@@ -8,14 +8,14 @@
 rule_files:
 - "/etc/prometheus/alert.rules"
 scrape_configs:
-  #- job_name: prometheus
-  #  scrape_interval: 10s
-  #  scrape_timeout: 10s
-  #  static_configs:
-  #  - targets:
-  #    - localhost:9090
-  #    labels:
-  #      alias: Prometheus
+- job_name: prometheus
+  scrape_interval: 10s
+  scrape_timeout: 10s
+  static_configs:
+  - targets:
+    - localhost:9090
+    labels:
+      alias: Prometheus
 - job_name: prometheus
   scrape_interval: 30s
   scrape_timeout: 30s

What behavior did you expect instead

No unknown duplicate "job_name: prometheus" entry should appear.

Output log

Any additional information you'd like to impart

sahaqaa commented 1 month ago

UPD: I checked the code and found that the module has an "$include_default_scrape_configs" parameter. I've set it to "false" and now all is good. I guess I was too quick to create a GitHub issue :-)
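
For anyone hitting the same error, here is a minimal sketch of that change, assuming the class declaration from the profile above. Only the include_default_scrape_configs line is new; the comment about its effect is my reading of the diff in this issue, not something confirmed against the module source.

```puppet
class { 'prometheus':
  manage_prometheus_server       => true,
  # Assumption: with this set to false, the module no longer adds its own
  # default "prometheus" job, so only the jobs looked up from Hiera are rendered.
  include_default_scrape_configs => false,
  scrape_configs                 => $scrape_configs,
  # remaining parameters stay as in the profile shown earlier in this issue
}
```

With the default job gone, the "job_name: prometheus" entry defined in data/roles/prometheus.yaml should be the only one left in /etc/prometheus/prometheus.yaml. The "chattr +i" workaround then needs to be reverted ("chattr -i") so the agent can manage the file again.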