prometheus-community / ipmi_exporter

Remote IPMI exporter for Prometheus
MIT License

Configuration issues old vs new ipmi.yml #68

Closed: hbokh closed this issue 3 years ago

hbokh commented 3 years ago

Long time user of ipmi_exporter here! ❤️ I have finally upgraded our version dating from Aug. 2018 to v1.3.2. Now the changes in ipmi.yml have slapped me in the face pretty hard.

We have (let's say) 56 hosts: 40 hosts with password 12345, 8 hosts with password abcde and another 8 with password 1a2b3c. This used to work with the 40 hosts covered by default and the other 2x 8 set up as "specialhosts", using their unique IP addresses.

Old ipmi.yml used to be pretty small, using default as expected:

credentials:
  default:
    user: "user1"
    pass: "12345"
  10.20.30.10:
    user: "user2"
    pass: "abcde"

--- and so on for 2x 8 hosts

The only way I could get it to work was adding ALL hosts / IP addresses separately one-by-one into the newer ipmi.yml as "module", like this:

modules:
  default:
    user: user1
    pass: 12345
    driver: LAN_2_0
    privilege: user
  10.20.30.10:               # duplicate from default
    user: user1
    pass: 12345
    driver: LAN_2_0
  10.20.30.11:
    user: user2
    pass: abcde
  10.20.30.12:
    user: user3
    pass: a1b2c3
    collectors:
    - bmc
    - ipmi
    - chassis

--- etcetera - up to > 250 lines

The block with default is still needed; without it, a scrape gives:

# curl "http://10.20.30.222:9290/ipmi?target=10.20.30.10"
Unknown module "default"

In /etc/prometheus/prometheus.yml the job is currently configured like this. I recently added the source_labels block with __param_module, as suggested here: https://github.com/soundcloud/ipmi_exporter/issues/50#issuecomment-778351848

  - job_name: 'ipmi'
    scrape_interval: 1m
    scrape_timeout: 30s
    metrics_path: /ipmi
    scheme: http
    file_sd_configs:
      - files:
        - /etc/prometheus/targets/ipmi_exporter.yml
        refresh_interval: 5m
    relabel_configs:
    - source_labels: [__address__]
      separator: ;
      regex: (.*)(:80)?
      target_label: __param_target
      replacement: ${1}
      action: replace
    - source_labels: [__address__]
      separator: ;
      regex: (.*)(:80)?
      target_label: __param_module
      replacement: ${1}
      action: replace
    - separator: ;
      regex: .*
      target_label: __address__
      replacement: ipmi_exporter.example.com:9290
      action: replace

I hope the above explains the situation where we came from and how it runs as of today.

My question: is this for now the right way to go or is there a better way? The newer ipmi.yml has grown considerably, in our case to over 260 lines.

bitfehler commented 3 years ago

Hey there! I understand that it might make your specific use case a little less convenient, but the new-style config is designed to be very explicit about everything, which usually helps prevent errors in the long run. That said, you basically have three options to handle your described setup:

  1. Keep the prometheus config as is, ditch the default module entirely from the exporter config, and instead list a module for each target individually. This solution is often used in larger environments, where the exporter config is often generated from some sort of asset inventory anyways. But yeah, it may not be great if you maintain the config by hand.
  2. Keep the default module (by any name), and split your prometheus config instead. You could have one job with the config you posted for only the "specialhosts", and another job where you always set the module to the fixed value (the name of your "default" module):
    - target_label: __param_module
      replacement: default
      action: replace 
  3. You could keep both the simpler exporter config and the simpler prometheus config if you add labels to your targets in the prometheus config. If you look at the documentation, you can write your /etc/prometheus/targets/ipmi_exporter.yml with labels:
    - targets:
      [ - '<host>' ]
      labels:
      [ <labelname>: <labelvalue> ... ]

    So you could assign a label module to every target, then write a relabel rule that replaces __param_module with the value of module.
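As a rough sketch, a targets file for option 3 could look something like the following. The addresses and the module names special_a and special_b are made up for illustration; each module value would have to match the name of a module defined in ipmi.yml.

```yaml
# /etc/prometheus/targets/ipmi_exporter.yml
# Sketch only: addresses and module names are hypothetical.

# Hosts that share the default credentials
- targets:
    - 10.20.30.101
    - 10.20.30.102
  labels:
    module: default

# First group of "specialhosts"
- targets:
    - 10.20.30.201
  labels:
    module: special_a

# Second group of "specialhosts"
- targets:
    - 10.20.30.211
  labels:
    module: special_b
```

With a layout like this, the exporter config would only need one module per set of credentials (three in the setup described) rather than one per host.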

Reading your description, I think maybe option 3 might be the most suitable here?
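The relabel rule for option 3 might look roughly like this, assuming every target carries a label named module as described in option 3. The rest of the job is taken from the config posted above, relying on the defaults for regex, replacement, and action.

```yaml
# Sketch of the relabel_configs section for the 'ipmi' job.
relabel_configs:
  # Pass the scraped host to the exporter as ?target=...
  - source_labels: [__address__]
    target_label: __param_target
  # Pass the value of the target's "module" label as ?module=...
  - source_labels: [module]
    target_label: __param_module
  # Send the actual scrape request to the exporter itself
  - target_label: __address__
    replacement: ipmi_exporter.example.com:9290
```

Note that a plain target label like module is also attached to the scraped series, which may or may not be wanted.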

Let me know if you need any further details or examples for any of the above approaches!
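For comparison, option 2 could be sketched as two jobs along these lines. The job names and the two targets file names are assumptions for illustration, not something taken from the existing setup.

```yaml
# Sketch only: job names and file paths are hypothetical.
scrape_configs:
  # Hosts with the shared credentials: the module is always "default".
  - job_name: ipmi_default
    metrics_path: /ipmi
    file_sd_configs:
      - files:
          - /etc/prometheus/targets/ipmi_default.yml
    relabel_configs:
      - source_labels: [__address__]
        target_label: __param_target
      - target_label: __param_module
        replacement: default
      - target_label: __address__
        replacement: ipmi_exporter.example.com:9290

  # "specialhosts": the module name is the target's own address.
  - job_name: ipmi_special
    metrics_path: /ipmi
    file_sd_configs:
      - files:
          - /etc/prometheus/targets/ipmi_special.yml
    relabel_configs:
      - source_labels: [__address__]
        target_label: __param_target
      - source_labels: [__address__]
        target_label: __param_module
      - target_label: __address__
        replacement: ipmi_exporter.example.com:9290
```

In this variant the exporter config keeps the default module plus one module per special host, keyed by IP address.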

hbokh commented 3 years ago

Thanks for the response, Conrad. Much appreciated! I'll have a look after the weekend and discuss the options with my co-workers.

hbokh commented 3 years ago

For now we hold on to the current option, but are considering option 3 in a future iteration.