Prometheus collector and exporter for metrics extracted from the Slurm resource scheduling system.
This project was forked from https://github.com/vpenso/prometheus-slurm-exporter and, for now, aims to remain backwards-compatible with it on SLURM 23.11 and later. This means the existing Grafana dashboard should plug directly into this exporter and work roughly the same.
Unlike previous Slurm exporters, this project uses the SLURM REST API (slurmrestd) for data retrieval.
Because of that difference, you are no longer required to run this exporter on a cluster node: the exporter does not need SLURM installed locally, it only needs to be able to reach your slurmrestd server over the network.
I will be releasing containerized versions of this exporter soon.
This repository contains precompiled binaries for the three most recent major versions of SLURM (note: currently only two versions, but this will become three once 24.11 is released).
On the releases page, download the newest version of the exporter that matches your SLURM version.
The included systemd file assumes you've saved this binary to /usr/local/sbin/prometheus-slurm-exporter, so drop it there or take note to change the systemd file if you choose to use it.
The exporter requires several environment variables to be set:
SLURM_EXPORTER_LISTEN_ADDRESS
This should be the full address for the exporter to listen on.
Default: 0.0.0.0:8080
SLURM_EXPORTER_API_URL
This is the URL to your slurmrestd server.
Example: http://head1.domain.edu:6820
SLURM_EXPORTER_API_USER
The user specified in the token command.
SLURM_EXPORTER_API_TOKEN
This is the SLURM token to authenticate against slurmrestd.
The easiest way to generate this is by running the following line on your head node:
scontrol token username=myuser lifespan=someseconds
myuser should probably be the slurm user, or some other privileged account.
lifespan is specified in seconds. I set mine for 1 year (lifespan=31536000).
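To check a token before handing it to the exporter, something like the following Python sketch may help. It is not part of this project. It assumes that scontrol token prints a single SLURM_JWT=<token> line, that slurmrestd accepts the X-SLURM-USER-NAME and X-SLURM-USER-TOKEN headers, and that your server exposes a v0.0.40 ping endpoint; the API version in particular is a guess, so adjust it to whatever your slurmrestd serves. The function names are made up for illustration.

```python
import urllib.request


def token_from_scontrol(output: str) -> str:
    """Return the bare token from `scontrol token` output.

    Assumes the output looks like 'SLURM_JWT=<token>'; anything else is
    returned unchanged (minus surrounding whitespace).
    """
    line = output.strip()
    prefix = "SLURM_JWT="
    return line[len(prefix):] if line.startswith(prefix) else line


def build_ping_request(api_url: str, user: str, token: str,
                       api_version: str = "v0.0.40") -> urllib.request.Request:
    """Build (but do not send) an authenticated ping request for slurmrestd.

    api_version is an assumption; use the version your server provides.
    """
    url = f"{api_url.rstrip('/')}/slurm/{api_version}/ping"
    return urllib.request.Request(url, headers={
        "X-SLURM-USER-NAME": user,    # same value as SLURM_EXPORTER_API_USER
        "X-SLURM-USER-TOKEN": token,  # same value as SLURM_EXPORTER_API_TOKEN
    })


def ping(request: urllib.request.Request, timeout: float = 10.0) -> int:
    """Send the request and return the HTTP status code."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.status
```

Calling ping(build_ping_request("http://head1.domain.edu:6820", "myuser", token)) should return 200 if the URL, user, and token all line up; an HTTP error at this point usually means the exporter would be rejected too.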
A systemd unit file is included for ease of deployment.
This unit file assumes you've written your environment variables to /etc/prometheus-slurm-exporter/env.conf in the format:
SLURM_EXPORTER_API_URL="http://head.domain.edu:6820"
SLURM_EXPORTER_API_USER="root"
SLURM_EXPORTER_API_TOKEN="mytoken"
Don't forget to chmod 600 /etc/prometheus-slurm-exporter/env.conf, since it contains your token!
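If you want to sanity-check the file before starting the service, a small Python sketch along these lines could do it. This is only an illustration, not something shipped with the exporter: the helper names are hypothetical, and the parser handles just the simple KEY="value" lines shown above rather than everything systemd accepts.

```python
import os
import stat

# The variables without a documented default.
REQUIRED_KEYS = (
    "SLURM_EXPORTER_API_URL",
    "SLURM_EXPORTER_API_USER",
    "SLURM_EXPORTER_API_TOKEN",
)


def parse_env_file(text: str) -> dict:
    """Parse simple KEY="value" lines, skipping blanks and # comments."""
    values = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        value = value.strip()
        # Drop one pair of matching surrounding quotes, if present.
        if len(value) >= 2 and value[0] == value[-1] and value[0] in "\"'":
            value = value[1:-1]
        values[key.strip()] = value
    return values


def missing_keys(values: dict) -> list:
    """Return the required variables that are absent or empty."""
    return [key for key in REQUIRED_KEYS if not values.get(key)]


def mode_is_owner_only(mode: int) -> bool:
    """True if group and others have no permissions (e.g. 0o600)."""
    return mode & 0o077 == 0


def file_is_owner_only(path: str) -> bool:
    """Check the permission bits of an existing file."""
    return mode_is_owner_only(stat.S_IMODE(os.stat(path).st_mode))
```

For example, file_is_owner_only("/etc/prometheus-slurm-exporter/env.conf") should be True after the chmod, and missing_keys(parse_env_file(open(path).read())) should come back empty.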
This is an example scrape config for your Prometheus server:
scrape_configs:
  - job_name: 'slurm_exporter'
    scrape_interval: 30s
    scrape_timeout: 30s
    static_configs:
      - targets: ['exporter_host.domain.edu:8080']
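Before pointing Prometheus at the exporter, it can be handy to look at what the target actually returns. The Python sketch below is one way to do that, under the assumption that the exporter serves the standard Prometheus text format at /metrics (the default metrics_path for a scrape job that does not override it). The function names are hypothetical and not part of this project.

```python
import urllib.request


def fetch_metrics(target: str, timeout: float = 30.0) -> str:
    """Fetch the raw exposition text from host:port, assuming /metrics."""
    with urllib.request.urlopen(f"http://{target}/metrics", timeout=timeout) as response:
        return response.read().decode("utf-8")


def metric_names(exposition_text: str) -> set:
    """Collect the metric names from Prometheus text-format output.

    Comment lines (# HELP / # TYPE) and blank lines are skipped; the name
    is everything before the first '{' or whitespace on a sample line.
    """
    names = set()
    for raw_line in exposition_text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#"):
            continue
        name = line.split("{", 1)[0].split(None, 1)[0]
        names.add(name)
    return names
```

Running sorted(metric_names(fetch_metrics("exporter_host.domain.edu:8080"))) lists the metrics the exporter currently exposes, which is a quick way to confirm the target in the scrape config is reachable.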
The dashboard published by the previous author should work the same with this exporter. I will be releasing a new version of the dashboard soon that will receive new features.
Check out the CONTRIBUTING.md document.