Get cluster and node details from the Proxmox API and report them to Zabbix using zabbix_sender.
The script can run on any host with Python, a functional zabbix_sender, and access to the Proxmox API. A Zabbix server or Zabbix proxy is a logical candidate.
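In essence the script queries the Proxmox API and hands each collected value to zabbix_sender. The following is a minimal sketch of that pattern, using the proxmoxer package and the hostnames from the examples in this document; it is an illustration, not the script's actual code, and verify_ssl=False is an assumption for self-signed certificates:

import subprocess
from proxmoxer import ProxmoxAPI

# Hostname, user and password correspond to the -a, -u and -p parameters.
proxmox = ProxmoxAPI("pmx01.your.tld", user="zabbix@pve",
                     password="password", verify_ssl=False)

# /cluster/status contains a single entry of type "cluster" holding the quorate flag.
cluster = next(e for e in proxmox.cluster.status.get() if e["type"] == "cluster")

# Hand the value to zabbix_sender; "proxmox.tokyo.prod" corresponds to the -t parameter.
subprocess.run(["/usr/bin/zabbix_sender", "-c", "/etc/zabbix/zabbix_agentd.conf",
                "-s", "proxmox.tokyo.prod", "-k", "promox.cluster.quorate",
                "-o", str(cluster["quorate"])], check=True)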
pip install proxmoxer
pip install requests
pveum useradd zabbix@pve -comment "Zabbix monitoring user"
pveum passwd zabbix@pve
pveum aclmod / -user zabbix@pve -role PVEAuditor
crontab -e -u zabbix
0 */4 * * * /usr/lib/zabbix/bin/proxmox_cluster.py -a pmx01.your.tld -u zabbix@pve -p password -s -t proxmox.tokyo.prod -d
*/10 * * * * /usr/lib/zabbix/bin/proxmox_cluster.py -a pmx01.your.tld -u zabbix@pve -p password -s -t proxmox.tokyo.prod
The script accepts the following parameters:

-a  Address of a Proxmox API host (e.g. pmx01.your.tld)
-u  Proxmox API user (e.g. zabbix@pve, or zabbix@pam for a PAM user)
-p  Password of the API user
-t  Name of the Zabbix host the template is attached to
-d  Send low-level discovery data instead of item updates
-e  Also collect vHDD information (expensive, see below)
-s  Enable storage monitoring (see below)
-i  Ignore non-zero zabbix_sender return codes when sending discovery data
Getting all vHDD information requires parsing the full VM configuration. That results in one additional API call for each VM to retrieve the configuration, and the subsequent processing relies heavily on regular expressions. As this is an expensive process, it is optional and can be enabled by specifying -e on the command line.
Resources allocated to templates are not included in the total vCPU, vHDD and vRAM numbers reported to Zabbix (see the sketch below).
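A rough sketch of what that parsing could look like, reusing the proxmox connection from the sketch above. It assumes QEMU VMs whose configuration contains disk entries such as scsi0: local-lvm:vm-100-disk-0,size=32G; the exact regular expressions the script uses are an assumption here:

import re

SIZE_RE = re.compile(r"size=(\d+)([KMGT]?)")
FACTOR = {"": 1, "K": 1024, "M": 1024**2, "G": 1024**3, "T": 1024**4}

vhdd_bytes = 0
for vm in proxmox.cluster.resources.get(type="vm"):
    if vm.get("template"):  # templates do not count towards the totals
        continue
    # One additional API call per VM to retrieve the full configuration.
    config = proxmox.nodes(vm["node"]).qemu(vm["vmid"]).config.get()
    for key, value in config.items():
        # Disk entries live in keys such as ide0, sata1, scsi2 and virtio0.
        if re.match(r"(ide|sata|scsi|virtio)\d+$", key):
            match = SIZE_RE.search(str(value))
            if match:
                vhdd_bytes += int(match.group(1)) * FACTOR[match.group(2)]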
If there is no load balancer fronting the API, it makes sense to create multiple scheduled tasks, each targeting a different Proxmox server. This distributes the load and ensures Zabbix remains updated during maintenance or downtime of a single host. An example using cron would look as follows:
# Item updates every 10 minutes
0,20,40 * * * * /usr/lib/zabbix/bin/proxmox_cluster.py -a pmx01.your.tld -u zabbix@pve -p password -t proxmox.tokyo.prod
10,30,50 * * * * /usr/lib/zabbix/bin/proxmox_cluster.py -a pmx02.your.tld -u zabbix@pve -p password -t proxmox.tokyo.prod
# LLD updates every 4 hours
23 0,8,16 * * * /usr/lib/zabbix/bin/proxmox_cluster.py -a pmx01.your.tld -u zabbix@pve -p password -t proxmox.tokyo.prod -d
38 4,12,20 * * * /usr/lib/zabbix/bin/proxmox_cluster.py -a pmx02.your.tld -u zabbix@pve -p password -t proxmox.tokyo.prod -d
One of the Zabbix item keys in the script, and in the template, is prefixed with promox
. That is obviously a typo, but changing it would break compatibility with existing installations, and changing the key in Zabbix would mean losing historical data, which is also undesirable. This is purely a cosmetic issue, but if desired you can of course change the prefix for those items. In that case, also make sure that the keys in the template are updated accordingly.
If you define the Zabbix monitoring user in Linux instead of Proxmox, the -u parameter has to reflect that by using the PAM realm: zabbix@pam
.
Storage monitoring was added to the script later and is not enabled by default, to maintain compatibility. Use the -s parameter to enable it; this needs to be done for both the discovery and the metric collection invocations. For existing installations, the proxmox_cluster_template.xml needs to be imported again as it contains new discovery rules. Alternatively, you can import the proxmox_cluster_storage_addon_template.xml and attach it to your Proxmox cluster host as an additional template. This can be useful if the cluster template was renamed after the original import.
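For reference, per-node storage data is available from the /nodes/{node}/storage endpoint. A minimal sketch of reading it with proxmoxer, reusing the connection from the first sketch (which endpoint and fields the script actually sends is an assumption; the items themselves are defined in the template):

# Each entry describes one storage visible on the node, including usage figures.
for node in proxmox.nodes.get():
    for storage in proxmox.nodes(node["node"]).storage.get():
        print(node["node"], storage["storage"],
              storage.get("used"), storage.get("total"))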
Minimum requirements: Proxmox 5, Python 3.7 and Zabbix 3.0.
Verified with Proxmox 6, Python 3.9 and Zabbix 5.0.
The first step when diagnosing issues is to ensure that zabbix_sender is working and that the target host in Zabbix is configured correctly. Try the following command on the host where the script is going to run; it should return "processed: 1; failed: 0":
[user@zabbix ~]# /usr/bin/zabbix_sender -v -c /etc/zabbix/zabbix_agentd.conf -s proxmox.tokyo.prod -k promox.cluster.quorate -o 1
Response from "127.0.0.1:10051": "processed: 1; failed: 0; total: 1; seconds spent: 0.000036"
sent: 1; skipped: 0; total: 1
The value of the -s parameter is the host you configured in the Zabbix GUI to receive the data and attached the template to; it is also the value you should use for the script's -t parameter. (Please note that the key passed with -k is currently indeed promox.cluster.quorate, the unfortunate typo mentioned under the notes above.)
There have been reports of zabbix_sender returning a partial-fail (2) exit status when sending discovery data. While this results in the script reporting an error, the discovery data is actually processed by the Zabbix server.
You can test sending the discovery data manually as follows:
[user@zabbix ~]# /usr/bin/zabbix_sender -v -c /etc/zabbix/zabbix_agentd.conf -s proxmox.tokyo.prod -k proxmox.nodes.discovery -o '{"data": [{"{#NODE}": "pve01"}, {"{#NODE}": "pve02"}, {"{#NODE}": "pve03"}]}'
We have been unable to replicate the issue. However, the error does not affect the overall functionality: nodes are discovered and will populate in Zabbix, although the script will exit with a non-zero value. If that causes issues in cron, you can use the -i parameter to ignore non-zero zabbix_sender return codes when sending the discovery data.
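Conceptually, the -i behaviour amounts to treating exit status 2 from zabbix_sender as non-fatal for discovery payloads. A sketch of that logic (an illustration with a hypothetical helper, not the script's actual code):

import subprocess

def send_discovery(payload, ignore_partial_fail=True):
    # Exit status 2 signals a partial fail, which the Zabbix server
    # has been observed to process successfully anyway (see above).
    result = subprocess.run(
        ["/usr/bin/zabbix_sender", "-c", "/etc/zabbix/zabbix_agentd.conf",
         "-s", "proxmox.tokyo.prod", "-k", "proxmox.nodes.discovery",
         "-o", payload])
    if result.returncode == 2 and ignore_partial_fail:
        return
    result.check_returncode()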
This software is licensed under the GNU General Public License v3.0.