Closed GhaziTriki closed 3 weeks ago
Hey,
Sure. I can see if I can do this tonight. I have tested it with Zabbix 6.4 and 7.0. I cannot guarantee that everything will work perfectly, as we have only just set up our first mini-cluster.
But TL;DR: import the YAML into the templates in Zabbix, and these macros need to be populated on the hosts:
{$MINIO.DNS.NAME}
{$MINIO.DNS.WILDCARD}
{$S3.DRIVES.PER.ERASURE.SET}
{$S3.MINIO.ACCESS.KEY}
{$S3.MINIO.API.URL}
{$S3.MINIO.MAX.DEAD.DRIVES} # this will be replaced in a future version, because I want to calculate the max dead drives via the parity and set-size metrics
{$S3.MINIO.PARITY}
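The planned calculation of the max dead drives from parity and set size could look roughly like the sketch below. It assumes MinIO's documented quorum rule (read quorum is set size minus parity; write quorum is the same, plus one drive when parity is exactly half of the set). The function name and the returned keys are illustrative and not part of the template.

```python
def max_dead_drives(set_size: int, parity: int) -> dict:
    """Estimate how many drives one erasure set can lose.

    Assumption: read quorum = set_size - parity; write quorum is the same,
    plus one drive when parity is exactly half of the set.
    """
    if set_size <= 0 or not 0 <= parity <= set_size // 2:
        raise ValueError("parity must be between 0 and set_size // 2")
    read_quorum = set_size - parity
    write_quorum = read_quorum + 1 if parity * 2 == set_size else read_quorum
    return {
        "read": set_size - read_quorum,    # drives that may fail, reads still work
        "write": set_size - write_quorum,  # drives that may fail, writes still work
    }


# Hypothetical example: 8 drives per erasure set with EC:3 parity
print(max_dead_drives(8, 3))
```

The more conservative "write" value is probably the safer one to feed into a trigger threshold.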
Great @KjellWolf, we have 5 bare-metal cluster, with 1 load balancer among them. I would be happy to be your beta-tester 😁
I've updated the comment; it seems I had made the change already... :D If you have ideas for triggers, let me know too.
Also, it may be nice to know that this uses the Prometheus metrics.
Added a small README. If something isn't clear, just hit me up.
I tried now but I got "Cannot perform request: Received HTTP/0.9 when not allowed". Can the readme be clearer?
Hey
I don't know exactly at which step this comes up, but I think it's about the {$S3.MINIO.API.URL} macro.
Did you specify https, like https://my.loadbalancer.dev? Can you share as much as possible about the setup?
Once I understand where this comes from, I'm happy to update the README.
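A quick way to rule out a malformed {$S3.MINIO.API.URL} value is a small check like the one below. This is only a sketch: the helper name is made up, and a missing scheme is just one possible cause of the HTTP/0.9 error, not a confirmed diagnosis.

```python
from urllib.parse import urlsplit


def check_api_url(value: str) -> list:
    """Return a list of problems found in a {$S3.MINIO.API.URL} value."""
    problems = []
    parts = urlsplit(value.strip())
    if parts.scheme not in ("http", "https"):
        problems.append("URL should start with http:// or https://")
    if not parts.hostname:
        problems.append("URL has no host name")
    return problems


print(check_api_url("https://my.loadbalancer.dev"))  # no problems expected
print(check_api_url("my.loadbalancer.dev"))          # scheme is missing
```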
Currently I have the following cluster
Let's assume the following:
Based on that I put the config macros:
/etc/default/minio
MINIO_STORAGE_CLASS_STANDARD=EC:3
MINIO_STORAGE_CLASS_RRS=EC:2
minio
job name in /etc/prometheus/prometheus.yml
What would be the right configuration?
Oh, I think now I know where I could have messed up.
In my example config I install the template on each node. (Though I think a solution that runs on the LB / standalone only would be preferable.)
So a config for one node would be:
{$MINIO.DNS.NAME} = node1.minio.example.dev:9000 # specified with the internal communication port, default 9000
{$MINIO.DNS.WILDCARD} = node1*
{$S3.MINIO.ACCESS.KEY} = Here is a difference: I do not run Prometheus directly on the nodes.
I got mine with this command: mc admin config get [alias] prometheus
{$S3.MINIO.API.URL} = I set it up with http / https at the start
Drives per set and parity are not important for the connection, just for the calculations, so it should still work.
I just tore down my test MinIO for a server swap. I will start testing soon. But I hope this helps?
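For several nodes, the per-node values above can be generated instead of typed by hand. The sketch below only builds the macro values following the node1 example; the helper name, the second node, and the load balancer URL are assumptions, and the access key is left out on purpose because it is a secret.

```python
def node_macros(node: str, domain: str, api_url: str, port: int = 9000) -> dict:
    """Build the connection macros for one MinIO node, following the node1 example."""
    return {
        "{$MINIO.DNS.NAME}": f"{node}.{domain}:{port}",  # internal communication port
        "{$MINIO.DNS.WILDCARD}": f"{node}*",
        "{$S3.MINIO.API.URL}": api_url,                  # must include http:// or https://
    }


# Hypothetical two-node example behind one load balancer
for name in ("node1", "node2"):
    print(node_macros(name, "minio.example.dev", "https://my.loadbalancer.dev"))
```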
It is a good idea to monitor each node separately. Maybe the wording of the docs wasn't clear. I have Prometheus installed on the load balancer. However, it makes sense to do it like you did. Let me give it a try.
Yeah, that way every node can monitor metrics from the other ones.
Just note that the triggers will alert for every node: if 1 drive fails in a 4-node cluster, Zabbix will alert 4 times (once for each node).
Looks better now. A question: {$S3.MINIO.API.URL} = I setup with http / https at the start, you mean the HTTPS URL of the node itself?
The following metrics are failing:
Minio S3 Software Commit Info (Hash)
MinIO S3 Cluster Objects Size Distribution BETWEEN_1_MB_AND_10_MB
MinIO S3 Cluster Objects Size Distribution BETWEEN_10_MB_AND_64_MB
MinIO S3 Cluster Objects Size Distribution BETWEEN_64_KB_AND_256_KB
MinIO S3 Cluster Objects Size Distribution BETWEEN_64_MB_AND_128_MB
MinIO S3 Cluster Objects Size Distribution BETWEEN_128_MB_AND_512_MB
MinIO S3 Cluster Objects Size Distribution BETWEEN_256_KB_AND_512_KB
MinIO S3 Cluster Objects Size Distribution BETWEEN_512_KB_AND_1_MB
MinIO S3 Cluster Objects Size Distribution BETWEEN_1024_B_AND_1_MB
MinIO S3 Cluster Objects Size Distribution BETWEEN_1024_B_AND_64_KB
MinIO S3 Cluster Objects Size Distribution GREATER_THAN_512_MB
MinIO S3 Cluster Objects Size Distribution LESS_THAN_1024_B
MinIO S3 Cluster Objects Version Distribution BETWEEN_2_AND_10
MinIO S3 Cluster Objects Version Distribution BETWEEN_10_AND_100
MinIO S3 Cluster Objects Version Distribution BETWEEN_100_AND_1000
MinIO S3 Cluster Objects Version Distribution BETWEEN_1000_AND_10000
MinIO S3 Cluster Objects Version Distribution GREATER_THAN_10000
MinIO S3 Cluster Objects Version Distribution SINGLE_VERSION
MinIO S3 Heal Time Last Activity
It would also be nice to add a tag with key "component" and value "minio" to all the metrics, to easily filter them.
Just for your eyes looks perfect !
I will open another ticket for improvements.
API URL means the LB in my setup, because my Zabbix can't reach the intercom VLAN network of the MinIO setup. Calling the nodes directly (with the fitting port) should work too, but no guarantee at this point.
Adding the tags is a WIP; there I give you the right to complain, but it wasn't a priority until now.
Sometimes the template needs 3-4 rounds to get all data. Some data gets created over a time period or plainly just does not exist. Therefore I would need an export of all the Prometheus data generated to check it.
I'm happy for further improvements. I'll keep this issue open for the tags and for making the README clearer. Thanks for the input so far :)
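To tell "not collected yet" apart from "does not exist", one option is to save the Prometheus output of a node and check which metric names actually appear in it. The sketch below parses the plain text exposition format; the metric names in the sample are assumptions for illustration and may not match what a given MinIO version exports.

```python
def present_metrics(exposition: str) -> set:
    """Collect metric names from Prometheus text-format output."""
    names = set()
    for line in exposition.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and HELP/TYPE comments
        # The metric name ends at the first "{" (labels) or space (value).
        names.add(line.split("{", 1)[0].split(" ", 1)[0])
    return names


def missing_metrics(exposition: str, expected: list) -> list:
    """Return the expected metric names that do not appear in the output."""
    found = present_metrics(exposition)
    return [name for name in expected if name not in found]


# Assumed sample output and metric names, for illustration only
sample = """# TYPE minio_cluster_objects_size_distribution gauge
minio_cluster_objects_size_distribution{range="LESS_THAN_1024_B"} 12
minio_software_version_info{version="example"} 1
"""
expected = [
    "minio_cluster_objects_size_distribution",
    "minio_heal_time_last_activity_nano_seconds",
]
print(missing_metrics(sample, expected))
```

A name reported as missing here would point to data the cluster has not produced, rather than to a problem in the template item.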
Long time no see! Sorry, I got really ill.
Have now added the component:minio tag.
Hope this resolves this conversation!
Hello,
We want to use this in Zabbix 7.0. Could you please explain the configuration in the README file?