Open valeriansaliou opened 5 years ago
It would be nice if you could see the last time there was a problem for a check and how long it lasted.
It would be amazing if you could modernize the status page + add a GUI backend.
@aguilaair modernize? Can you explain what's wrong w/ the current one?
There's nothing wrong with it but an even more minimal design would be awesome.
Hello @valeriansaliou,
Is this thread still active?
I am currently using Vigil and I find it awesome. Some ideas for enhancing it came to me while I was deploying it.
I think that we could add a "kind" field to a probe, which would be an enum. This would allow us to add new types of checks, like a heartbeat. Such a probe would let us monitor systems that cannot be handled by TCP or HTTP checks.
Besides, it would be great to be able to specify the expected status when setting up an HTTP check. For example, this would allow monitoring an API behind a load balancer.
By the way, I will be glad to help you with this.
Hey @FlorentinDUBOIS ; glad to meet you there :)
Great idea. I accept PRs for this.
On the expected HTTP status, it's already possible w/ poll_http_status_healthy_above and poll_http_status_healthy_below, though those are global settings.
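For illustration, here is a minimal sketch of how such a status window could be evaluated. It assumes the lower bound is inclusive and the upper bound exclusive, which is not confirmed in this thread. The function name http_status_is_healthy and the 200/400 defaults are hypothetical, not part of Vigil.

```shell
#!/bin/sh
# Hypothetical helper: succeeds when an HTTP status code lies in [above, below).
# The inclusive/exclusive bounds and the 200/400 defaults are assumptions,
# not taken from Vigil's source.
http_status_is_healthy() {
  status="$1"
  above="${2:-200}"
  below="${3:-400}"

  [ "$status" -ge "$above" ] && [ "$status" -lt "$below" ]
}
```

For example, `http_status_is_healthy 204 && echo healthy` would print "healthy" under these assumed bounds.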
I had not seen those settings, I will try them.
Thanks :D
~It would be great if it was possible to autodiscover/iterate over the available replicas for a particular service somehow?~
~For my specific usecase, I want to run Vigil inside a kubernetes cluster and monitor the available replicas. I think it could be generalised for a number of possible use cases where autodiscovery of the replicas based on the endpoint was possible. (Consul for another example, provides a service DNS name for several replicas.)~
~It's a bit outside Vigil's usecase but I love the style and interface you have for tracking replicas like this :)~
ignore me, it's already requested in #13 :)
It would be great to have incident history reporting, unless I have missed it somewhere :stuck_out_tongue:
Many features have already been listed by other commenters; I can also think of dynamic configuration and a Prometheus exporter. I really liked that Vigil supports monitoring services behind NAT/Firewall.
It would be great to have the possibility to monitor DB health (MySQL, MariaDB, Postgres). I know it isn't that easy; I tried it. I've created a Python script that runs a simple HTTP server and, depending on the URL requested, returns 200 if the DB is OK (URL request looks like "http://ip/
@denisle1981 At Crisp we're using the new Vigil script probe type to monitor DB health, connecting to the DB from the network and monitoring replication status.
Script probes were designed for all those specific monitoring use cases, which cannot be generalized due to being very specific (i.e. backend-specific; I'll never add a custom Redis monitoring probe type, all those use cases should fall back to the script probe type, with a simple shell script).
And, if you cannot connect to MySQL from your Vigil server, you could use vigil-local (running on the MySQL server itself), which would execute the script locally and report any result to Vigil: https://github.com/valeriansaliou/vigil-local
An example script from our Vigil configuration:
[[probe.service.node]]
id = "mysql-replication"
label = "MySQL replication"
mode = "script"
scripts = [
  '''
  status=$(timeout 5 mysql --host="<target_mysql_slave_host>" --user="<target_user>" --password="<target_password>" --execute="SHOW SLAVE STATUS\G;")
  last_error=$(printf "$status" | grep "Last_Error" | cut -d':' -f2 | tr -dc '[:print:]' | sed 's/ //g')
  seconds_behind=$(printf "$status" | grep "Seconds_Behind_Master" | cut -d':' -f2 | tr -dc '[:print:]' | sed 's/ //g')

  if [ ! -z "$last_error" ]; then
    exit 2
  fi

  if [ -z "$seconds_behind" ]; then
    exit 2
  fi

  if [ "$seconds_behind" -lt "600" ]; then
    exit 0
  fi

  exit 1
  '''
]
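The decision logic of the script above can be isolated into a small function, which makes the thresholds easy to check without a MySQL server at hand. This is only a sketch: the function name replication_exit_code is hypothetical, the 600-second threshold is taken from the script above, and the reading of the exit codes (0 as healthy, 1 as degraded, 2 as failed) is an assumption about how Vigil interprets script probes. Unlike the original script, this sketch also returns 2 for a non-numeric lag value such as NULL; that is a choice made here, not the original behavior.

```shell
#!/bin/sh
# Hypothetical helper adapted from the exit-code logic of the script above.
# Arguments: $1 = Last_Error value, $2 = Seconds_Behind_Master value.
# Prints the exit code that a probe script could then return.
replication_exit_code() {
  last_error="$1"
  seconds_behind="$2"

  # Any replication error: report the worst status.
  if [ -n "$last_error" ]; then
    echo 2
    return
  fi

  # Missing or non-numeric lag (assumption: treated as a failure here).
  case "$seconds_behind" in
    ''|*[!0-9]*) echo 2; return ;;
  esac

  # Lag under 600 seconds counts as fine; anything else as degraded.
  if [ "$seconds_behind" -lt 600 ]; then
    echo 0
  else
    echo 1
  fi
}
```

A probe script could then end with something like `exit "$(replication_exit_code "$last_error" "$seconds_behind")"`.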
Thanks @valeriansaliou , I'll try it.
Would it be too difficult to support Gotify for notifications?
I could accept a PR on that. The code of this notifier should be quite similar to Pushover.
Alright. Time to learn Rust in 2 days.
@valeriansaliou alright, #65 is up. So far, Rust seems pretty nice (coming from Go). I still need some time for a deep dive.
May I recommend switching to GitHub Actions? Travis CI seems pretty slow compared to GitHub Actions.
@zllovesuki thanks for the PR, the merge is all good for me now. I'll consider switching to GH Actions, yes, as Travis is not giving away free CI minutes for OSS projects anymore.
Not sure if this is still open, but @valeriansaliou I would love to see support for webhooks, e.g. we need to send an alert from another tool doing some monitoring but want to centralize via your tool.
If you're looking for a Web Hooks notifier, Vigil has one. You can also use the Reporter API to send your own statuses.
I didn't see that in the docs, that is awesome, ty @valeriansaliou
Hello!
If anyone has quick feature suggestions or lacks a feature in Vigil, please use this thread to submit your ideas.