This adds a monitoring module that sets up the monitoring for the watchdog function within GCP. It includes:
A log based metric that counts the number of health check events received by the function
An alert policy that triggers whenever an event hasn't been emitted for more than 6 hours
A notification channel where the alerts above are sent (victorops)
The only "manual" part is the creation of the webhook url which is a one time thing done in victorops and has to be configured as a variable in terraform.
I tested this by temporarily setting the aggregation time of the metric to 5 minutes and disabling the quicknode notification. The alert was correctly sent to victorops and it auto resolved after re-enabling the quicknode notifications.
This adds a monitoring module that sets up the monitoring for the watchdog function within GCP. It includes:
The only "manual" part is the creation of the webhook url which is a one time thing done in victorops and has to be configured as a variable in terraform.
I tested this by temporarily setting the aggregation time of the metric to 5 minutes and disabling the quicknode notification. The alert was correctly sent to victorops and it auto resolved after re-enabling the quicknode notifications.