On 3scale SaaS we use AWS ClientVPN to manage VPN access to our VPCs across different AWS accounts, so we need monitoring around it.
For that reason, this PR adds support for the AWSClientVPN service by providing:
AWSClientVPN Grafana dashboard panels covering the most important metrics.
A metrics configuration file with the required AWSClientVPN metrics, plus some hints, such as setting period_seconds: 300 because this service reports its metrics to CloudWatch (CW) every 5 minutes instead of the default 1 minute.
An example AWSClientVPN PrometheusRule that monitors the number of days until CRL expiration. We manage the CRL with an external tool that updates it regularly, but the tool can fail and we need to know when it does. Because these AWSClientVPN metrics are reported infrequently, gaps sometimes appear in the time series database, so it is recommended to use PromQL queries with functions like max_over_time(aws_clientvpn_crl_days_to_expiry_average[10m]) < 2. This takes the maximum value over the last 10 minutes, which guarantees there is always a value to evaluate, so a firing alert does not disappear intermittently while the underlying problem is still unresolved.
Documentation about services that report metrics less often than others and therefore require special fine-tuning.
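
For illustration, a metrics configuration entry with the 5-minute period could look roughly like the sketch below. It assumes the exporter follows the Prometheus cloudwatch_exporter configuration schema; the region, the Endpoint dimension, and the choice of the CrlDaysToExpiry metric are assumptions made for this example and are not copied from the PR's actual file.

```yaml
# Sketch only: assumes the Prometheus cloudwatch_exporter config schema.
region: us-east-1                    # assumption: replace with the real region
metrics:
  - aws_namespace: AWS/ClientVPN
    aws_metric_name: CrlDaysToExpiry # exported as aws_clientvpn_crl_days_to_expiry_average
    aws_dimensions: [Endpoint]       # assumption: one series per Client VPN endpoint
    aws_statistics: [Average]
    period_seconds: 300              # service reports to CloudWatch every 5 minutes
```

Leaving period_seconds at the 1-minute default would query CloudWatch at a finer granularity than the service publishes, which is what the hint is meant to avoid.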
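
As a rough sketch of the CRL expiration alert described above, a PrometheusRule could be written as follows. Only the expression comes from this description; the resource name, group name, alert name, for duration, severity label, and annotation text are hypothetical placeholders.

```yaml
# Sketch only: all names and the "for" duration are hypothetical.
apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
  name: aws-clientvpn-rules
spec:
  groups:
    - name: aws-clientvpn.rules
      rules:
        - alert: AWSClientVPNCrlExpiringSoon
          # max_over_time bridges gaps between the sparse 5-minute samples
          expr: max_over_time(aws_clientvpn_crl_days_to_expiry_average[10m]) < 2
          for: 5m
          labels:
            severity: warning
          annotations:
            summary: Client VPN CRL expires in less than 2 days
```

The 10-minute window spans two 5-minute reporting periods, so the expression still returns a value even if one sample is missing.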