kyma-project / compass-manager

Manager for the compass module
Apache License 2.0

Dependency health checking #112

Open ebensom opened 5 months ago

ebensom commented 5 months ago

Description

Implement periodic health checking of the Compass dependency: query the OAuth2-protected compass-director GraphQL API endpoint at a regular interval in a separate goroutine and keep the latest check result up to date. Expose the current health check result on the Prometheus metrics endpoint as series such as:

{app}_{subsys}_compass_director_health{url="..", status="healthy"} 1
{app}_{subsys}_compass_director_health{url="..", status="error"} 0
{app}_{subsys}_compass_director_health{url="..", status="unknown"} 0

Reasons

Ability to cross-correlate compass-manager errors with director (dependency) errors.

tobiscr commented 2 months ago

@ebensom: Please monitor the compass director API directly to detect service outages.

On our side, we will implement the following new metrics:

Disper commented 1 month ago

Possibly related: https://github.com/kyma-project/infrastructure-manager/issues/138#issuecomment-2098179074