DnPlas commented 1 year ago

Finalise work done for istio-pilot during the Observability Workshop

Branch for integration with Grafana and Prometheus: https://github.com/canonical/istio-operators/tree/KF-819-istio-pilot-cos-integration
Branch for alert rules: https://github.com/canonical/istio-operators/tree/KF-819-istio-pilot-alert-rules
Prometheus deployment: https://github.com/canonical/prometheus-k8s-operator

Design

Failure alerts are implemented through integration with the Prometheus charm from the Canonical Observability Stack. Prometheus creates scrape jobs based on the alert rules defined by the istio-pilot (istiod) charm. It then scrapes the targets, retrieves the defined metrics, and performs the required calculations.

Testing

Set up a MicroK8s cluster and a Juju controller:

microk8s enable dns storage metallb:"10.64.140.43-10.64.140.49,192.168.0.105-192.168.0.111"
juju bootstrap microk8s uk8s
juju add-model test

Deploy Prometheus, istio-pilot and istio-gateway, then relate them:

juju deploy prometheus-k8s --trust
charmcraft pack -v -p ./charms/istio-pilot # assuming you are in the root of this repo
charmcraft pack -v -p ./charms/istio-gateway
juju deploy ./istio-pilot_ubuntu-20.04-amd64.charm --trust --config default-gateway=test
juju deploy ./istio-gateway_ubuntu-20.04-amd64.charm istio-ingressgateway --trust --config kind=ingress
juju relate istio-ingressgateway istio-pilot
juju relate prometheus-k8s istio-pilot

Navigate to the Prometheus dashboard at https://<Prometheus-unit-IP>:9090 and select Status -> Targets. There should be a Prometheus scrape job entry that targets the istio-pilot (istiod) metrics endpoint (http://<istiod-IP>:15014/metrics) with no errors.

Received alerts can also be verified under the Alerts tab.
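
The same checks can probably be done from a terminal instead of the dashboard. The following is only a rough sketch: it assumes curl is available and that the Prometheus HTTP API is reachable under /api/v1/ on the same address as the dashboard. The helper names (prom_url, prom_get, has_istiod_target) are made up for illustration and are not part of the charm or of this repo.

```shell
# Hypothetical helpers for checking the scrape target from the command line.

# Build the URL of a Prometheus HTTP API endpoint.
# $1 = dashboard base URL, $2 = endpoint name (e.g. targets, alerts, rules)
prom_url() {
  # ${1%/} drops a single trailing slash from the base URL, if present
  printf '%s/api/v1/%s\n' "${1%/}" "$2"
}

# Fetch an API endpoint and print the JSON response.
# -f fails on HTTP errors, -sS hides progress output but still shows errors.
prom_get() {
  curl -fsS "$(prom_url "$1" "$2")"
}

# Read JSON on stdin and succeed if it mentions the istiod metrics endpoint
# (port 15014, as used in the testing steps above).
has_istiod_target() {
  grep -q ':15014/metrics'
}

# Usage (replace the placeholder with the real unit IP):
#   prom_get "https://<Prometheus-unit-IP>:9090" targets | has_istiod_target \
#     && echo "istiod scrape target found"
#   prom_get "https://<Prometheus-unit-IP>:9090" alerts
```

This only confirms that the target is listed. Whether the target is healthy and which alerts are firing still has to be read from the returned JSON or from the dashboard.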

i-chvets commented 1 year ago

Jira

DnPlas commented 1 year ago

Fixed by #132 and #133
