canonical / kfp-operators

Kubeflow Pipelines Operators
Apache License 2.0

KFP API integration tests with grafana and prometheus need to be addressed #204

Closed i-chvets closed 1 year ago

i-chvets commented 1 year ago

Description

KFP API integration tests with grafana and prometheus need to be addressed. grafana-k8s gets stuck in the unknown state and the test times out. The test is currently skipped; it needs to be reviewed and re-implemented.

Logs from CI

tests/integration/test_charm.py::test_prometheus_grafana_integration 
-------------------------------- live log call ---------------------------------
INFO     juju.model:model.py:2088 Deploying ch:amd64/focal/prometheus-k8s-103
INFO     juju.model:model.py:2088 Deploying ch:amd64/focal/grafana-k8s-64
INFO     juju.model:model.py:2088 Deploying ch:amd64/focal/prometheus-scrape-config-k8s-39
INFO     juju.model:model.py:2715 Waiting for model:
  grafana-k8s/0 [allocating] waiting: installing agent
  prometheus-k8s/0 [allocating] waiting: installing agent
  prometheus-scrape-config-k8s/0 [allocating] waiting: installing agent
INFO     juju.model:model.py:2715 Waiting for model:
  grafana-k8s/0 [allocating] waiting: agent initializing
  prometheus-k8s/0 [idle] waiting: Waiting for resource limit patch to apply
  prometheus-scrape-config-k8s/0 [executing] active: 
INFO     juju.model:model.py:2715 Waiting for model:
  grafana-k8s/0 [executing] unknown: 
. . .
  prometheus-k8s/0 [idle] waiting: Waiting for resource limit patch to apply
INFO     juju.model:model.py:2715 Waiting for model:
  grafana-k8s/0 [idle] unknown: 
  prometheus-k8s/0 [idle] blocked: Failed to apply resource limit patch: Unauthorized
INFO     juju.model:model.py:2715 Waiting for model:
  grafana-k8s/0 [idle] unknown: 
. . . 
  prometheus-k8s/0 [idle] blocked: Failed to apply resource limit patch: Unauthorized
INFO     juju.model:model.py:2715 Waiting for model:
  grafana-k8s/0 [idle] unknown:
FAILED
------------------------------ live log teardown -------------------------------
INFO     pytest_operator.plugin:plugin.py:756 Model status:

Model            Controller                Cloud/Region        Version  SLA          Timestamp
test-charm-q6wo  github-pr-b04fb-microk8s  microk8s/localhost  2.9.42   unsupported  18:06:25Z

App                           Version                Status   Scale  Charm                         Channel         Rev  Address         Exposed  Message
grafana-k8s                                          waiting      1  grafana-k8s                   stable           64  10.152.183.4    no       installing agent
kfp-api                                              waiting      1  kfp-api                                         0  10.152.183.249  no       installing agent
kfp-db                        mariadb/server:10.3    active       1  charmed-osm-mariadb-k8s       stable           35  10.152.183.250  no       ready
kfp-viz                       res:oci-image@3de6f3c  waiting      1  kfp-viz                       2.0/stable      281  10.152.183.9    no       
minio                         res:oci-image@1755999  waiting      1  minio                         ckf-1.7/stable  186  10.152.183.122  no       
prometheus-k8s                                       waiting      1  prometheus-k8s                stable          103  10.152.183.251  no       installing agent
prometheus-scrape-config-k8s  n/a                    active       1  prometheus-scrape-config-k8s  stable           39  10.152.183.52   no       

Unit                             Workload     Agent  Address      Ports              Message
grafana-k8s/0*                   unknown      idle   10.1.109.23                     
kfp-api/0*                       maintenance  idle   10.1.109.9                      Workload failed health check
kfp-db/0*                        active       idle   10.1.109.15  3306/TCP           ready
kfp-viz/0*                       waiting      idle   10.1.109.16  8888/TCP           waiting for container
minio/0*                         waiting      idle   10.1.109.18  9000/TCP,9001/TCP  waiting for container
prometheus-k8s/0*                blocked      idle   10.1.109.20                     Failed to apply resource limit patch: Unauthorized
prometheus-scrape-config-k8s/0*  active       idle   10.1.109.22    
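
For triage, the Unit table in the teardown status above can be reduced to just the units whose workload is not active. The following is a minimal sketch, not part of the repository's test suite: the function name `non_active_units` is hypothetical, and it assumes the whitespace-separated column layout shown in the log (unit, workload, agent, then the rest).

```python
def non_active_units(unit_table: str) -> dict[str, str]:
    """Map unit name -> workload status for every unit that is not active."""
    stuck = {}
    for line in unit_table.strip().splitlines():
        fields = line.split()
        # Skip the header row and any line too short to hold unit, workload and agent.
        if len(fields) < 3 or fields[0] == "Unit":
            continue
        # Juju marks the leader unit with a trailing "*"; drop it from the name.
        unit, workload = fields[0].rstrip("*"), fields[1]
        if workload != "active":
            stuck[unit] = workload
    return stuck


# Unit table copied from the teardown log in this issue.
UNIT_TABLE = """
Unit                             Workload     Agent  Address      Ports              Message
grafana-k8s/0*                   unknown      idle   10.1.109.23
kfp-api/0*                       maintenance  idle   10.1.109.9                      Workload failed health check
kfp-db/0*                        active       idle   10.1.109.15  3306/TCP           ready
kfp-viz/0*                       waiting      idle   10.1.109.16  8888/TCP           waiting for container
minio/0*                         waiting      idle   10.1.109.18  9000/TCP,9001/TCP  waiting for container
prometheus-k8s/0*                blocked      idle   10.1.109.20                     Failed to apply resource limit patch: Unauthorized
prometheus-scrape-config-k8s/0*  active       idle   10.1.109.22
"""

for unit, workload in non_active_units(UNIT_TABLE).items():
    print(f"{unit}: {workload}")
```

Only the first two columns are read, so empty Ports or Message cells do not affect the result.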
i-chvets commented 1 year ago

Fix is merged.