canonical / katib-operators

Operators for Katib which is part of Charmed Kubeflow.
Apache License 2.0

katib bundle integration tests stuck in the CI with `ERROR juju.client.connection:connection.py:665 RPC: Automatic reconnect failed` #209

Closed NohaIhab closed 4 months ago

NohaIhab commented 4 months ago

Bug Description

The katib bundle integration tests never finish executing in the CI; they are eventually cancelled after 6 hours (the GitHub runner limit). From the logs, it looks like all the tests have passed, but pytest-operator never exits at the end. This is currently blocking the CI in main and might also affect other branches. Example run
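To make a hang like this show up as a quick failure instead of a 6-hour cancellation, one option is to bound the suspect awaited step with a timeout. The following is only a minimal sketch using the standard library: `hanging_teardown` and `run_with_timeout` are hypothetical names used for illustration, and it assumes the hang sits in an awaited, cancellable coroutine, which has not been confirmed for this issue.

```python
import asyncio


async def run_with_timeout(coro, seconds: float):
    """Await `coro`, raising asyncio.TimeoutError if it exceeds `seconds`."""
    return await asyncio.wait_for(coro, timeout=seconds)


async def hanging_teardown():
    # Hypothetical stand-in for a teardown step that never returns.
    await asyncio.sleep(3600)


async def main() -> str:
    try:
        await run_with_timeout(hanging_teardown(), seconds=0.1)
    except asyncio.TimeoutError:
        # The hang is reported here instead of blocking the whole run.
        return "timed out"
    return "finished"


result = asyncio.run(main())
print(result)
```

With a bound like this the job would fail at the step that hangs, which should make the offending call easier to locate in the logs.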

To Reproduce

Trigger a CI run in main.

Environment

juju 3.4/stable
microk8s 1.29-strict/stable
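As a possible diagnostic (not a confirmed cause): the `pip freeze` output in the log lists `juju==3.2.2`, while the model status in the same log reports version 3.4.4. The sketch below shows one hypothetical way to flag that kind of client/controller drift by comparing major.minor versions. The function names are illustrative only, and whether the mismatch is related to the hang is an assumption.

```python
from typing import Tuple


def major_minor(version: str) -> Tuple[int, int]:
    """Return the (major, minor) pair of a dotted version string."""
    parts = version.split(".")
    return int(parts[0]), int(parts[1])


def client_matches_controller(client_version: str, controller_version: str) -> bool:
    """True when the client and the controller share the same major.minor."""
    return major_minor(client_version) == major_minor(controller_version)


# Values taken from the log: pip freeze shows juju==3.2.2, model status shows 3.4.4.
mismatch = not client_matches_controller("3.2.2", "3.4.4")
print(mismatch)
```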

Relevant Log Output

See the first comment; the log cannot be added here due to https://github.com/canonical/gh-jira-sync-bot/issues/42

Additional Context

No response

NohaIhab commented 4 months ago

Relevant Log Output

Added 'kubeflow' model on microk8s/localhost with credential 'microk8s' for user 'admin'
bundle-integration: install_deps> python -I -m pip install -r requirements-integration.txt
bundle-integration: freeze> python -m pip freeze --all
bundle-integration: anyio==4.0.0,asttokens==2.4.0,backcall==0.2.0,bcrypt==4.0.1,cachetools==5.3.1,certifi==2023.7.22,cffi==1.15.1,charset-normalizer==3.2.0,cryptography==41.0.3,decorator==5.1.1,exceptiongroup==1.1.3,executing==1.2.0,google-auth==2.22.0,h11==0.14.0,httpcore==0.17.3,httpx==0.24.1,hvac==1.2.0,idna==3.4,iniconfig==2.0.0,ipdb==0.13.13,ipython==8.12.2,jedi==0.19.0,Jinja2==3.1.2,juju==3.2.2,kubernetes==27.2.0,lightkube==0.14.0,lightkube-models==1.28.1.4,macaroonbakery==1.3.1,MarkupSafe==2.1.3,matplotlib-inline==0.1.6,mypy-extensions==1.0.0,oauthlib==3.2.2,packaging==23.1,paramiko==2.12.0,parso==0.8.3,pexpect==4.8.0,pickleshare==0.7.5,pip==24.1,pluggy==1.3.0,prompt-toolkit==3.0.39,protobuf==3.20.3,ptyprocess==0.7.0,pure-eval==0.2.2,pyasn1==0.5.0,pyasn1-modules==0.3.0,pycparser==2.21,Pygments==2.16.1,pyhcl==0.4.5,pymacaroons==0.13.0,PyNaCl==1.5.0,pyRFC3339==1.1,pytest==7.4.2,pytest-asyncio==0.21.1,pytest-operator==0.29.0,python-dateutil==2.8.2,pytz==2023.3.post1,PyYAML==6.0.1,requests==2.31.0,requests
bundle-integration: commands[0]> pytest -v --tb native --show-capture=no --log-cli-level=INFO -s --model kubeflow /home/runner/work/katib-operators/katib-operators/tests/integration
============================= test session starts ==============================
platform linux -- Python 3.8.10, pytest-7.4.2, pluggy-1.3.0 -- /home/runner/work/katib-operators/katib-operators/.tox/bundle-integration/bin/python
cachedir: .tox/bundle-integration/.pytest_cache
rootdir: /home/runner/work/katib-operators/katib-operators
configfile: pyproject.toml
plugins: operator-0.29.0, asyncio-0.21.1, anyio-4.0.0
asyncio: mode=strict
collecting ... collected 12 items
tests/integration/test_charms.py::test_deploy_katib_charms 
-------------------------------- live log setup --------------------------------
INFO     pytest_operator.plugin:plugin.py:675 Connecting to existing model github-pr-fb181-microk8s:kubeflow on unspecified cloud
WARNING  juju.client.connection:connection.py:927 unexpected facade UserSecretsDrain received from the controller
WARNING  juju.client.connection:connection.py:927 unexpected facade UserSecretsManager received from the controller
-------------------------------- live log call ---------------------------------
INFO     pytest_operator.plugin:plugin.py:526 Using tmp_path: /home/runner/work/katib-operators/katib-operators/.tox/bundle-integration/tmp/pytest/kubeflow0
INFO     pytest_operator.plugin:plugin.py:975 Building charm katib-controller
INFO     pytest_operator.plugin:plugin.py:980 Built charm katib-controller in 612.53s
INFO     pytest_operator.plugin:plugin.py:526 Using tmp_path: /home/runner/work/katib-operators/katib-operators/.tox/bundle-integration/tmp/pytest/kubeflow0
INFO     pytest_operator.plugin:plugin.py:975 Building charm katib-db-manager
INFO     pytest_operator.plugin:plugin.py:980 Built charm katib-db-manager in 125.68s
INFO     pytest_operator.plugin:plugin.py:526 Using tmp_path: /home/runner/work/katib-operators/katib-operators/.tox/bundle-integration/tmp/pytest/kubeflow0
INFO     pytest_operator.plugin:plugin.py:975 Building charm katib-ui
INFO     pytest_operator.plugin:plugin.py:980 Built charm katib-ui in 1007.46s
INFO     juju.model:model.py:2069 Deploying local:focal/katib-controller-0
INFO     juju.model:model.py:2069 Deploying local:focal/katib-db-manager-0
INFO     juju.model:model.py:2069 Deploying local:focal/katib-ui-0
INFO     juju.model:model.py:2069 Deploying ch:amd64/jammy/mysql-k8s-153
WARNING  juju.model:model.py:1558 relate is deprecated and will be removed. Use integrate instead.
WARNING  juju.model:model.py:1558 relate is deprecated and will be removed. Use integrate instead.
INFO     juju.model:model.py:2759 Waiting for model:
  katib-controller/0 [allocating] waiting: agent initialising
  katib-db-manager/0 [allocating] waiting: installing agent
  katib-ui/0 [allocating] waiting: installing agent
  katib-db/0 [allocating] waiting: installing agent
INFO     juju.model:model.py:2759 Waiting for model:
  katib-controller/0 [executing] maintenance: Reconciling charm: executing component kubernetes:auths-webhooks-crds-configmaps
  katib-db-manager/0 [idle] waiting: Waiting for relational-db data
  katib-ui/0 [idle] active: 
  katib-db/0 [allocating] waiting: agent initialising
INFO     juju.model:model.py:2759 Waiting for model:
  katib-controller/0 [idle] active: 
  katib-db-manager/0 [idle] waiting: Waiting for relational-db data
  katib-db/0 [executing] active: Primary
INFO     juju.model:model.py:2759 Waiting for model:
  katib-db-manager/0 [idle] active: 
  katib-db/0 [idle] active: Primary
INFO     juju.model:model.py:2069 Deploying ch:amd64/focal/kubeflow-profiles-393
INFO     juju.model:model.py:2759 Waiting for model:
  katib-controller/0 [idle] active: 
  katib-db-manager/0 [idle] active: 
  katib-ui/0 [idle] active: 
  katib-db/0 [idle] active: Primary
  kubeflow-profiles/0 [allocating] waiting: installing agent
INFO     juju.model:model.py:2759 Waiting for model:
  katib-db/0 [idle] active: Primary
  kubeflow-profiles/0 [idle] active: 
PASSED
------------------------------ live log teardown -------------------------------
INFO     pytest_operator.plugin:plugin.py:783 Model status:
Model     Controller                Cloud/Region        Version  SLA          Timestamp
kubeflow  github-pr-fb181-microk8s  microk8s/localhost  3.4.4    unsupported  10:03:25Z
App                Version                  Status  Scale  Charm              Channel      Rev  Address         Exposed  Message
katib-controller                            active      1  katib-controller                  0  10.152.183.186  no       
katib-db           8.0.36-0ubuntu0.22.04.1  active      1  mysql-k8s          8.0/stable   153  10.152.183.35   no       
katib-db-manager                            active      1  katib-db-manager                  0  10.152.183.158  no       
katib-ui                                    active      1  katib-ui                          0  10.152.183.110  no       
kubeflow-profiles                           active      1  kubeflow-profiles  latest/edge  393  10.152.183.251  no       
Unit                  Workload  Agent  Address       Ports  Message
katib-controller/0*   active    idle   10.1.244.201         
katib-db-manager/0*   active    idle   10.1.244.202         
katib-db/0*           active    idle   10.1.244.205         Primary
katib-ui/0*           active    idle   10.1.244.203         
kubeflow-profiles/0*  active    idle   10.1.244.206         
INFO     pytest_operator.plugin:plugin.py:789 Juju error logs:
unit-katib-controller-0: 10:01:08 ERROR unit.katib-controller/0.juju-log Empty or missing data. Got: No data found in relation k8s-service-info data bag.
unit-katib-controller-0: 10:01:08 ERROR unit.katib-controller/0.juju-log Empty or missing data. Got: No data found in relation k8s-service-info data bag.
unit-katib-controller-0: 10:01:09 ERROR unit.katib-controller/0.juju-log Empty or missing data. Got: No data found in relation k8s-service-info data bag.
unit-katib-controller-0: 10:01:09 ERROR unit.katib-controller/0.juju-log Empty or missing data. Got: No data found in relation k8s-service-info data bag.
unit-katib-controller-0: 10:01:09 ERROR unit.katib-controller/0.juju-log Empty or missing data. Got: No data found in relation k8s-service-info data bag.
unit-katib-controller-0: 10:01:13 ERROR unit.katib-controller/0.juju-log Empty or missing data. Got: No data found in relation k8s-service-info data bag.
unit-katib-controller-0: 10:01:14 ERROR unit.katib-controller/0.juju-log Empty or missing data. Got: No data found in relation k8s-service-info data bag.
unit-katib-controller-0: 10:01:15 ERROR unit.katib-controller/0.juju-log Empty or missing data. Got: No data found in relation k8s-service-info data bag.
unit-katib-controller-0: 10:01:16 ERROR unit.katib-controller/0.juju-log Empty or missing data. Got: No data found in relation k8s-service-info data bag.
unit-katib-controller-0: 10:01:16 ERROR unit.katib-controller/0.juju-log Empty or missing data. Got: No data found in relation k8s-service-info data bag.
unit-katib-db-manager-0: 10:01:16 ERROR unit.katib-db-manager/0.juju-log Failed to handle <PebbleReadyEvent via KatibDBManagerOperator/on/katib_db_manager_pebble_ready[21]> with error: Waiting for relational-db data
unit-katib-db-manager-0: 10:01:18 ERROR unit.katib-db-manager/0.juju-log Failed to handle <ConfigChangedEvent via KatibDBManagerOperator/on/config_changed[26]> with error: Waiting for relational-db data
unit-katib-db-manager-0: 10:01:38 ERROR unit.katib-db-manager/0.juju-log relational-db:3: Failed to handle <RelationJoinedEvent via KatibDBManagerOperator/on/relational_db_relation_joined[46]> with error: Waiting for relational-db data
unit-katib-db-manager-0: 10:01:40 ERROR unit.katib-db-manager/0.juju-log relational-db:3: Failed to handle <RelationChangedEvent via KatibDBManagerOperator/on/relational_db_relation_changed[51]> with error: Waiting for relational-db data
INFO     pytest_operator.plugin:plugin.py:855 Forgetting main...
ERROR    websockets.protocol:protocol.py:881 Error in data transfer
Traceback (most recent call last):
  File "/home/runner/work/katib-operators/katib-operators/.tox/bundle-integration/lib/python3.8/site-packages/websockets/protocol.py", line 827, in transfer_data
    message = await self.read_message()
  File "/home/runner/work/katib-operators/katib-operators/.tox/bundle-integration/lib/python3.8/site-packages/websockets/protocol.py", line 895, in read_message
    frame = await self.read_data_frame(max_size=self.max_size)
  File "/home/runner/work/katib-operators/katib-operators/.tox/bundle-integration/lib/python3.8/site-packages/websockets/protocol.py", line 971, in read_data_frame
    frame = await self.read_frame(max_size)
  File "/home/runner/work/katib-operators/katib-operators/.tox/bundle-integration/lib/python3.8/site-packages/websockets/protocol.py", line 1047, in read_frame
    frame = await Frame.read(
  File "/home/runner/work/katib-operators/katib-operators/.tox/bundle-integration/lib/python3.8/site-packages/websockets/framing.py", line 105, in read
    data = await reader(2)
  File "/usr/lib/python3.8/asyncio/streams.py", line 723, in readexactly
    await self._wait_for_data('readexactly')
  File "/usr/lib/python3.8/asyncio/streams.py", line 517, in _wait_for_data
    await self._waiter
  File "/usr/lib/python3.8/asyncio/selector_events.py", line 910, in write
    n = self._sock.send(data)
OSError: [Errno 9] Bad file descriptor
tests/integration/test_katib_experiments.py::test_katib_experiments[tests/assets/crs/experiments/darts-cpu.yaml] 
-------------------------------- live log setup --------------------------------
INFO     httpx:_client.py:1013 HTTP Request: GET https://10.1.0.4:16443/apis/apiextensions.k8s.io/v1/customresourcedefinitions "HTTP/1.1 200 OK"
INFO     httpx:_client.py:1013 HTTP Request: POST https://10.1.0.4:16443/apis/kubeflow.org/v1/profiles?fieldManager=katib-operators "HTTP/1.1 201 Created"
INFO     pytest_operator.plugin:plugin.py:675 Connecting to existing model github-pr-fb181-microk8s:kubeflow on unspecified cloud
WARNING  juju.client.connection:connection.py:927 unexpected facade UserSecretsDrain received from the controller
WARNING  juju.client.connection:connection.py:927 unexpected facade UserSecretsManager received from the controller
-------------------------------- live log call ---------------------------------
INFO     httpx:_client.py:1013 HTTP Request: POST https://10.1.0.4:16443/apis/kubeflow.org/v1beta1/namespaces/test-kubeflow/experiments?fieldManager=katib-operators "HTTP/1.1 201 Created"
INFO     httpx:_client.py:1013 HTTP Request: GET https://10.1.0.4:16443/apis/kubeflow.org/v1beta1/namespaces/test-kubeflow/experiments/darts-cpu "HTTP/1.1 200 OK"
INFO     httpx:_client.py:1013 HTTP Request: GET https://10.1.0.4:16443/apis/kubeflow.org/v1beta1/namespaces/test-kubeflow/experiments/darts-cpu/status "HTTP/1.1 200 OK"
INFO     httpx:_client.py:1013 HTTP Request: GET https://10.1.0.4:16443/apis/kubeflow.org/v1beta1/namespaces/test-kubeflow/experiments/darts-cpu/status "HTTP/1.1 200 OK"
INFO     utils:utils.py:82 Experiment status is Created
INFO     httpx:_client.py:1013 HTTP Request: GET https://10.1.0.4:16443/apis/kubeflow.org/v1beta1/namespaces/test-kubeflow/experiments/darts-cpu/status "HTTP/1.1 200 OK"
INFO     utils:utils.py:82 Experiment status is Created
INFO     httpx:_client.py:1013 HTTP Request: GET https://10.1.0.4:16443/apis/kubeflow.org/v1beta1/namespaces/test-kubeflow/experiments/darts-cpu/status "HTTP/1.1 200 OK"
INFO     utils:utils.py:82 Experiment status is Created
INFO     httpx:_client.py:1013 HTTP Request: GET https://10.1.0.4:16443/apis/kubeflow.org/v1beta1/namespaces/test-kubeflow/experiments/darts-cpu/status "HTTP/1.1 200 OK"
INFO     utils:utils.py:82 Experiment status is Running
INFO     httpx:_client.py:1013 HTTP Request: DELETE https://10.1.0.4:16443/apis/kubeflow.org/v1beta1/namespaces/test-kubeflow/experiments/darts-cpu "HTTP/1.1 200 OK"
INFO     utils:utils.py:102 Waiting for Experiment darts-cpu to be deleted.
INFO     httpx:_client.py:1013 HTTP Request: GET https://10.1.0.4:16443/apis/kubeflow.org/v1beta1/namespaces/test-kubeflow/experiments/darts-cpu "HTTP/1.1 200 OK"
INFO     utils:utils.py:102 Waiting for Experiment darts-cpu to be deleted.
INFO     httpx:_client.py:1013 HTTP Request: GET https://10.1.0.4:16443/apis/kubeflow.org/v1beta1/namespaces/test-kubeflow/experiments/darts-cpu "HTTP/1.1 404 Not Found"
INFO     utils:utils.py:107 Unable to get Experiment darts-cpu (status: 404)
PASSED
tests/integration/test_katib_experiments.py::test_katib_experiments[tests/assets/crs/experiments/bayesian-optimization.yaml] 
-------------------------------- live log call ---------------------------------
INFO     httpx:_client.py:1013 HTTP Request: POST https://10.1.0.4:16443/apis/kubeflow.org/v1beta1/namespaces/test-kubeflow/experiments?fieldManager=katib-operators "HTTP/1.1 201 Created"
INFO     httpx:_client.py:1013 HTTP Request: GET https://10.1.0.4:16443/apis/kubeflow.org/v1beta1/namespaces/test-kubeflow/experiments/bayesian-optimization "HTTP/1.1 200 OK"
INFO     httpx:_client.py:1013 HTTP Request: GET https://10.1.0.4:16443/apis/kubeflow.org/v1beta1/namespaces/test-kubeflow/experiments/bayesian-optimization/status "HTTP/1.1 200 OK"
INFO     httpx:_client.py:1013 HTTP Request: GET https://10.1.0.4:16443/apis/kubeflow.org/v1beta1/namespaces/test-kubeflow/experiments/bayesian-optimization/status "HTTP/1.1 200 OK"
INFO     utils:utils.py:82 Experiment status is Created
INFO     httpx:_client.py:1013 HTTP Request: GET https://10.1.0.4:16443/apis/kubeflow.org/v1beta1/namespaces/test-kubeflow/experiments/bayesian-optimization/status "HTTP/1.1 200 OK"
INFO     utils:utils.py:82 Experiment status is Created
INFO     httpx:_client.py:1013 HTTP Request: GET https://10.1.0.4:16443/apis/kubeflow.org/v1beta1/namespaces/test-kubeflow/experiments/bayesian-optimization/status "HTTP/1.1 200 OK"
INFO     utils:utils.py:82 Experiment status is Created
INFO     httpx:_client.py:1013 HTTP Request: GET https://10.1.0.4:16443/apis/kubeflow.org/v1beta1/namespaces/test-kubeflow/experiments/bayesian-optimization/status "HTTP/1.1 200 OK"
INFO     utils:utils.py:82 Experiment status is Running
INFO     httpx:_client.py:1013 HTTP Request: DELETE https://10.1.0.4:16443/apis/kubeflow.org/v1beta1/namespaces/test-kubeflow/experiments/bayesian-optimization "HTTP/1.1 200 OK"
INFO     utils:utils.py:102 Waiting for Experiment bayesian-optimization to be deleted.
INFO     httpx:_client.py:1013 HTTP Request: GET https://10.1.0.4:16443/apis/kubeflow.org/v1beta1/namespaces/test-kubeflow/experiments/bayesian-optimization "HTTP/1.1 200 OK"
INFO     utils:utils.py:102 Waiting for Experiment bayesian-optimization to be deleted.
INFO     httpx:_client.py:1013 HTTP Request: GET https://10.1.0.4:16443/apis/kubeflow.org/v1beta1/namespaces/test-kubeflow/experiments/bayesian-optimization "HTTP/1.1 404 Not Found"
INFO     utils:utils.py:107 Unable to get Experiment bayesian-optimization (status: 404)
PASSED
tests/integration/test_katib_experiments.py::test_katib_experiments[tests/assets/crs/experiments/hyperband.yaml] 
-------------------------------- live log call ---------------------------------
INFO     httpx:_client.py:1013 HTTP Request: POST https://10.1.0.4:16443/apis/kubeflow.org/v1beta1/namespaces/test-kubeflow/experiments?fieldManager=katib-operators "HTTP/1.1 201 Created"
INFO     httpx:_client.py:1013 HTTP Request: GET https://10.1.0.4:16443/apis/kubeflow.org/v1beta1/namespaces/test-kubeflow/experiments/hyperband "HTTP/1.1 200 OK"
INFO     httpx:_client.py:1013 HTTP Request: GET https://10.1.0.4:16443/apis/kubeflow.org/v1beta1/namespaces/test-kubeflow/experiments/hyperband/status "HTTP/1.1 200 OK"
INFO     httpx:_client.py:1013 HTTP Request: GET https://10.1.0.4:16443/apis/kubeflow.org/v1beta1/namespaces/test-kubeflow/experiments/hyperband/status "HTTP/1.1 200 OK"
INFO     utils:utils.py:82 Experiment status is Created
INFO     httpx:_client.py:1013 HTTP Request: GET https://10.1.0.4:16443/apis/kubeflow.org/v1beta1/namespaces/test-kubeflow/experiments/hyperband/status "HTTP/1.1 200 OK"
INFO     utils:utils.py:82 Experiment status is Created
INFO     httpx:_client.py:1013 HTTP Request: GET https://10.1.0.4:16443/apis/kubeflow.org/v1beta1/namespaces/test-kubeflow/experiments/hyperband/status "HTTP/1.1 200 OK"
INFO     utils:utils.py:82 Experiment status is Created
INFO     httpx:_client.py:1013 HTTP Request: GET https://10.1.0.4:16443/apis/kubeflow.org/v1beta1/namespaces/test-kubeflow/experiments/hyperband/status "HTTP/1.1 200 OK"
INFO     utils:utils.py:82 Experiment status is Created
INFO     httpx:_client.py:1013 HTTP Request: GET https://10.1.0.4:16443/apis/kubeflow.org/v1beta1/namespaces/test-kubeflow/experiments/hyperband/status "HTTP/1.1 200 OK"
INFO     utils:utils.py:82 Experiment status is Running
INFO     httpx:_client.py:1013 HTTP Request: DELETE https://10.1.0.4:16443/apis/kubeflow.org/v1beta1/namespaces/test-kubeflow/experiments/hyperband "HTTP/1.1 200 OK"
INFO     utils:utils.py:102 Waiting for Experiment hyperband to be deleted.
INFO     httpx:_client.py:1013 HTTP Request: GET https://10.1.0.4:16443/apis/kubeflow.org/v1beta1/namespaces/test-kubeflow/experiments/hyperband "HTTP/1.1 200 OK"
INFO     utils:utils.py:102 Waiting for Experiment hyperband to be deleted.
INFO     httpx:_client.py:1013 HTTP Request: GET https://10.1.0.4:16443/apis/kubeflow.org/v1beta1/namespaces/test-kubeflow/experiments/hyperband "HTTP/1.1 404 Not Found"
INFO     utils:utils.py:107 Unable to get Experiment hyperband (status: 404)
PASSED
tests/integration/test_katib_experiments.py::test_katib_experiments[tests/assets/crs/experiments/median-stop.yaml] 
-------------------------------- live log call ---------------------------------
INFO     httpx:_client.py:1013 HTTP Request: POST https://10.1.0.4:16443/apis/kubeflow.org/v1beta1/namespaces/test-kubeflow/experiments?fieldManager=katib-operators "HTTP/1.1 201 Created"
INFO     httpx:_client.py:1013 HTTP Request: GET https://10.1.0.4:16443/apis/kubeflow.org/v1beta1/namespaces/test-kubeflow/experiments/median-stop "HTTP/1.1 200 OK"
INFO     httpx:_client.py:1013 HTTP Request: GET https://10.1.0.4:16443/apis/kubeflow.org/v1beta1/namespaces/test-kubeflow/experiments/median-stop/status "HTTP/1.1 200 OK"
INFO     httpx:_client.py:1013 HTTP Request: GET https://10.1.0.4:16443/apis/kubeflow.org/v1beta1/namespaces/test-kubeflow/experiments/median-stop/status "HTTP/1.1 200 OK"
INFO     utils:utils.py:82 Experiment status is Created
INFO     httpx:_client.py:1013 HTTP Request: GET https://10.1.0.4:16443/apis/kubeflow.org/v1beta1/namespaces/test-kubeflow/experiments/median-stop/status "HTTP/1.1 200 OK"
INFO     utils:utils.py:82 Experiment status is Created
INFO     httpx:_client.py:1013 HTTP Request: GET https://10.1.0.4:16443/apis/kubeflow.org/v1beta1/namespaces/test-kubeflow/experiments/median-stop/status "HTTP/1.1 200 OK"
INFO     utils:utils.py:82 Experiment status is Created
INFO     httpx:_client.py:1013 HTTP Request: GET https://10.1.0.4:16443/apis/kubeflow.org/v1beta1/namespaces/test-kubeflow/experiments/median-stop/status "HTTP/1.1 200 OK"
INFO     utils:utils.py:82 Experiment status is Running
INFO     httpx:_client.py:1013 HTTP Request: DELETE https://10.1.0.4:16443/apis/kubeflow.org/v1beta1/namespaces/test-kubeflow/experiments/median-stop "HTTP/1.1 200 OK"
INFO     utils:utils.py:102 Waiting for Experiment median-stop to be deleted.
INFO     httpx:_client.py:1013 HTTP Request: GET https://10.1.0.4:16443/apis/kubeflow.org/v1beta1/namespaces/test-kubeflow/experiments/median-stop "HTTP/1.1 200 OK"
INFO     utils:utils.py:102 Waiting for Experiment median-stop to be deleted.
INFO     httpx:_client.py:1013 HTTP Request: GET https://10.1.0.4:16443/apis/kubeflow.org/v1beta1/namespaces/test-kubeflow/experiments/median-stop "HTTP/1.1 404 Not Found"
INFO     utils:utils.py:107 Unable to get Experiment median-stop (status: 404)
PASSED
tests/integration/test_katib_experiments.py::test_katib_experiments[tests/assets/crs/experiments/tfjob-mnist-with-summaries.yaml] 
-------------------------------- live log call ---------------------------------
INFO     httpx:_client.py:1013 HTTP Request: POST https://10.1.0.4:16443/apis/kubeflow.org/v1beta1/namespaces/test-kubeflow/experiments?fieldManager=katib-operators "HTTP/1.1 201 Created"
INFO     httpx:_client.py:1013 HTTP Request: GET https://10.1.0.4:16443/apis/kubeflow.org/v1beta1/namespaces/test-kubeflow/experiments/tfjob-mnist-with-summaries "HTTP/1.1 200 OK"
INFO     httpx:_client.py:1013 HTTP Request: GET https://10.1.0.4:16443/apis/kubeflow.org/v1beta1/namespaces/test-kubeflow/experiments/tfjob-mnist-with-summaries/status "HTTP/1.1 200 OK"
INFO     httpx:_client.py:1013 HTTP Request: GET https://10.1.0.4:16443/apis/kubeflow.org/v1beta1/namespaces/test-kubeflow/experiments/tfjob-mnist-with-summaries/status "HTTP/1.1 200 OK"
INFO     utils:utils.py:82 Experiment status is Created
INFO     httpx:_client.py:1013 HTTP Request: GET https://10.1.0.4:16443/apis/kubeflow.org/v1beta1/namespaces/test-kubeflow/experiments/tfjob-mnist-with-summaries/status "HTTP/1.1 200 OK"
INFO     utils:utils.py:82 Experiment status is Created
INFO     httpx:_client.py:1013 HTTP Request: GET https://10.1.0.4:16443/apis/kubeflow.org/v1beta1/namespaces/test-kubeflow/experiments/tfjob-mnist-with-summaries/status "HTTP/1.1 200 OK"
INFO     utils:utils.py:82 Experiment status is Running
INFO     httpx:_client.py:1013 HTTP Request: DELETE https://10.1.0.4:16443/apis/kubeflow.org/v1beta1/namespaces/test-kubeflow/experiments/tfjob-mnist-with-summaries "HTTP/1.1 200 OK"
INFO     utils:utils.py:102 Waiting for Experiment tfjob-mnist-with-summaries to be deleted.
INFO     httpx:_client.py:1013 HTTP Request: GET https://10.1.0.4:16443/apis/kubeflow.org/v1beta1/namespaces/test-kubeflow/experiments/tfjob-mnist-with-summaries "HTTP/1.1 200 OK"
INFO     utils:utils.py:102 Waiting for Experiment tfjob-mnist-with-summaries to be deleted.
INFO     httpx:_client.py:1013 HTTP Request: GET https://10.1.0.4:16443/apis/kubeflow.org/v1beta1/namespaces/test-kubeflow/experiments/tfjob-mnist-with-summaries "HTTP/1.1 404 Not Found"
INFO     utils:utils.py:107 Unable to get Experiment tfjob-mnist-with-summaries (status: 404)
PASSED
tests/integration/test_katib_experiments.py::test_katib_experiments[tests/assets/crs/experiments/enas-cpu.yaml] 
-------------------------------- live log call ---------------------------------
INFO     httpx:_client.py:1013 HTTP Request: POST https://10.1.0.4:16443/apis/kubeflow.org/v1beta1/namespaces/test-kubeflow/experiments?fieldManager=katib-operators "HTTP/1.1 201 Created"
INFO     httpx:_client.py:1013 HTTP Request: GET https://10.1.0.4:16443/apis/kubeflow.org/v1beta1/namespaces/test-kubeflow/experiments/enas-cpu "HTTP/1.1 200 OK"
INFO     httpx:_client.py:1013 HTTP Request: GET https://10.1.0.4:16443/apis/kubeflow.org/v1beta1/namespaces/test-kubeflow/experiments/enas-cpu/status "HTTP/1.1 200 OK"
INFO     httpx:_client.py:1013 HTTP Request: GET https://10.1.0.4:16443/apis/kubeflow.org/v1beta1/namespaces/test-kubeflow/experiments/enas-cpu/status "HTTP/1.1 200 OK"
INFO     utils:utils.py:82 Experiment status is Created
INFO     httpx:_client.py:1013 HTTP Request: GET https://10.1.0.4:16443/apis/kubeflow.org/v1beta1/namespaces/test-kubeflow/experiments/enas-cpu/status "HTTP/1.1 200 OK"
INFO     utils:utils.py:82 Experiment status is Created
INFO     httpx:_client.py:1013 HTTP Request: GET https://10.1.0.4:16443/apis/kubeflow.org/v1beta1/namespaces/test-kubeflow/experiments/enas-cpu/status "HTTP/1.1 200 OK"
INFO     utils:utils.py:82 Experiment status is Created
INFO     httpx:_client.py:1013 HTTP Request: GET https://10.1.0.4:16443/apis/kubeflow.org/v1beta1/namespaces/test-kubeflow/experiments/enas-cpu/status "HTTP/1.1 200 OK"
INFO     utils:utils.py:82 Experiment status is Created
INFO     httpx:_client.py:1013 HTTP Request: GET https://10.1.0.4:16443/apis/kubeflow.org/v1beta1/namespaces/test-kubeflow/experiments/enas-cpu/status "HTTP/1.1 200 OK"
INFO     utils:utils.py:82 Experiment status is Created
INFO     httpx:_client.py:1013 HTTP Request: GET https://10.1.0.4:16443/apis/kubeflow.org/v1beta1/namespaces/test-kubeflow/experiments/enas-cpu/status "HTTP/1.1 200 OK"
INFO     utils:utils.py:82 Experiment status is Running
INFO     httpx:_client.py:1013 HTTP Request: DELETE https://10.1.0.4:16443/apis/kubeflow.org/v1beta1/namespaces/test-kubeflow/experiments/enas-cpu "HTTP/1.1 200 OK"
INFO     utils:utils.py:102 Waiting for Experiment enas-cpu to be deleted.
INFO     httpx:_client.py:1013 HTTP Request: GET https://10.1.0.4:16443/apis/kubeflow.org/v1beta1/namespaces/test-kubeflow/experiments/enas-cpu "HTTP/1.1 200 OK"
INFO     utils:utils.py:102 Waiting for Experiment enas-cpu to be deleted.
INFO     httpx:_client.py:1013 HTTP Request: GET https://10.1.0.4:16443/apis/kubeflow.org/v1beta1/namespaces/test-kubeflow/experiments/enas-cpu "HTTP/1.1 404 Not Found"
INFO     utils:utils.py:107 Unable to get Experiment enas-cpu (status: 404)
PASSED
tests/integration/test_katib_experiments.py::test_katib_experiments[tests/assets/crs/experiments/cmaes.yaml] 
-------------------------------- live log call ---------------------------------
INFO     httpx:_client.py:1013 HTTP Request: POST https://10.1.0.4:16443/apis/kubeflow.org/v1beta1/namespaces/test-kubeflow/experiments?fieldManager=katib-operators "HTTP/1.1 201 Created"
INFO     httpx:_client.py:1013 HTTP Request: GET https://10.1.0.4:16443/apis/kubeflow.org/v1beta1/namespaces/test-kubeflow/experiments/cmaes "HTTP/1.1 200 OK"
INFO     httpx:_client.py:1013 HTTP Request: GET https://10.1.0.4:16443/apis/kubeflow.org/v1beta1/namespaces/test-kubeflow/experiments/cmaes/status "HTTP/1.1 200 OK"
INFO     httpx:_client.py:1013 HTTP Request: GET https://10.1.0.4:16443/apis/kubeflow.org/v1beta1/namespaces/test-kubeflow/experiments/cmaes/status "HTTP/1.1 200 OK"
INFO     utils:utils.py:82 Experiment status is Created
INFO     httpx:_client.py:1013 HTTP Request: GET https://10.1.0.4:16443/apis/kubeflow.org/v1beta1/namespaces/test-kubeflow/experiments/cmaes/status "HTTP/1.1 200 OK"
INFO     utils:utils.py:82 Experiment status is Created
INFO     httpx:_client.py:1013 HTTP Request: GET https://10.1.0.4:16443/apis/kubeflow.org/v1beta1/namespaces/test-kubeflow/experiments/cmaes/status "HTTP/1.1 200 OK"
INFO     utils:utils.py:82 Experiment status is Created
INFO     httpx:_client.py:1013 HTTP Request: GET https://10.1.0.4:16443/apis/kubeflow.org/v1beta1/namespaces/test-kubeflow/experiments/cmaes/status "HTTP/1.1 200 OK"
INFO     utils:utils.py:82 Experiment status is Created
INFO     httpx:_client.py:1013 HTTP Request: GET https://10.1.0.4:16443/apis/kubeflow.org/v1beta1/namespaces/test-kubeflow/experiments/cmaes/status "HTTP/1.1 200 OK"
INFO     utils:utils.py:82 Experiment status is Created
INFO     httpx:_client.py:1013 HTTP Request: GET https://10.1.0.4:16443/apis/kubeflow.org/v1beta1/namespaces/test-kubeflow/experiments/cmaes/status "HTTP/1.1 200 OK"
INFO     utils:utils.py:82 Experiment status is Created
INFO     httpx:_client.py:1013 HTTP Request: GET https://10.1.0.4:16443/apis/kubeflow.org/v1beta1/namespaces/test-kubeflow/experiments/cmaes/status "HTTP/1.1 200 OK"
INFO     utils:utils.py:82 Experiment status is Created
INFO     httpx:_client.py:1013 HTTP Request: GET https://10.1.0.4:16443/apis/kubeflow.org/v1beta1/namespaces/test-kubeflow/experiments/cmaes/status "HTTP/1.1 200 OK"
INFO     utils:utils.py:82 Experiment status is Created
INFO     httpx:_client.py:1013 HTTP Request: GET https://10.1.0.4:16443/apis/kubeflow.org/v1beta1/namespaces/test-kubeflow/experiments/cmaes/status "HTTP/1.1 200 OK"
INFO     utils:utils.py:82 Experiment status is Created
INFO     httpx:_client.py:1013 HTTP Request: GET https://10.1.0.4:16443/apis/kubeflow.org/v1beta1/namespaces/test-kubeflow/experiments/cmaes/status "HTTP/1.1 200 OK"
INFO     utils:utils.py:82 Experiment status is Created
INFO     httpx:_client.py:1013 HTTP Request: GET https://10.1.0.4:16443/apis/kubeflow.org/v1beta1/namespaces/test-kubeflow/experiments/cmaes/status "HTTP/1.1 200 OK"
INFO     utils:utils.py:82 Experiment status is Created
INFO     httpx:_client.py:1013 HTTP Request: GET https://10.1.0.4:16443/apis/kubeflow.org/v1beta1/namespaces/test-kubeflow/experiments/cmaes/status "HTTP/1.1 200 OK"
INFO     utils:utils.py:82 Experiment status is Created
INFO     httpx:_client.py:1013 HTTP Request: GET https://10.1.0.4:16443/apis/kubeflow.org/v1beta1/namespaces/test-kubeflow/experiments/cmaes/status "HTTP/1.1 200 OK"
INFO     utils:utils.py:82 Experiment status is Created
INFO     httpx:_client.py:1013 HTTP Request: GET https://10.1.0.4:16443/apis/kubeflow.org/v1beta1/namespaces/test-kubeflow/experiments/cmaes/status "HTTP/1.1 200 OK"
INFO     utils:utils.py:82 Experiment status is Created
INFO     httpx:_client.py:1013 HTTP Request: GET https://10.1.0.4:16443/apis/kubeflow.org/v1beta1/namespaces/test-kubeflow/experiments/cmaes/status "HTTP/1.1 200 OK"
INFO     utils:utils.py:82 Experiment status is Running
INFO     httpx:_client.py:1013 HTTP Request: DELETE https://10.1.0.4:16443/apis/kubeflow.org/v1beta1/namespaces/test-kubeflow/experiments/cmaes "HTTP/1.1 200 OK"
INFO     utils:utils.py:102 Waiting for Experiment cmaes to be deleted.
INFO     httpx:_client.py:1013 HTTP Request: GET https://10.1.0.4:16443/apis/kubeflow.org/v1beta1/namespaces/test-kubeflow/experiments/cmaes "HTTP/1.1 200 OK"
INFO     utils:utils.py:102 Waiting for Experiment cmaes to be deleted.
INFO     httpx:_client.py:1013 HTTP Request: GET https://10.1.0.4:16443/apis/kubeflow.org/v1beta1/namespaces/test-kubeflow/experiments/cmaes "HTTP/1.1 404 Not Found"
INFO     utils:utils.py:107 Unable to get Experiment cmaes (status: 404)
PASSED
tests/integration/test_katib_experiments.py::test_katib_experiments[tests/assets/crs/experiments/random.yaml] 
-------------------------------- live log call ---------------------------------
WARNING  juju.client.connection:connection.py:657 RPC: Connection closed, reconnecting
INFO     httpx:_client.py:1013 HTTP Request: POST https://10.1.0.4:16443/apis/kubeflow.org/v1beta1/namespaces/test-kubeflow/experiments?fieldManager=katib-operators "HTTP/1.1 201 Created"
INFO     httpx:_client.py:1013 HTTP Request: GET https://10.1.0.4:16443/apis/kubeflow.org/v1beta1/namespaces/test-kubeflow/experiments/random "HTTP/1.1 200 OK"
INFO     httpx:_client.py:1013 HTTP Request: GET https://10.1.0.4:16443/apis/kubeflow.org/v1beta1/namespaces/test-kubeflow/experiments/random/status "HTTP/1.1 200 OK"
INFO     httpx:_client.py:1013 HTTP Request: GET https://10.1.0.4:16443/apis/kubeflow.org/v1beta1/namespaces/test-kubeflow/experiments/random/status "HTTP/1.1 200 OK"
INFO     utils:utils.py:82 Experiment status is Created
INFO     httpx:_client.py:1013 HTTP Request: GET https://10.1.0.4:16443/apis/kubeflow.org/v1beta1/namespaces/test-kubeflow/experiments/random/status "HTTP/1.1 200 OK"
INFO     utils:utils.py:82 Experiment status is Created
INFO     httpx:_client.py:1013 HTTP Request: GET https://10.1.0.4:16443/apis/kubeflow.org/v1beta1/namespaces/test-kubeflow/experiments/random/status "HTTP/1.1 200 OK"
INFO     utils:utils.py:82 Experiment status is Created
INFO     httpx:_client.py:1013 HTTP Request: GET https://10.1.0.4:16443/apis/kubeflow.org/v1beta1/namespaces/test-kubeflow/experiments/random/status "HTTP/1.1 200 OK"
INFO     utils:utils.py:82 Experiment status is Running
INFO     httpx:_client.py:1013 HTTP Request: DELETE https://10.1.0.4:16443/apis/kubeflow.org/v1beta1/namespaces/test-kubeflow/experiments/random "HTTP/1.1 200 OK"
INFO     utils:utils.py:102 Waiting for Experiment random to be deleted.
INFO     httpx:_client.py:1013 HTTP Request: GET https://10.1.0.4:16443/apis/kubeflow.org/v1beta1/namespaces/test-kubeflow/experiments/random "HTTP/1.1 200 OK"
INFO     utils:utils.py:102 Waiting for Experiment random to be deleted.
INFO     httpx:_client.py:1013 HTTP Request: GET https://10.1.0.4:16443/apis/kubeflow.org/v1beta1/namespaces/test-kubeflow/experiments/random "HTTP/1.1 404 Not Found"
INFO     utils:utils.py:107 Unable to get Experiment random (status: 404)
PASSED
tests/integration/test_katib_experiments.py::test_katib_experiments[tests/assets/crs/experiments/simple-pbt.yaml] 
-------------------------------- live log call ---------------------------------
INFO     httpx:_client.py:1013 HTTP Request: POST https://10.1.0.4:16443/apis/kubeflow.org/v1beta1/namespaces/test-kubeflow/experiments?fieldManager=katib-operators "HTTP/1.1 201 Created"
INFO     httpx:_client.py:1013 HTTP Request: GET https://10.1.0.4:16443/apis/kubeflow.org/v1beta1/namespaces/test-kubeflow/experiments/simple-pbt "HTTP/1.1 200 OK"
INFO     httpx:_client.py:1013 HTTP Request: GET https://10.1.0.4:16443/apis/kubeflow.org/v1beta1/namespaces/test-kubeflow/experiments/simple-pbt/status "HTTP/1.1 200 OK"
INFO     httpx:_client.py:1013 HTTP Request: GET https://10.1.0.4:16443/apis/kubeflow.org/v1beta1/namespaces/test-kubeflow/experiments/simple-pbt/status "HTTP/1.1 200 OK"
INFO     utils:utils.py:82 Experiment status is Created
INFO     httpx:_client.py:1013 HTTP Request: GET https://10.1.0.4:16443/apis/kubeflow.org/v1beta1/namespaces/test-kubeflow/experiments/simple-pbt/status "HTTP/1.1 200 OK"
INFO     utils:utils.py:82 Experiment status is Created
INFO     httpx:_client.py:1013 HTTP Request: GET https://10.1.0.4:16443/apis/kubeflow.org/v1beta1/namespaces/test-kubeflow/experiments/simple-pbt/status "HTTP/1.1 200 OK"
INFO     utils:utils.py:82 Experiment status is Created
INFO     httpx:_client.py:1013 HTTP Request: GET https://10.1.0.4:16443/apis/kubeflow.org/v1beta1/namespaces/test-kubeflow/experiments/simple-pbt/status "HTTP/1.1 200 OK"
INFO     utils:utils.py:82 Experiment status is Running
INFO     httpx:_client.py:1013 HTTP Request: DELETE https://10.1.0.4:16443/apis/kubeflow.org/v1beta1/namespaces/test-kubeflow/experiments/simple-pbt "HTTP/1.1 200 OK"
INFO     utils:utils.py:102 Waiting for Experiment simple-pbt to be deleted.
INFO     httpx:_client.py:1013 HTTP Request: GET https://10.1.0.4:16443/apis/kubeflow.org/v1beta1/namespaces/test-kubeflow/experiments/simple-pbt "HTTP/1.1 200 OK"
INFO     utils:utils.py:102 Waiting for Experiment simple-pbt to be deleted.
INFO     httpx:_client.py:1013 HTTP Request: GET https://10.1.0.4:16443/apis/kubeflow.org/v1beta1/namespaces/test-kubeflow/experiments/simple-pbt "HTTP/1.1 404 Not Found"
INFO     utils:utils.py:107 Unable to get Experiment simple-pbt (status: 404)
PASSED
tests/integration/test_katib_experiments.py::test_katib_experiments[tests/assets/crs/experiments/grid-example.yaml] 
-------------------------------- live log call ---------------------------------
INFO     httpx:_client.py:1013 HTTP Request: POST https://10.1.0.4:16443/apis/kubeflow.org/v1beta1/namespaces/test-kubeflow/experiments?fieldManager=katib-operators "HTTP/1.1 201 Created"
INFO     httpx:_client.py:1013 HTTP Request: GET https://10.1.0.4:16443/apis/kubeflow.org/v1beta1/namespaces/test-kubeflow/experiments/grid "HTTP/1.1 200 OK"
INFO     httpx:_client.py:1013 HTTP Request: GET https://10.1.0.4:16443/apis/kubeflow.org/v1beta1/namespaces/test-kubeflow/experiments/grid/status "HTTP/1.1 200 OK"
INFO     httpx:_client.py:1013 HTTP Request: GET https://10.1.0.4:16443/apis/kubeflow.org/v1beta1/namespaces/test-kubeflow/experiments/grid/status "HTTP/1.1 200 OK"
INFO     utils:utils.py:82 Experiment status is Created
INFO     httpx:_client.py:1013 HTTP Request: GET https://10.1.0.4:16443/apis/kubeflow.org/v1beta1/namespaces/test-kubeflow/experiments/grid/status "HTTP/1.1 200 OK"
INFO     utils:utils.py:82 Experiment status is Created
INFO     httpx:_client.py:1013 HTTP Request: GET https://10.1.0.4:16443/apis/kubeflow.org/v1beta1/namespaces/test-kubeflow/experiments/grid/status "HTTP/1.1 200 OK"
INFO     utils:utils.py:82 Experiment status is Created
INFO     httpx:_client.py:1013 HTTP Request: GET https://10.1.0.4:16443/apis/kubeflow.org/v1beta1/namespaces/test-kubeflow/experiments/grid/status "HTTP/1.1 200 OK"
INFO     utils:utils.py:82 Experiment status is Running
INFO     httpx:_client.py:1013 HTTP Request: DELETE https://10.1.0.4:16443/apis/kubeflow.org/v1beta1/namespaces/test-kubeflow/experiments/grid "HTTP/1.1 200 OK"
INFO     utils:utils.py:102 Waiting for Experiment grid to be deleted.
INFO     httpx:_client.py:1013 HTTP Request: GET https://10.1.0.4:16443/apis/kubeflow.org/v1beta1/namespaces/test-kubeflow/experiments/grid "HTTP/1.1 200 OK"
INFO     utils:utils.py:102 Waiting for Experiment grid to be deleted.
INFO     httpx:_client.py:1013 HTTP Request: GET https://10.1.0.4:16443/apis/kubeflow.org/v1beta1/namespaces/test-kubeflow/experiments/grid "HTTP/1.1 404 Not Found"
INFO     utils:utils.py:107 Unable to get Experiment grid (status: 404)
WARNING  juju.client.connection:connection.py:657 RPC: Connection closed, reconnecting
WARNING  juju.client.connection:connection.py:657 RPC: Connection closed, reconnecting
PASSED
tests/integration/test_katib_experiments.py::test_katib_experiments[tests/assets/crs/experiments/file-metrics-collector.yaml] 
-------------------------------- live log call ---------------------------------
INFO     httpx:_client.py:1013 HTTP Request: POST https://10.1.0.4:16443/apis/kubeflow.org/v1beta1/namespaces/test-kubeflow/experiments?fieldManager=katib-operators "HTTP/1.1 201 Created"
INFO     httpx:_client.py:1013 HTTP Request: GET https://10.1.0.4:16443/apis/kubeflow.org/v1beta1/namespaces/test-kubeflow/experiments/file-metrics-collector "HTTP/1.1 200 OK"
INFO     httpx:_client.py:1013 HTTP Request: GET https://10.1.0.4:16443/apis/kubeflow.org/v1beta1/namespaces/test-kubeflow/experiments/file-metrics-collector/status "HTTP/1.1 200 OK"
INFO     httpx:_client.py:1013 HTTP Request: GET https://10.1.0.4:16443/apis/kubeflow.org/v1beta1/namespaces/test-kubeflow/experiments/file-metrics-collector/status "HTTP/1.1 200 OK"
INFO     utils:utils.py:82 Experiment status is Created
INFO     httpx:_client.py:1013 HTTP Request: GET https://10.1.0.4:16443/apis/kubeflow.org/v1beta1/namespaces/test-kubeflow/experiments/file-metrics-collector/status "HTTP/1.1 200 OK"
INFO     utils:utils.py:82 Experiment status is Created
INFO     httpx:_client.py:1013 HTTP Request: GET https://10.1.0.4:16443/apis/kubeflow.org/v1beta1/namespaces/test-kubeflow/experiments/file-metrics-collector/status "HTTP/1.1 200 OK"
INFO     utils:utils.py:82 Experiment status is Running
INFO     httpx:_client.py:1013 HTTP Request: DELETE https://10.1.0.4:16443/apis/kubeflow.org/v1beta1/namespaces/test-kubeflow/experiments/file-metrics-collector "HTTP/1.1 200 OK"
INFO     utils:utils.py:102 Waiting for Experiment file-metrics-collector to be deleted.
INFO     httpx:_client.py:1013 HTTP Request: GET https://10.1.0.4:16443/apis/kubeflow.org/v1beta1/namespaces/test-kubeflow/experiments/file-metrics-collector "HTTP/1.1 200 OK"
INFO     utils:utils.py:102 Waiting for Experiment file-metrics-collector to be deleted.
INFO     httpx:_client.py:1013 HTTP Request: GET https://10.1.0.4:16443/apis/kubeflow.org/v1beta1/namespaces/test-kubeflow/experiments/file-metrics-collector "HTTP/1.1 404 Not Found"
INFO     utils:utils.py:107 Unable to get Experiment file-metrics-collector (status: 404)
PASSED
------------------------------ live log teardown -------------------------------
ERROR    juju.client.connection:connection.py:665 RPC: Automatic reconnect failed
ERROR    juju.client.connection:connection.py:665 RPC: Automatic reconnect failed
INFO     pytest_operator.plugin:plugin.py:783 Model status:
Model     Controller                Cloud/Region        Version  SLA          Timestamp
kubeflow  github-pr-fb181-microk8s  microk8s/localhost  3.4.4    unsupported  10:16:03Z
App                Version                  Status  Scale  Charm              Channel      Rev  Address         Exposed  Message
katib-controller                            active      1  katib-controller                  0  10.152.183.186  no       
katib-db           8.0.36-0ubuntu0.22.04.1  active      1  mysql-k8s          8.0/stable   153  10.152.183.35   no       
katib-db-manager                            active      1  katib-db-manager                  0  10.152.183.158  no       
katib-ui                                    active      1  katib-ui                          0  10.152.183.110  no       
kubeflow-profiles                           active      1  kubeflow-profiles  latest/edge  393  10.152.183.251  no       
Unit                  Workload  Agent  Address       Ports  Message
katib-controller/0*   active    idle   10.1.244.201         
katib-db-manager/0*   active    idle   10.1.244.202         
katib-db/0*           active    idle   10.1.244.205         Primary
katib-ui/0*           active    idle   10.1.244.203         
kubeflow-profiles/0*  active    idle   10.1.244.206         
INFO     pytest_operator.plugin:plugin.py:789 Juju error logs:
unit-katib-controller-0: 10:01:08 ERROR unit.katib-controller/0.juju-log Empty or missing data. Got: No data found in relation k8s-service-info data bag.
unit-katib-controller-0: 10:01:08 ERROR unit.katib-controller/0.juju-log Empty or missing data. Got: No data found in relation k8s-service-info data bag.
unit-katib-controller-0: 10:01:09 ERROR unit.katib-controller/0.juju-log Empty or missing data. Got: No data found in relation k8s-service-info data bag.
unit-katib-controller-0: 10:01:09 ERROR unit.katib-controller/0.juju-log Empty or missing data. Got: No data found in relation k8s-service-info data bag.
unit-katib-controller-0: 10:01:09 ERROR unit.katib-controller/0.juju-log Empty or missing data. Got: No data found in relation k8s-service-info data bag.
unit-katib-controller-0: 10:01:13 ERROR unit.katib-controller/0.juju-log Empty or missing data. Got: No data found in relation k8s-service-info data bag.
unit-katib-controller-0: 10:01:14 ERROR unit.katib-controller/0.juju-log Empty or missing data. Got: No data found in relation k8s-service-info data bag.
unit-katib-controller-0: 10:01:15 ERROR unit.katib-controller/0.juju-log Empty or missing data. Got: No data found in relation k8s-service-info data bag.
unit-katib-controller-0: 10:01:16 ERROR unit.katib-controller/0.juju-log Empty or missing data. Got: No data found in relation k8s-service-info data bag.
unit-katib-controller-0: 10:01:16 ERROR unit.katib-controller/0.juju-log Empty or missing data. Got: No data found in relation k8s-service-info data bag.
unit-katib-db-manager-0: 10:01:16 ERROR unit.katib-db-manager/0.juju-log Failed to handle <PebbleReadyEvent via KatibDBManagerOperator/on/katib_db_manager_pebble_ready[21]> with error: Waiting for relational-db data
unit-katib-db-manager-0: 10:01:18 ERROR unit.katib-db-manager/0.juju-log Failed to handle <ConfigChangedEvent via KatibDBManagerOperator/on/config_changed[26]> with error: Waiting for relational-db data
unit-katib-db-manager-0: 10:01:38 ERROR unit.katib-db-manager/0.juju-log relational-db:3: Failed to handle <RelationJoinedEvent via KatibDBManagerOperator/on/relational_db_relation_joined[46]> with error: Waiting for relational-db data
unit-katib-db-manager-0: 10:01:40 ERROR unit.katib-db-manager/0.juju-log relational-db:3: Failed to handle <RelationChangedEvent via KatibDBManagerOperator/on/relational_db_relation_changed[51]> with error: Waiting for relational-db data
WARNING  juju.client.connection:connection.py:657 RPC: Connection closed, reconnecting
ERROR    juju.client.connection:connection.py:665 RPC: Automatic reconnect failed
syncronize-issues-to-jira[bot] commented 4 months ago

Thank you for reporting your feedback!

The internal ticket has been created: https://warthogs.atlassian.net/browse/KF-5972.

This message was autogenerated

DnPlas commented 4 months ago

I was able to reproduce this issue on a local machine.

The issue seems to occur at teardown. Right after the last test case finishes, pytest-operator prints the juju status followed by `pytest_operator.plugin:plugin.py:789 Juju error logs:`. The run then hangs with no automatic recovery, so the GH runner keeps going for ~6h until it times out:

------------------------------ live log teardown -------------------------------
ERROR    juju.client.connection:connection.py:665 RPC: Automatic reconnect failed
ERROR    juju.client.connection:connection.py:665 RPC: Automatic reconnect failed
INFO     pytest_operator.plugin:plugin.py:783 Model status:
Model     Controller                Cloud/Region        Version  SLA          Timestamp
kubeflow  github-pr-fb181-microk8s  microk8s/localhost  3.4.4    unsupported  10:16:03Z
App                Version                  Status  Scale  Charm              Channel      Rev  Address         Exposed  Message
katib-controller                            active      1  katib-controller                  0  10.152.183.186  no       
katib-db           8.0.36-0ubuntu0.22.04.1  active      1  mysql-k8s          8.0/stable   153  10.152.183.35   no       
katib-db-manager                            active      1  katib-db-manager                  0  10.152.183.158  no       
katib-ui                                    active      1  katib-ui                          0  10.152.183.110  no       
kubeflow-profiles                           active      1  kubeflow-profiles  latest/edge  393  10.152.183.251  no       
Unit                  Workload  Agent  Address       Ports  Message
katib-controller/0*   active    idle   10.1.244.201         
katib-db-manager/0*   active    idle   10.1.244.202         
katib-db/0*           active    idle   10.1.244.205         Primary
katib-ui/0*           active    idle   10.1.244.203         
kubeflow-profiles/0*  active    idle   10.1.244.206         
INFO     pytest_operator.plugin:plugin.py:789 Juju error logs:
unit-katib-controller-0: 10:01:08 ERROR unit.katib-controller/0.juju-log Empty or missing data. Got: No data found in relation k8s-service-info data bag.
unit-katib-controller-0: 10:01:08 ERROR unit.katib-controller/0.juju-log Empty or missing data. Got: No data found in relation k8s-service-info data bag.
unit-katib-controller-0: 10:01:09 ERROR unit.katib-controller/0.juju-log Empty or missing data. Got: No data found in relation k8s-service-info data bag.
unit-katib-controller-0: 10:01:09 ERROR unit.katib-controller/0.juju-log Empty or missing data. Got: No data found in relation k8s-service-info data bag.
unit-katib-controller-0: 10:01:09 ERROR unit.katib-controller/0.juju-log Empty or missing data. Got: No data found in relation k8s-service-info data bag.
unit-katib-controller-0: 10:01:13 ERROR unit.katib-controller/0.juju-log Empty or missing data. Got: No data found in relation k8s-service-info data bag.
unit-katib-controller-0: 10:01:14 ERROR unit.katib-controller/0.juju-log Empty or missing data. Got: No data found in relation k8s-service-info data bag.
unit-katib-controller-0: 10:01:15 ERROR unit.katib-controller/0.juju-log Empty or missing data. Got: No data found in relation k8s-service-info data bag.
unit-katib-controller-0: 10:01:16 ERROR unit.katib-controller/0.juju-log Empty or missing data. Got: No data found in relation k8s-service-info data bag.
unit-katib-controller-0: 10:01:16 ERROR unit.katib-controller/0.juju-log Empty or missing data. Got: No data found in relation k8s-service-info data bag.
unit-katib-db-manager-0: 10:01:16 ERROR unit.katib-db-manager/0.juju-log Failed to handle <PebbleReadyEvent via KatibDBManagerOperator/on/katib_db_manager_pebble_ready[21]> with error: Waiting for relational-db data
unit-katib-db-manager-0: 10:01:18 ERROR unit.katib-db-manager/0.juju-log Failed to handle <ConfigChangedEvent via KatibDBManagerOperator/on/config_changed[26]> with error: Waiting for relational-db data
unit-katib-db-manager-0: 10:01:38 ERROR unit.katib-db-manager/0.juju-log relational-db:3: Failed to handle <RelationJoinedEvent via KatibDBManagerOperator/on/relational_db_relation_joined[46]> with error: Waiting for relational-db data
unit-katib-db-manager-0: 10:01:40 ERROR unit.katib-db-manager/0.juju-log relational-db:3: Failed to handle <RelationChangedEvent via KatibDBManagerOperator/on/relational_db_relation_changed[51]> with error: Waiting for relational-db data
WARNING  juju.client.connection:connection.py:657 RPC: Connection closed, reconnecting
ERROR    juju.client.connection:connection.py:665 RPC: Automatic reconnect failed
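
Independently of the root cause, a hang like the one above could be made to fail fast instead of consuming the full ~6h runner limit. The following is only a minimal sketch, assuming the test command is launched from a small Python wrapper. The helper name `run_with_timeout`, the exit code 124, and the stand-in command are illustrative assumptions and are not taken from this repository's CI.

```python
import subprocess
import sys
from typing import Sequence

# Assumption: reuse the exit code that coreutils `timeout` reports on expiry.
TIMEOUT_EXIT_CODE = 124


def run_with_timeout(cmd: Sequence[str], timeout_s: float) -> int:
    """Run cmd and return its exit code, or TIMEOUT_EXIT_CODE if it overruns."""
    try:
        # subprocess.run kills the direct child process when the timeout expires.
        return subprocess.run(list(cmd), timeout=timeout_s).returncode
    except subprocess.TimeoutExpired:
        print(f"command exceeded {timeout_s}s and was killed", file=sys.stderr)
        return TIMEOUT_EXIT_CODE


if __name__ == "__main__":
    # Stand-in for the real test command: a child that does not finish in time.
    hanging = [sys.executable, "-c", "import time; time.sleep(30)"]
    print(run_with_timeout(hanging, timeout_s=1.0))
```

Note that `subprocess.run` only kills the direct child, so processes spawned further down (for example by a test runner) may need separate handling. A guard like this would only surface the hang sooner and would not fix it.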

@rgildein suggested this could be an incompatibility between juju 3.4 (3.4.4) and python-libjuju (3.2.2, currently used in our tests). After bumping the python-libjuju version, I was able to run the tests to completion:

error: Please add required database relation: eg. relational-db
unit-katib-db-manager-0: 00:57:34 ERROR unit.katib-db-manager/0.juju-log Failed to handle <ConfigChangedEvent via KatibDBManagerOperator/on/config_changed[26]> with error: Please add required database relation: eg. relational-db
unit-katib-db-manager-0: 00:57:53 ERROR unit.katib-db-manager/0.juju-log mysql:93: Failed to handle <RelationJoinedEvent via KatibDBManagerOperator/on/mysql_relation_joined[46]> with error: Please add required database relation: eg. relational-db
unit-katib-db-manager-0: 00:57:54 ERROR unit.katib-db-manager/0.juju-log mysql:93: Failed to handle <RelationChangedEvent via KatibDBManagerOperator/on/mysql_relation_changed[51]> with error: Please add required database relation: eg. relational-db
unit-katib-db-manager-0: 00:57:55 ERROR unit.katib-db-manager/0.juju-log mysql:93: Missing attribute 'root_password' in mysql relation data
unit-katib-db-manager-0: 00:57:55 ERROR unit.katib-db-manager/0.juju-log mysql:93: Failed to handle <RelationChangedEvent via KatibDBManagerOperator/on/mysql_relation_changed[56]> with error: Incorrect/incomplete data found in relation mysql. See logs

WARNING  juju.client.connection:connection.py:654 RPC: Connection closed, reconnecting
ERROR    juju.client.connection:connection.py:662 RPC: Automatic reconnect failed
INFO     pytest_operator.plugin:plugin.py:947 Forgetting model main...
INFO     httpx:_client.py:1013 HTTP Request: DELETE https://172.31.15.25:16443/apis/kubeflow.org/v1/profiles/test-kubeflow "HTTP/1.1 200 OK"

================================================================================ warnings summary ================================================================================
tests/integration/test_katib_experiments.py::test_katib_experiments[tests/assets/crs/experiments/grid-example.yaml]
tests/integration/test_katib_experiments.py::test_katib_experiments[tests/assets/crs/experiments/grid-example.yaml]
tests/integration/test_katib_experiments.py::test_katib_experiments[tests/assets/crs/experiments/grid-example.yaml]
  /home/ubuntu/katib-operators/.tox/bundle-integration/lib/python3.10/site-packages/juju/client/connection.py:659: DeprecationWarning: The explicit passing of coroutine objects to asyncio.wait() is deprecated since Python 3.8, and scheduled for removal in Python 3.11.
    await jasyncio.wait([self.reconnect()])

tests/integration/test_katib_experiments.py::test_katib_experiments[tests/assets/crs/experiments/grid-example.yaml]
  /home/ubuntu/katib-operators/.tox/bundle-integration/lib/python3.10/site-packages/_pytest/runner.py:140: RuntimeWarning: coroutine 'training_operator' was never awaited
    item.funcargs = None  # type: ignore[attr-defined]
  Enable tracemalloc to get traceback where the object was allocated.
  See https://docs.pytest.org/en/stable/how-to/capture-warnings.html#resource-warnings for more info.

-- Docs: https://docs.pytest.org/en/stable/how-to/capture-warnings.html
=================================================================== 12 passed, 4 warnings in 596.21s (0:09:56) ===================================================================
  bundle-integration: OK (610.19=setup[13.07]+cmd[597.12] seconds)
  congratulations :) (610.25 seconds)

#210 should fix this issue.
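
The mismatch suggested in this thread (python-libjuju 3.2.2 against a juju 3.4.4 controller) could be caught early with a small pre-flight check. This is a rough sketch under the assumption that the client and controller should share the same major.minor series. The helper names are hypothetical and not part of python-libjuju or pytest-operator.

```python
import re
from typing import Tuple


def major_minor(version: str) -> Tuple[int, int]:
    """Extract (major, minor) from a version string such as '3.4.4'."""
    match = re.match(r"(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError(f"unrecognised version string: {version!r}")
    return int(match.group(1)), int(match.group(2))


def client_matches_controller(client_version: str, controller_version: str) -> bool:
    """Return True when both versions are in the same major.minor series.

    Assumption: matching major.minor is the compatibility criterion.
    """
    return major_minor(client_version) == major_minor(controller_version)


if __name__ == "__main__":
    # Versions taken from the logs in this issue.
    print(client_matches_controller("3.2.2", "3.4.4"))
```

In a test environment the installed client version could be read with `importlib.metadata.version("juju")`, and the controller version taken from the `Version` column of the model status shown above.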