canonical / oidc-gatekeeper-operator

Charmed OIDC Gatekeeper
Apache License 2.0

oidc-gatekeeper-operator charm error after upgrade from kubeflow 1.7 to kubeflow 1.8 #139

Closed dnegreira closed 6 months ago

dnegreira commented 6 months ago

Bug Description

When following the guide for migrating from Kubeflow 1.7 to 1.8, the migration completes, but the oidc-gatekeeper-operator charm leaves its unit in an error state.

To Reproduce

Follow the guide on a microk8s deployment and run a migration from juju 2.9 to 3.4.

Environment

Juju/Kubeflow on microk8s.

Relevant Log Output

ubuntu@kubeflow17:~$ juju debug-log --include oidc-gatekeeper/0
unit-oidc-gatekeeper-0: 10:30:09 INFO juju.worker.uniter awaiting error resolution for "relation-changed" hook
unit-oidc-gatekeeper-0: 10:33:49 INFO juju.worker.uniter awaiting error resolution for "relation-changed" hook
unit-oidc-gatekeeper-0: 10:33:49 ERROR unit.oidc-gatekeeper/0.juju-log ingress:15: Uncaught exception while in charm code:
Traceback (most recent call last):
  File "/var/lib/juju/agents/unit-oidc-gatekeeper-0/charm/venv/ops/pebble.py", line 1593, in _request_raw
    response = self.opener.open(request, timeout=self.timeout)
  File "/usr/lib/python3.8/urllib/request.py", line 531, in open
    response = meth(req, response)
  File "/usr/lib/python3.8/urllib/request.py", line 640, in http_response
    response = self.parent.error(
  File "/usr/lib/python3.8/urllib/request.py", line 569, in error
    return self._call_chain(*args)
  File "/usr/lib/python3.8/urllib/request.py", line 502, in _call_chain
    result = func(*args)
  File "/usr/lib/python3.8/urllib/request.py", line 649, in http_error_default
    raise HTTPError(req.full_url, code, msg, hdrs, fp)
urllib.error.HTTPError: HTTP Error 500: Internal Server Error

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "./src/charm.py", line 208, in <module>
    main(OIDCGatekeeperOperator)
  File "/var/lib/juju/agents/unit-oidc-gatekeeper-0/charm/venv/ops/main.py", line 441, in main
    _emit_charm_event(charm, dispatcher.event_name)
  File "/var/lib/juju/agents/unit-oidc-gatekeeper-0/charm/venv/ops/main.py", line 149, in _emit_charm_event
    event_to_emit.emit(*args, **kwargs)
  File "/var/lib/juju/agents/unit-oidc-gatekeeper-0/charm/venv/ops/framework.py", line 344, in emit
    framework._emit(event)
  File "/var/lib/juju/agents/unit-oidc-gatekeeper-0/charm/venv/ops/framework.py", line 841, in _emit
    self._reemit(event_path)
  File "/var/lib/juju/agents/unit-oidc-gatekeeper-0/charm/venv/ops/framework.py", line 930, in _reemit
    custom_handler(event)
  File "./src/charm.py", line 64, in main
    update_layer(self._container_name, self._container, self._oidc_layer, self.logger)
  File "/var/lib/juju/agents/unit-oidc-gatekeeper-0/charm/venv/charmed_kubeflow_chisme/pebble/_update_layer.py", line 25, in update_layer
    current_layer = container.get_plan()
  File "/var/lib/juju/agents/unit-oidc-gatekeeper-0/charm/venv/ops/model.py", line 1972, in get_plan
    return self._pebble.get_plan()
  File "/var/lib/juju/agents/unit-oidc-gatekeeper-0/charm/venv/ops/pebble.py", line 1881, in get_plan
    resp = self._request('GET', '/v1/plan', {'format': 'yaml'})
  File "/var/lib/juju/agents/unit-oidc-gatekeeper-0/charm/venv/ops/pebble.py", line 1560, in _request
    response = self._request_raw(method, path, query, headers, data)
  File "/var/lib/juju/agents/unit-oidc-gatekeeper-0/charm/venv/ops/pebble.py", line 1604, in _request_raw
    raise APIError(body, code, status, message)
ops.pebble.APIError: cannot parse layer "oidc-authservice": yaml: unmarshal errors:
  line 10: field working-dir not found in type plan.Service
unit-oidc-gatekeeper-0: 10:33:49 ERROR juju.worker.uniter.operation hook "ingress-relation-changed" (via hook dispatching script: dispatch) failed: exit status 1
unit-oidc-gatekeeper-0: 10:33:49 INFO juju.worker.uniter awaiting error resolution for "relation-changed" hook
unit-oidc-gatekeeper-0: 10:36:00 INFO juju.worker.uniter awaiting error resolution for "relation-changed" hook
unit-oidc-gatekeeper-0: 10:38:49 INFO juju.worker.uniter awaiting error resolution for "relation-changed" hook
unit-oidc-gatekeeper-0: 10:38:50 ERROR unit.oidc-gatekeeper/0.juju-log ingress:15: Uncaught exception while in charm code:
Traceback (most recent call last):
  File "/var/lib/juju/agents/unit-oidc-gatekeeper-0/charm/venv/ops/pebble.py", line 1593, in _request_raw
    response = self.opener.open(request, timeout=self.timeout)
  File "/usr/lib/python3.8/urllib/request.py", line 531, in open
    response = meth(req, response)
  File "/usr/lib/python3.8/urllib/request.py", line 640, in http_response
    response = self.parent.error(
  File "/usr/lib/python3.8/urllib/request.py", line 569, in error
    return self._call_chain(*args)
  File "/usr/lib/python3.8/urllib/request.py", line 502, in _call_chain
    result = func(*args)
  File "/usr/lib/python3.8/urllib/request.py", line 649, in http_error_default
    raise HTTPError(req.full_url, code, msg, hdrs, fp)
urllib.error.HTTPError: HTTP Error 500: Internal Server Error

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "./src/charm.py", line 208, in <module>
    main(OIDCGatekeeperOperator)
  File "/var/lib/juju/agents/unit-oidc-gatekeeper-0/charm/venv/ops/main.py", line 441, in main
    _emit_charm_event(charm, dispatcher.event_name)
  File "/var/lib/juju/agents/unit-oidc-gatekeeper-0/charm/venv/ops/main.py", line 149, in _emit_charm_event
    event_to_emit.emit(*args, **kwargs)
  File "/var/lib/juju/agents/unit-oidc-gatekeeper-0/charm/venv/ops/framework.py", line 344, in emit
    framework._emit(event)
  File "/var/lib/juju/agents/unit-oidc-gatekeeper-0/charm/venv/ops/framework.py", line 841, in _emit
    self._reemit(event_path)
  File "/var/lib/juju/agents/unit-oidc-gatekeeper-0/charm/venv/ops/framework.py", line 930, in _reemit
    custom_handler(event)
  File "./src/charm.py", line 64, in main
    update_layer(self._container_name, self._container, self._oidc_layer, self.logger)
  File "/var/lib/juju/agents/unit-oidc-gatekeeper-0/charm/venv/charmed_kubeflow_chisme/pebble/_update_layer.py", line 25, in update_layer
    current_layer = container.get_plan()
  File "/var/lib/juju/agents/unit-oidc-gatekeeper-0/charm/venv/ops/model.py", line 1972, in get_plan
    return self._pebble.get_plan()
  File "/var/lib/juju/agents/unit-oidc-gatekeeper-0/charm/venv/ops/pebble.py", line 1881, in get_plan
    resp = self._request('GET', '/v1/plan', {'format': 'yaml'})
  File "/var/lib/juju/agents/unit-oidc-gatekeeper-0/charm/venv/ops/pebble.py", line 1560, in _request
    response = self._request_raw(method, path, query, headers, data)
  File "/var/lib/juju/agents/unit-oidc-gatekeeper-0/charm/venv/ops/pebble.py", line 1604, in _request_raw
    raise APIError(body, code, status, message)
ops.pebble.APIError: cannot parse layer "oidc-authservice": yaml: unmarshal errors:
  line 10: field working-dir not found in type plan.Service
unit-oidc-gatekeeper-0: 10:38:50 ERROR juju.worker.uniter.operation hook "ingress-relation-changed" (via hook dispatching script: dispatch) failed: exit status 1
unit-oidc-gatekeeper-0: 10:38:50 INFO juju.worker.uniter awaiting error resolution for "relation-changed" hook

Additional Context

No response

syncronize-issues-to-jira[bot] commented 6 months ago

Thank you for reporting your feedback!

The internal ticket has been created: https://warthogs.atlassian.net/browse/KF-5457.

This message was autogenerated

DnPlas commented 6 months ago

Thanks for reporting this issue @dnegreira !

I tried reproducing it by deploying CKF 1.7 and upgrading to 1.8 following the upgrade guide. Did you take any interim step when upgrading oidc? Could you also share the output of juju status oidc-gatekeeper? Are you upgrading to 1.8/stable?

dnegreira commented 6 months ago

Hi @DnPlas

juju status:

ubuntu@kubeflow17:~$ juju status oidc-gatekeeper
Model     Controller  Cloud/Region      Version  SLA          Timestamp
kubeflow  uk8s34      my-k8s/localhost  2.9.46   unsupported  13:27:46Z

App              Version                Status   Scale  Charm            Channel         Rev  Address         Exposed  Message
oidc-gatekeeper  res:oci-image@7aae6d7  waiting      1  oidc-gatekeeper  ckf-1.8/stable  350  10.152.183.192  no       waiting for units to settle down

Unit                Workload  Agent  Address     Ports  Message
oidc-gatekeeper/0*  error     idle   10.1.79.14         hook failed: "ingress-relation-changed"

The only interim steps I took are the ones in the guide: https://charmed-kubeflow.io/docs/upgrade-17-18#heading--upgrade-podspec-to-sidecar-charms

Are you upgrading to 1.8/stable?

Yes.

DnPlas commented 6 months ago

@dnegreira It seems the Juju version you are using is still 2.9.x, which may ship a Pebble version that does not yet support the working-dir field in the Pebble layer. Could you please try migrating the agent version to 3.x and then retrying the refresh?

You could use this migration guide
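
For anyone hitting a similar trace, the Pebble error text itself names the field the running Pebble does not recognise. A minimal sketch of pulling that out of the message, assuming the message keeps the shape shown in the log above; the helper name `parse_unknown_field` and the `UnknownField` type are made up for illustration and are not part of ops or Pebble:

```python
import re
from typing import NamedTuple, Optional


class UnknownField(NamedTuple):
    """Pieces of a Pebble 'field ... not found' error (hypothetical helper type)."""
    layer: str
    field: str
    go_type: str


# Pattern based only on the message shape seen in this thread's log output.
_PATTERN = re.compile(
    r'cannot parse layer "(?P<layer>[^"]+)".*?'
    r"field (?P<field>\S+) not found in type (?P<go_type>\S+)",
    re.DOTALL,
)


def parse_unknown_field(message: str) -> Optional[UnknownField]:
    """Return the layer, field and type named in the error, or None if it does not match."""
    match = _PATTERN.search(message)
    if match is None:
        return None
    return UnknownField(match["layer"], match["field"], match["go_type"])


# The error text copied from the log in this issue.
message = (
    'cannot parse layer "oidc-authservice": yaml: unmarshal errors:\n'
    "  line 10: field working-dir not found in type plan.Service"
)
print(parse_unknown_field(message))
```

An unknown-field error like this suggests the layer was written for a newer Pebble than the one serving the request, which is why the agent version is the first thing to check.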

dnegreira commented 6 months ago

Hi @DnPlas

I had missed the model-upgrade step in the migration guide, which upgrades the model to the version of the new controller (already on 3.1.7). The issue is now solved.
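
The mismatch here was a model still at 2.9.46 under a controller at 3.1.7. A rough sketch of checking for that condition, assuming plain dotted numeric versions like the ones in this thread; the function names are illustrative only, and the version strings would have to be read from juju status by hand or by other means:

```python
from typing import Tuple


def parse_version(version: str) -> Tuple[int, ...]:
    """Turn a dotted version such as '2.9.46' into a tuple of ints for comparison."""
    return tuple(int(part) for part in version.split("."))


def model_lags_controller(model_version: str, controller_version: str) -> bool:
    """Return True when the model version is older than the controller version."""
    return parse_version(model_version) < parse_version(controller_version)


# Versions reported in this thread: model 2.9.46, controller 3.1.7.
print(model_lags_controller("2.9.46", "3.1.7"))
```

For the versions in this thread the check reports that the model lags the controller. Pre-release or otherwise non-numeric version strings are not handled by this sketch.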

DnPlas commented 6 months ago

Closing this issue as the error is related to an incompatible juju controller. Feel free to re-open if you run into the same issue.