canonical / dex-auth-operator

Operator for Dex Auth
Apache License 2.0

Dex-auth units in error state due to oidc-client relation broken #53

Closed lferran closed 2 years ago

lferran commented 2 years ago

Summary

After installing the kubeflow-lite bundle on MicroK8s, dex-auth units end up in an error state due to a broken relation with oidc-client.

dex-auth units show the following traceback

Reproduction Steps

Install MicroK8s following this guide. Then deploy the kubeflow bundle following this guide.

Versions:

snap list | grep -E '(juju|microk8s)'
juju               2.9.28                      18717  latest/stable    canonical*        classic
juju-crashdump     1.0.2+git100.fed9b56        258    latest/stable    jason-hobbs       classic
juju-kubectl       0.1.0                       15     latest/stable    kennethkoski      classic
juju-wait          2.8.4~2.8.4                 96     latest/stable    stub              classic
microk8s           v1.21.11                    3058   1.21/stable      canonical*        classic

Host OS is:

lsb_release -a                       
No LSB modules are available.
Distributor ID: Ubuntu
Description:    Ubuntu 20.04.4 LTS
Release:    20.04
Codename:   focal

Introspection reports

MicroK8s inspection report, Juju crashdump

mateoflorido commented 2 years ago

I had the same problem. However, the steps to replicate it are different in my case.

Reproduction Steps:

DnPlas commented 2 years ago

Hi @mateoflorido, could you please share the logs of the dex-auth pod and the juju debug-log output? Also, do you see an error related to a missing relation, or is it something different?

dsupru commented 2 years ago

I am experiencing the same issue as @mateoflorido. I run Ubuntu in Vagrant (with a VirtualBox provider). The cluster comes up fine on the first boot, but if I reboot the VM I get this dex-auth issue.

@DnPlas, please find more information below.

Output of microk8s kubectl logs dex-auth-operator-0 -n kubeflow:

2022-05-26 17:08:07 ERROR juju-log oidc-client:2: Uncaught exception while in charm code:
Traceback (most recent call last):
  File "/var/lib/juju/agents/unit-dex-auth-5/charm/venv/ops/model.py", line 1284, in _run
    result = run(args, **kwargs)
  File "/usr/lib/python3.8/subprocess.py", line 516, in run
    raise CalledProcessError(retcode, process.args,
subprocess.CalledProcessError: Command '('/var/lib/juju/tools/unit-dex-auth-5/relation-get', '-r', '2', '-', '', '--app', '--format=json')' returned non-zero exit status 1.

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "./src/charm.py", line 209, in <module>
    main(Operator)
  File "/var/lib/juju/agents/unit-dex-auth-5/charm/venv/ops/main.py", line 394, in main
    charm = charm_class(framework)
  File "./src/charm.py", line 44, in __init__
    self.interfaces = get_interfaces(self)
  File "/var/lib/juju/agents/unit-dex-auth-5/charm/venv/serialized_data_interface/__init__.py", line 263, in get_interfaces
    requires = {
  File "/var/lib/juju/agents/unit-dex-auth-5/charm/venv/serialized_data_interface/__init__.py", line 264, in <dictcomp>
    name: SerializedDataInterface(
  File "/var/lib/juju/agents/unit-dex-auth-5/charm/venv/serialized_data_interface/__init__.py", line 110, in __init__
    others = {
  File "/var/lib/juju/agents/unit-dex-auth-5/charm/venv/serialized_data_interface/__init__.py", line 111, in <dictcomp>
    app.name: bag.get("_supported_versions")
  File "/usr/lib/python3.8/_collections_abc.py", line 660, in get
    return self[key]
  File "/var/lib/juju/agents/unit-dex-auth-5/charm/venv/ops/model.py", line 400, in __getitem__
    return self._data[key]
  File "/var/lib/juju/agents/unit-dex-auth-5/charm/venv/ops/model.py", line 384, in _data
    data = self._lazy_data = self._load()
  File "/var/lib/juju/agents/unit-dex-auth-5/charm/venv/ops/model.py", line 748, in _load
    return self._backend.relation_get(self.relation.id, self._entity.name, self._is_app)
  File "/var/lib/juju/agents/unit-dex-auth-5/charm/venv/ops/model.py", line 1351, in relation_get
    return self._run(*args, return_output=True, use_json=True)
  File "/var/lib/juju/agents/unit-dex-auth-5/charm/venv/ops/model.py", line 1286, in _run
    raise ModelError(e.stderr)
ops.model.ModelError: b'ERROR "" is not a valid unit or application\n'
2022-05-26 17:08:07 ERROR juju.worker.caasoperator.uniter.dex-auth/5.operation runhook.go:146 hook "oidc-client-relation-broken" (via hook dispatching script: dispatch) failed: exit status 1
2022-05-26 17:08:07 INFO juju.worker.caasoperator.uniter.dex-auth/5 resolver.go:150 awaiting error resolution for "relation-broken" hook 
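
For readers following the traceback: relation-get is invoked with an empty string where the unit or application name should be, and Juju rejects it with 'ERROR "" is not a valid unit or application'. The sketch below only illustrates that failure mode and one possible defensive pattern (skipping entities with an empty name). It is not the charm's code and not necessarily what the eventual fix does. `ModelError`, `FakeApp`, `fake_relation_get`, `supported_versions`, the name "remote-app" and the value "- v1" are all hypothetical stand-ins so that the example runs without ops or Juju installed.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, Optional


class ModelError(Exception):
    """Stand-in for ops.model.ModelError (assumption: keeps the sketch free of ops)."""


@dataclass(frozen=True)
class FakeApp:
    """Hypothetical stand-in for the remote application object on a relation."""
    name: str


def fake_relation_get(relation_id: int, entity_name: str) -> Dict[str, str]:
    """Mimic the behaviour seen in the log: an empty entity name is rejected."""
    if not entity_name:
        raise ModelError('ERROR "" is not a valid unit or application')
    # Made-up relation data, for illustration only.
    return {"_supported_versions": "- v1"}


def supported_versions(
    relation_id: int, apps: Iterable[FakeApp]
) -> Dict[str, Optional[str]]:
    """Collect _supported_versions per app, skipping apps with an empty name."""
    versions: Dict[str, Optional[str]] = {}
    for app in apps:
        if not app.name:
            # The remote side has no usable name (as in the failing hook), so skip it
            # instead of asking for its relation data.
            continue
        bag = fake_relation_get(relation_id, app.name)
        versions[app.name] = bag.get("_supported_versions")
    return versions


if __name__ == "__main__":
    print(supported_versions(2, [FakeApp(""), FakeApp("remote-app")]))
```

Without the empty-name check, the first entry would raise `ModelError` the same way the hook does in the log above.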

The dex-auth pod seems to be running fine. Output from microk8s kubectl logs dex-auth-69474d4bc-kztv4 -n kubeflow:

time="2022-05-26T16:42:39Z" level=info msg="config issuer: http://10.64.140.43.nip.io/dex"
time="2022-05-26T16:42:39Z" level=info msg="kubernetes client apiVersion = dex.coreos.com/v1"
time="2022-05-26T16:42:39Z" level=info msg="creating custom Kubernetes resources"
time="2022-05-26T16:42:39Z" level=info msg="checking if custom resource authcodes.dex.coreos.com has been created already..."
time="2022-05-26T16:42:39Z" level=info msg="The custom resource authcodes.dex.coreos.com already available, skipping create"
time="2022-05-26T16:42:39Z" level=info msg="checking if custom resource authrequests.dex.coreos.com has been created already..."
time="2022-05-26T16:42:39Z" level=info msg="The custom resource authrequests.dex.coreos.com already available, skipping create"
time="2022-05-26T16:42:39Z" level=info msg="checking if custom resource oauth2clients.dex.coreos.com has been created already..."
time="2022-05-26T16:42:39Z" level=info msg="The custom resource oauth2clients.dex.coreos.com already available, skipping create"
time="2022-05-26T16:42:39Z" level=info msg="checking if custom resource signingkeies.dex.coreos.com has been created already..."
time="2022-05-26T16:42:39Z" level=info msg="The custom resource signingkeies.dex.coreos.com already available, skipping create"
time="2022-05-26T16:42:39Z" level=info msg="checking if custom resource refreshtokens.dex.coreos.com has been created already..."
time="2022-05-26T16:42:39Z" level=info msg="The custom resource refreshtokens.dex.coreos.com already available, skipping create"
time="2022-05-26T16:42:39Z" level=info msg="checking if custom resource passwords.dex.coreos.com has been created already..."
time="2022-05-26T16:42:39Z" level=info msg="The custom resource passwords.dex.coreos.com already available, skipping create"
time="2022-05-26T16:42:39Z" level=info msg="checking if custom resource offlinesessionses.dex.coreos.com has been created already..."
time="2022-05-26T16:42:39Z" level=info msg="The custom resource offlinesessionses.dex.coreos.com already available, skipping create"
time="2022-05-26T16:42:39Z" level=info msg="checking if custom resource connectors.dex.coreos.com has been created already..."
time="2022-05-26T16:42:39Z" level=info msg="The custom resource connectors.dex.coreos.com already available, skipping create"
time="2022-05-26T16:42:39Z" level=info msg="checking if custom resource devicerequests.dex.coreos.com has been created already..."
time="2022-05-26T16:42:39Z" level=info msg="The custom resource devicerequests.dex.coreos.com already available, skipping create"
time="2022-05-26T16:42:39Z" level=info msg="checking if custom resource devicetokens.dex.coreos.com has been created already..."
time="2022-05-26T16:42:39Z" level=info msg="The custom resource devicetokens.dex.coreos.com already available, skipping create"
time="2022-05-26T16:42:39Z" level=info msg="config storage: kubernetes"
time="2022-05-26T16:42:39Z" level=info msg="config static client: Ambassador Auth OIDC"
time="2022-05-26T16:42:39Z" level=info msg="config connector: local passwords enabled"
time="2022-05-26T16:42:39Z" level=info msg="config skipping approval screen"
time="2022-05-26T16:42:39Z" level=info msg="listening (http) on 0.0.0.0:5556"

And juju debug-log:

application-dex-auth: 17:18:08 INFO juju.worker.caasoperator.uniter.dex-auth/5 awaiting error resolution for "relation-broken" hook
application-dex-auth: 17:18:09 ERROR unit.dex-auth/5.juju-log oidc-client:2: Uncaught exception while in charm code:
Traceback (most recent call last):
  File "/var/lib/juju/agents/unit-dex-auth-5/charm/venv/ops/model.py", line 1284, in _run
    result = run(args, **kwargs)
  File "/usr/lib/python3.8/subprocess.py", line 516, in run
    raise CalledProcessError(retcode, process.args,
subprocess.CalledProcessError: Command '('/var/lib/juju/tools/unit-dex-auth-5/relation-get', '-r', '2', '-', '', '--app', '--format=json')' returned non-zero exit status 1.

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "./src/charm.py", line 209, in <module>
    main(Operator)
  File "/var/lib/juju/agents/unit-dex-auth-5/charm/venv/ops/main.py", line 394, in main
    charm = charm_class(framework)
  File "./src/charm.py", line 44, in __init__
    self.interfaces = get_interfaces(self)
  File "/var/lib/juju/agents/unit-dex-auth-5/charm/venv/serialized_data_interface/__init__.py", line 263, in get_interfaces
    requires = {
  File "/var/lib/juju/agents/unit-dex-auth-5/charm/venv/serialized_data_interface/__init__.py", line 264, in <dictcomp>
    name: SerializedDataInterface(
  File "/var/lib/juju/agents/unit-dex-auth-5/charm/venv/serialized_data_interface/__init__.py", line 110, in __init__
    others = {
  File "/var/lib/juju/agents/unit-dex-auth-5/charm/venv/serialized_data_interface/__init__.py", line 111, in <dictcomp>
    app.name: bag.get("_supported_versions")
  File "/usr/lib/python3.8/_collections_abc.py", line 660, in get
    return self[key]
  File "/var/lib/juju/agents/unit-dex-auth-5/charm/venv/ops/model.py", line 400, in __getitem__
    return self._data[key]
  File "/var/lib/juju/agents/unit-dex-auth-5/charm/venv/ops/model.py", line 384, in _data
    data = self._lazy_data = self._load()
  File "/var/lib/juju/agents/unit-dex-auth-5/charm/venv/ops/model.py", line 748, in _load
    return self._backend.relation_get(self.relation.id, self._entity.name, self._is_app)
  File "/var/lib/juju/agents/unit-dex-auth-5/charm/venv/ops/model.py", line 1351, in relation_get
    return self._run(*args, return_output=True, use_json=True)
  File "/var/lib/juju/agents/unit-dex-auth-5/charm/venv/ops/model.py", line 1286, in _run
    raise ModelError(e.stderr)
ops.model.ModelError: b'ERROR "" is not a valid unit or application\n'
application-dex-auth: 17:18:09 ERROR juju.worker.caasoperator.uniter.dex-auth/5.operation hook "oidc-client-relation-broken" (via hook dispatching script: dispatch) failed: exit status 1
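
When the log is long, the failing hook can be picked out mechanically. This is a minimal sketch that assumes the line shape shown in the two logs above (a unit name such as dex-auth/5, then hook "<name>" ... failed: <reason>). The function name `failed_hooks` and the regular expression are illustrative and not part of Juju.

```python
import re
from typing import List, Tuple

# Matches lines like:
#   ... uniter.dex-auth/5.operation hook "oidc-client-relation-broken" (...) failed: exit status 1
FAILED_HOOK = re.compile(
    r'(?P<unit>[a-z0-9-]+/\d+).*hook "(?P<hook>[^"]+)".*failed: (?P<reason>.+)$'
)


def failed_hooks(log_text: str) -> List[Tuple[str, str, str]]:
    """Return (unit, hook, reason) for every failed-hook line in the log text."""
    results = []
    for line in log_text.splitlines():
        match = FAILED_HOOK.search(line.strip())
        if match:
            results.append((match["unit"], match["hook"], match["reason"]))
    return results


if __name__ == "__main__":
    sample = (
        'application-dex-auth: 17:18:08 INFO juju.worker.caasoperator.uniter.dex-auth/5 '
        'awaiting error resolution for "relation-broken" hook\n'
        'application-dex-auth: 17:18:09 ERROR juju.worker.caasoperator.uniter.dex-auth/5.operation '
        'hook "oidc-client-relation-broken" (via hook dispatching script: dispatch) '
        'failed: exit status 1\n'
    )
    for unit, hook, reason in failed_hooks(sample):
        print(unit, hook, reason)
```

The "awaiting error resolution" lines are ignored on purpose, since they do not name the full hook or a failure reason.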

Hope that helps.

Edit: This too is a relation-broken error. Below is a screenshot of juju status --color:

[Screenshot: Screen Shot 2022-05-26 at 1 28 39 PM]

natalian98 commented 2 years ago

Fixed by https://github.com/canonical/dex-auth-operator/pull/62