vexxhost / atmosphere

Simple & easy private cloud platform featuring VMs, Kubernetes & bare-metal

Magnum 401 Unauthorized error trying to create cluster #400

Closed iPenguin closed 1 year ago

iPenguin commented 1 year ago

Hi,

I'm trying to create a cluster with 1 master node and 1 worker node. The UI immediately returns an error: "Error: Unable to create cluster."

When we look at the Magnum logs, we see the following error message:

❯ kubectl -n openstack logs -l application=magnum,component=cluster-api-proxy
2023-05-02 17:59:11.596 1 ERROR oslo_service.periodic_task   File "/var/lib/openstack/lib/python3.10/site-packages/pykube/query.py", line 196, in __iter__
2023-05-02 17:59:11.596 1 ERROR oslo_service.periodic_task     return iter(self.query_cache["objects"])
2023-05-02 17:59:11.596 1 ERROR oslo_service.periodic_task   File "/var/lib/openstack/lib/python3.10/site-packages/pykube/query.py", line 186, in query_cache
2023-05-02 17:59:11.596 1 ERROR oslo_service.periodic_task     cache["response"] = self.execute().json()
2023-05-02 17:59:11.596 1 ERROR oslo_service.periodic_task   File "/var/lib/openstack/lib/python3.10/site-packages/pykube/query.py", line 161, in execute
2023-05-02 17:59:11.596 1 ERROR oslo_service.periodic_task     r.raise_for_status()
2023-05-02 17:59:11.596 1 ERROR oslo_service.periodic_task   File "/var/lib/openstack/lib/python3.10/site-packages/requests/models.py", line 1021, in raise_for_status
2023-05-02 17:59:11.596 1 ERROR oslo_service.periodic_task     raise HTTPError(http_error_msg, response=self)
2023-05-02 17:59:11.596 1 ERROR oslo_service.periodic_task requests.exceptions.HTTPError: 401 Client Error: Unauthorized for url: https://172.28.0.1:443/apis/infrastructure.cluster.x-k8s.io/v1alpha6/namespaces/magnum-system/openstackclusters
2023-05-02 17:59:11.596 1 ERROR oslo_service.periodic_task

Do you have any ideas why we are getting this error?

Thanks, Brian

okozachenko1203 commented 1 year ago

@iPenguin Can you share the output of the following commands?

kubectl get ClusterRoleBinding magnum-cluster-api -o yaml
kubectl get -n openstack ds magnum-cluster-api-proxy -o jsonpath='{.spec.template.spec.serviceAccountName}'

iPenguin commented 1 year ago

Hi Oleksandr,

Thanks for getting back to me. Here is the output you requested:

❯ kubectl get ClusterRoleBinding magnum-cluster-api -o yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  creationTimestamp: "2023-05-01T18:57:00Z"
  name: magnum-cluster-api
  resourceVersion: "39919409"
  uid: 73deb8f7-16fe-415f-aa7e-e22ce1e5aea6
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: cluster-admin
subjects:
- kind: ServiceAccount
  name: magnum-conductor
  namespace: openstack
❯ kubectl get -n openstack ds magnum-cluster-api-proxy -o jsonpath='{.spec.template.spec.serviceAccountName}'
magnum-conductor

okozachenko1203 commented 1 year ago

OK, the role binding is correct for cluster-api-proxy. Could you share the full traceback for the error you excerpted in the issue description, so I can see where it was raised? Thanks.

iPenguin commented 1 year ago

2023-05-02 21:59:09.000 1 ERROR oslo_service.periodic_task [None req-29f673be-4d46-4b51-909f-4270d77c6b86 - - - - - -] Error during ProxyManager.sync: requests.exceptions.HTTPError: 401 Client Error: Unauthorized for url: https://172.28.0.1:443/apis/infrastructure.cluster.x-k8s.io/v1alpha6/namespaces/magnum-system/openstackclusters
2023-05-02 21:59:09.000 1 ERROR oslo_service.periodic_task Traceback (most recent call last):
2023-05-02 21:59:09.000 1 ERROR oslo_service.periodic_task   File "/var/lib/openstack/lib/python3.10/site-packages/oslo_service/periodic_task.py", line 216, in run_periodic_tasks
2023-05-02 21:59:09.000 1 ERROR oslo_service.periodic_task     task(self, context)
2023-05-02 21:59:09.000 1 ERROR oslo_service.periodic_task   File "/var/lib/openstack/lib/python3.10/site-packages/magnum_cluster_api/proxy/manager.py", line 287, in sync
2023-05-02 21:59:09.000 1 ERROR oslo_service.periodic_task     for cluster in clusters:
2023-05-02 21:59:09.000 1 ERROR oslo_service.periodic_task   File "/var/lib/openstack/lib/python3.10/site-packages/pykube/query.py", line 196, in __iter__
2023-05-02 21:59:09.000 1 ERROR oslo_service.periodic_task     return iter(self.query_cache["objects"])
2023-05-02 21:59:09.000 1 ERROR oslo_service.periodic_task   File "/var/lib/openstack/lib/python3.10/site-packages/pykube/query.py", line 186, in query_cache
2023-05-02 21:59:09.000 1 ERROR oslo_service.periodic_task     cache["response"] = self.execute().json()
2023-05-02 21:59:09.000 1 ERROR oslo_service.periodic_task   File "/var/lib/openstack/lib/python3.10/site-packages/pykube/query.py", line 161, in execute
2023-05-02 21:59:09.000 1 ERROR oslo_service.periodic_task     r.raise_for_status()
2023-05-02 21:59:09.000 1 ERROR oslo_service.periodic_task   File "/var/lib/openstack/lib/python3.10/site-packages/requests/models.py", line 1021, in raise_for_status
2023-05-02 21:59:09.000 1 ERROR oslo_service.periodic_task     raise HTTPError(http_error_msg, response=self)
2023-05-02 21:59:09.000 1 ERROR oslo_service.periodic_task requests.exceptions.HTTPError: 401 Client Error: Unauthorized for url: https://172.28.0.1:443/apis/infrastructure.cluster.x-k8s.io/v1alpha6/namespaces/magnum-system/openstackclusters
2023-05-02 21:59:09.000 1 ERROR oslo_service.periodic_task
2023-05-02 21:59:09.348 1 ERROR oslo_service.periodic_task [None req-38e3d92f-ac27-4453-b71b-708d0194326c - - - - - -] Error during ProxyManager.sync: requests.exceptions.HTTPError: 401 Client Error: Unauthorized for url: https://172.28.0.1:443/apis/infrastructure.cluster.x-k8s.io/v1alpha6/namespaces/magnum-system/openstackclusters
2023-05-02 21:59:09.348 1 ERROR oslo_service.periodic_task Traceback (most recent call last):
2023-05-02 21:59:09.348 1 ERROR oslo_service.periodic_task   File "/var/lib/openstack/lib/python3.10/site-packages/oslo_service/periodic_task.py", line 216, in run_periodic_tasks
2023-05-02 21:59:09.348 1 ERROR oslo_service.periodic_task     task(self, context)
2023-05-02 21:59:09.348 1 ERROR oslo_service.periodic_task   File "/var/lib/openstack/lib/python3.10/site-packages/magnum_cluster_api/proxy/manager.py", line 287, in sync
2023-05-02 21:59:09.348 1 ERROR oslo_service.periodic_task     for cluster in clusters:
2023-05-02 21:59:09.348 1 ERROR oslo_service.periodic_task   File "/var/lib/openstack/lib/python3.10/site-packages/pykube/query.py", line 196, in __iter__
2023-05-02 21:59:09.348 1 ERROR oslo_service.periodic_task     return iter(self.query_cache["objects"])
2023-05-02 21:59:09.348 1 ERROR oslo_service.periodic_task   File "/var/lib/openstack/lib/python3.10/site-packages/pykube/query.py", line 186, in query_cache
2023-05-02 21:59:09.348 1 ERROR oslo_service.periodic_task     cache["response"] = self.execute().json()
2023-05-02 21:59:09.348 1 ERROR oslo_service.periodic_task   File "/var/lib/openstack/lib/python3.10/site-packages/pykube/query.py", line 161, in execute
2023-05-02 21:59:09.348 1 ERROR oslo_service.periodic_task     r.raise_for_status()
2023-05-02 21:59:09.348 1 ERROR oslo_service.periodic_task   File "/var/lib/openstack/lib/python3.10/site-packages/requests/models.py", line 1021, in raise_for_status
2023-05-02 21:59:09.348 1 ERROR oslo_service.periodic_task     raise HTTPError(http_error_msg, response=self)
2023-05-02 21:59:09.348 1 ERROR oslo_service.periodic_task requests.exceptions.HTTPError: 401 Client Error: Unauthorized for url: https://172.28.0.1:443/apis/infrastructure.cluster.x-k8s.io/v1alpha6/namespaces/magnum-system/openstackclusters
2023-05-02 21:59:09.348 1 ERROR oslo_service.periodic_task

mnaser commented 1 year ago

Hrm, can you check whether there is a RoleBinding and/or Role called magnum-cluster-api in the openstack namespace? If so, could you remove it and try again?

runlevel-six commented 1 year ago

Not with the name magnum-cluster-api, but here is a list of every RoleBinding and Role in the openstack namespace that has magnum in its name:

❯ kubectl -n openstack get rolebinding,role | grep magnum
rolebinding.rbac.authorization.k8s.io/magnum-magnum-api                      Role/magnum-openstack-magnum-api                      3d19h
rolebinding.rbac.authorization.k8s.io/magnum-magnum-conductor                Role/magnum-openstack-magnum-conductor                3d19h
rolebinding.rbac.authorization.k8s.io/magnum-magnum-db-init                  Role/magnum-openstack-magnum-db-init                  3d19h
rolebinding.rbac.authorization.k8s.io/magnum-magnum-db-sync                  Role/magnum-openstack-magnum-db-sync                  3d19h
rolebinding.rbac.authorization.k8s.io/magnum-magnum-ks-endpoints             Role/magnum-openstack-magnum-ks-endpoints             3d19h
rolebinding.rbac.authorization.k8s.io/magnum-magnum-ks-service               Role/magnum-openstack-magnum-ks-service               3d19h
rolebinding.rbac.authorization.k8s.io/magnum-magnum-ks-user                  Role/magnum-openstack-magnum-ks-user                  3d19h
rolebinding.rbac.authorization.k8s.io/magnum-magnum-ks-user-domain           Role/magnum-openstack-magnum-ks-user-domain           3d19h
rolebinding.rbac.authorization.k8s.io/magnum-magnum-rabbit-init              Role/magnum-openstack-magnum-rabbit-init              3d19h
rolebinding.rbac.authorization.k8s.io/rabbitmq-magnum-server                 Role/rabbitmq-magnum-peer-discovery                   35d
role.rbac.authorization.k8s.io/magnum-openstack-magnum-api                      2023-05-01T19:29:40Z
role.rbac.authorization.k8s.io/magnum-openstack-magnum-conductor                2023-05-01T19:29:40Z
role.rbac.authorization.k8s.io/magnum-openstack-magnum-db-init                  2023-05-01T19:29:40Z
role.rbac.authorization.k8s.io/magnum-openstack-magnum-db-sync                  2023-05-01T19:29:40Z
role.rbac.authorization.k8s.io/magnum-openstack-magnum-ks-endpoints             2023-05-01T19:29:40Z
role.rbac.authorization.k8s.io/magnum-openstack-magnum-ks-service               2023-05-01T19:29:40Z
role.rbac.authorization.k8s.io/magnum-openstack-magnum-ks-user                  2023-05-01T19:29:40Z
role.rbac.authorization.k8s.io/magnum-openstack-magnum-ks-user-domain           2023-05-01T19:29:40Z
role.rbac.authorization.k8s.io/magnum-openstack-magnum-rabbit-init              2023-05-01T19:29:40Z
role.rbac.authorization.k8s.io/rabbitmq-magnum-peer-discovery                   2023-03-30T15:17:30Z

mnaser commented 1 year ago

@runlevel-six @iPenguin Wild idea: can you try rolling out the ds? I wonder if it's using a cached service account and it's not loading a new client.
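
For readers following along, "rolling out the ds" could look roughly like the sketch below. It assumes the DaemonSet is the magnum-cluster-api-proxy one in the openstack namespace queried earlier in this thread. The KUBECTL variable and the restart_proxy function are hypothetical helpers added for illustration: by default the script only prints the commands, and setting KUBECTL=kubectl runs them for real.

```shell
#!/bin/sh
# Sketch only, not taken from the thread. KUBECTL defaults to a dry run
# that prints each command; export KUBECTL=kubectl to execute them.
KUBECTL="${KUBECTL:-echo kubectl}"

# Recreate the proxy pods, then wait until the rollout has finished.
restart_proxy() {
  $KUBECTL -n openstack rollout restart daemonset/magnum-cluster-api-proxy
  $KUBECTL -n openstack rollout status daemonset/magnum-cluster-api-proxy
}

restart_proxy
```

Once the rollout completes, the proxy logs can be checked again with the same kubectl logs command used at the top of the issue.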

iPenguin commented 1 year ago

Hi @mnaser,

We have tried uninstalling the Magnum Helm chart and redeploying. I also went ahead and restarted the ds, and I'm still getting the same error when I try to create a cluster.

Thanks, Brian

okozachenko1203 commented 1 year ago

Hi @iPenguin, as I requested in Zendesk, could you give us access to your environment so I can help you troubleshoot further?

mnaser commented 1 year ago

Moved this to Atmosphere. The old Role and RoleBinding that live inside the magnum-system namespace are not removed.

iPenguin commented 1 year ago

Hi @mnaser,

Thanks again for all your help today!

-Brian

mnaser commented 1 year ago

I believe the way to resolve this inside Atmosphere is to add a task that removes the magnum-cluster-api Role and RoleBinding from the magnum-system namespace. They used to be created there, and that was changed in this commit:

https://github.com/vexxhost/atmosphere/commit/158823643ded6811eba712e293c104d798b7e599
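
Until such a task exists, the same cleanup could be done by hand. The sketch below is a manual equivalent under stated assumptions, not the Atmosphere task itself: it assumes the leftover objects are a Role and a RoleBinding both named magnum-cluster-api in the magnum-system namespace, as described above. KUBECTL, NAMESPACE, NAME, and remove_stale_rbac are hypothetical names introduced for illustration, and the script prints the commands unless KUBECTL=kubectl is set.

```shell
#!/bin/sh
# Sketch only, not the Atmosphere task. KUBECTL defaults to a dry run
# that prints each command; export KUBECTL=kubectl to execute them.
KUBECTL="${KUBECTL:-echo kubectl}"
NAMESPACE="magnum-system"
NAME="magnum-cluster-api"

# Delete the leftover binding and role; --ignore-not-found makes the
# function safe to rerun after the objects are already gone.
remove_stale_rbac() {
  $KUBECTL -n "$NAMESPACE" delete rolebinding "$NAME" --ignore-not-found
  $KUBECTL -n "$NAMESPACE" delete role "$NAME" --ignore-not-found
}

remove_stale_rbac
```

Running it once in the default dry-run mode shows exactly what would be deleted before anything is touched on the cluster.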