StefanSa opened this issue 4 months ago
You should also apply the CRD manifest: https://github.com/rancher/system-upgrade-controller/releases/download/v0.13.4/crd.yaml
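For reference, the CRD can be checked for and applied with standard kubectl (the manifest URL is the one linked above):

# verify the Plan CRD is registered
kubectl get crd plans.upgrade.cattle.io
# apply it if missing
kubectl apply -f https://github.com/rancher/system-upgrade-controller/releases/download/v0.13.4/crd.yaml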
@brandond Hi Brad, thanks for the hint, that at least fixed this error. But now I don't get any active jobs displayed, and these error messages still show up in the pod.
kubectl get jobs -n system-upgrade
No resources found in system-upgrade namespace.
E0308 08:11:43.584498 1 reflector.go:138] k8s.io/client-go@v1.21.14-k3s1/tools/cache/reflector.go:167: Failed to watch *v1.Secret: failed to list *v1.Secret: secrets is forbidden: User "system:serviceaccount:system-upgrade:system-upgrade" cannot list resource "secrets" in API group "" in the namespace "system-upgrade"
2024-03-08T08:12:36.630633052Z E0308 08:12:36.630500 1 reflector.go:138] k8s.io/client-go@v1.21.14-k3s1/tools/cache/reflector.go:167: Failed to watch *v1.Secret: failed to list *v1.Secret: secrets is forbidden: User "system:serviceaccount:system-upgrade:system-upgrade" cannot list resource "secrets" in API group "" in the namespace "system-upgrade"
2024-03-08T08:13:27.715830430Z E0308 08:13:27.715712 1 reflector.go:138] k8s.io/client-go@v1.21.14-k3s1/tools/cache/reflector.go:167: Failed to watch *v1.Secret: failed to list *v1.Secret: secrets is forbidden: User "system:serviceaccount:system-upgrade:system-upgrade" cannot list resource "secrets" in API group "" in the namespace "system-upgrade"
(the same "Failed to watch *v1.Secret ... secrets is forbidden" error repeats every 30-60 seconds)
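The RBAC gap can also be confirmed from outside the pod by impersonating the service account (assumes cluster-admin access with kubectl); this prints "no" while the role is incomplete:

kubectl auth can-i list secrets -n system-upgrade \
  --as=system:serviceaccount:system-upgrade:system-upgrade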
Do you see the system-upgrade-controller role in the namespace?
@SISheogorath There are two cluster roles here. One is system-upgrade-controller and the other is system-upgrade-controller-drainer.
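They can be inspected with:

kubectl get clusterrole system-upgrade-controller system-upgrade-controller-drainer
kubectl get role -n system-upgrade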
@SISheogorath @brandond Error found. The role was incomplete: the permission to read secrets and all rights for jobs were missing. With this role it works without a problem:
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: system-upgrade-controller
rules:
  - apiGroups:
      - batch
    resources:
      - jobs
    verbs:
      - get
      - list
      - watch
      - create
      - delete
      - patch
      - update
  - apiGroups:
      - ''
    resources:
      - namespaces
      - nodes
    verbs:
      - get
      - list
      - watch
  - apiGroups:
      - ''
    resources:
      - nodes
    verbs:
      - update
  - apiGroups:
      - upgrade.cattle.io
    resources:
      - plans
      - plans/status
    verbs:
      - get
      - list
      - watch
      - create
      - patch
      - update
      - delete
  - apiGroups:
      - ''
    resources:
      - secrets
    verbs:
      - list
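For completeness: the ClusterRole only takes effect through its binding to the controller's service account. Judging from the object names in the apply output further down (clusterrolebinding system-upgrade, service account system-upgrade), the binding presumably looks roughly like this; a sketch, not the exact release manifest:

apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: system-upgrade
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: system-upgrade-controller
subjects:
  # the service account from the "forbidden" errors above
  - kind: ServiceAccount
    name: system-upgrade
    namespace: system-upgrade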
There should be two ClusterRoles and one Role. When I adjusted the roles for the controller, I decided to limit secrets and job creation to the namespace of the controller.
Maybe this was too restrictive. I just double-checked my setup; the controller is functional here with these roles.
The reason I asked whether it exists is that this might be related to object ordering: https://github.com/rancher/system-upgrade-controller/pull/296
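As an illustrative sketch of the namespacing idea mentioned above (the exact verbs in the release manifest may differ), a namespaced Role scoping secrets and jobs to the controller's namespace would look roughly like this:

apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: system-upgrade-controller
  namespace: system-upgrade
rules:
  # read access to secrets, limited to this namespace
  - apiGroups: ['']
    resources: ['secrets']
    verbs: ['get', 'list', 'watch']
  # full control over upgrade jobs, limited to this namespace
  - apiGroups: ['batch']
    resources: ['jobs']
    verbs: ['get', 'list', 'watch', 'create', 'delete', 'patch', 'update']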
Here there are only ClusterRoles and no Role.
watch is also missing on secrets:
Failed to watch *v1.Secret: unknown (get secrets.meta.k8s.io)
If you apply the release manifest a second time (now that the namespace exists), does it fix the issue?
All objects unchanged.
kubectl apply -f https://raw.githubusercontent.com/rancher/system-upgrade-controller/v0.13.4/manifests/system-upgrade-controller.yaml
namespace/system-upgrade unchanged
serviceaccount/system-upgrade unchanged
configmap/default-controller-env unchanged
deployment.apps/system-upgrade-controller unchanged
If you use the manifests directory from the tag in the git repository, you have to apply all manifests.
The release manifest I referred to is attached to the Release on GitHub: https://github.com/rancher/system-upgrade-controller/releases/download/v0.13.4/system-upgrade-controller.yaml
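For the manifests-directory route (the first option above), a sketch, assuming git is available:

git clone --depth 1 --branch v0.13.4 https://github.com/rancher/system-upgrade-controller.git
# kubectl applies every manifest file in the directory
kubectl apply -f system-upgrade-controller/manifests/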
That looks good.
kubectl apply -f https://github.com/rancher/system-upgrade-controller/releases/download/v0.13.4/system-upgrade-controller.yaml
clusterrole.rbac.authorization.k8s.io/system-upgrade-controller configured
role.rbac.authorization.k8s.io/system-upgrade-controller created
clusterrole.rbac.authorization.k8s.io/system-upgrade-controller-drainer unchanged
clusterrolebinding.rbac.authorization.k8s.io/system-upgrade-drainer unchanged
clusterrolebinding.rbac.authorization.k8s.io/system-upgrade unchanged
rolebinding.rbac.authorization.k8s.io/system-upgrade created
namespace/system-upgrade unchanged
serviceaccount/system-upgrade unchanged
configmap/default-controller-env unchanged
deployment.apps/system-upgrade-controller configured
And no more error messages; a Role has also been created.
1 client_config.go:617] Neither --kubeconfig nor --master was specified. Using the inClusterConfig. This might not work.
time="2024-03-08T13:00:42Z" level=info msg="No access to list CRDs, assuming CRDs are pre-created."
2024-03-08T13:00:43.076970550Z time="2024-03-08T13:00:43Z" level=info msg="Starting /v1, Kind=Node controller"
2024-03-08T13:00:43.077002976Z time="2024-03-08T13:00:43Z" level=info msg="Starting /v1, Kind=Secret controller"
2024-03-08T13:00:43.099580197Z time="2024-03-08T13:00:43Z" level=info msg="Starting batch/v1, Kind=Job controller"
2024-03-08T13:00:43.146037841Z time="2024-03-08T13:00:43Z" level=info msg="Starting upgrade.cattle.io/v1, Kind=Plan controller"
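The earlier impersonation check should now return "yes" as well:

kubectl auth can-i watch secrets -n system-upgrade \
  --as=system:serviceaccount:system-upgrade:system-upgrade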
Well, now we know that #296 actually fixed a problem 🙌🏻
The permissions are still not correct: the upgrade pod spews errors that it is not allowed to delete pods. I tried completely removing all SUC resources (including CRDs) and reinstalling from scratch, and the issue persists. Going back to version 0.13.2 makes it work. There is definitely something wrong in the Roles or ClusterRoles. I didn't dig deeper yet.
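Assuming the upgrade job runs as the system-upgrade service account, the drain-related permissions can be probed the same way (draining normally evicts or deletes pods):

kubectl auth can-i delete pods --as=system:serviceaccount:system-upgrade:system-upgrade
kubectl auth can-i create pods --subresource=eviction --as=system:serviceaccount:system-upgrade:system-upgrade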
Version: v0.13.4
Platform/Architecture: openSUSE MicroOS 20240221
Describe the bug: When I create a new plan, I get the error message shown under "Actual behavior" below.
To Reproduce
kubectl label node master-01 master-02 worker-01 worker-02 worker-03 k3s-upgrade=true
kubectl apply -f k3s-upgrade.yaml
k3s-upgrade.yaml:
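For illustration, a typical k3s upgrade Plan looks roughly like this (a sketch with assumed values matching the k3s-upgrade=true label above, not the reporter's actual file):

apiVersion: upgrade.cattle.io/v1
kind: Plan
metadata:
  name: k3s-upgrade
  namespace: system-upgrade
spec:
  concurrency: 1
  serviceAccountName: system-upgrade
  # channel resolves to the latest stable k3s release
  channel: https://update.k3s.io/v1-release/channels/stable
  nodeSelector:
    matchExpressions:
      - key: k3s-upgrade
        operator: In
        values: ['true']
  cordon: true
  upgrade:
    image: rancher/k3s-upgrade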
Expected behavior: Upgrade plan applies without an error message.
Actual behavior: Error message:
no matches for kind "Plan" in version "upgrade.cattle.io/v1"
Additional context: log in pod: