Closed fryz closed 11 months ago
@reasonerjt - I'm wondering if you have any thoughts/ideas on this issue.
I've done some more digging, and I haven't been able to make Velero work without granting list permissions, at cluster scope, on the resources I wish to back up.
It appears that Velero requires the service account to have permission to list ALL resources at cluster scope, even if it's configured to back up only the resources in a single namespace.
Is there some way around this?
I first tried to see if I could control the backup and prevent it from accessing resources that might be at cluster scope, or resources which it might not have permissions to list/get. In both of the cases below, I continued to get the permission error listed in the original description (e.g., error getting ClusterRoleBindings at cluster scope):
- --include-cluster-resources=false to stop the backup from accessing cluster-scoped resources
- --include-namespaces to only include the single namespace I want to back up (this is the same namespace where Velero is installed)

Then, on a dev environment (where I have permissions to create ClusterRoleBindings), I tried to see if I could proceed with the backup by creating a ClusterRoleBinding that only allowed listing CRBs. I was able to proceed, but then got new failures that looked similar:
time="2022-08-08T20:07:28Z" level=error msg="Error listing resources" backup=app/backup-test1-2022-08-08 error="pods is forbidden: User \"system:serviceaccount:app:velero-server\" cannot list resource \"pods\" in API group \"\" at the cluster scope" error.file="/go/src/github.com/vmware-tanzu/velero/pkg/backup/item_collector.go:476" error.function="github.com/vmware-tanzu/velero/pkg/backup.(*itemCollector).
Any thoughts?
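For anyone triaging similar logs, here is a minimal Python sketch that pulls the verb, resource, API group, and scope out of a "forbidden" message like the one above. The regular expression is an assumption based only on the message shapes quoted in this thread, and `parse_forbidden` is a hypothetical helper name, not part of Velero.

```python
import re

# Assumed shape of the RBAC denial text, taken from the log lines in this thread.
FORBIDDEN = re.compile(
    r'cannot (?P<verb>\w+) resource "(?P<resource>[^"]+)" '
    r'in API group "(?P<group>[^"]*)" '
    r'(?:at the cluster scope|in the namespace "(?P<namespace>[^"]+)")'
)


def parse_forbidden(message: str):
    """Return verb, resource, API group and scope for one denial, or None."""
    # Structured log lines escape quotes as \" so normalise them first.
    match = FORBIDDEN.search(message.replace('\\"', '"'))
    if match is None:
        return None
    info = match.groupdict()
    # A missing namespace group means the request was made at cluster scope.
    info["scope"] = "namespace" if info["namespace"] else "cluster"
    return info


line = (
    'error="pods is forbidden: User \\"system:serviceaccount:app:velero-server\\" '
    'cannot list resource \\"pods\\" in API group \\"\\" at the cluster scope"'
)
print(parse_forbidden(line))
```

Running this over a whole log makes it easier to see which denials are cluster-scoped (as here) and which are namespace-scoped.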
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.
Closing the stale issue.
@reasonerjt Has there been any development on this? I'm noticing a similar issue trying to use velero v1.10.0 in Azure
duplicate of https://github.com/vmware-tanzu/velero/issues/18
Implementation can begin once #18 is closed, and the fix will be released after that.
Hi - Can this JIRA be re-opened?
I think this is related to the k8s ServiceAccount resource. There is a Velero plugin that applies to ServiceAccounts. The plugin goes through all ClusterRoleBindings and ClusterRoles, then returns the ones related to the ServiceAccount as additional backup items.
If possible, please also exclude the ServiceAccount resource from the backup.
That is a somewhat hacky solution, and there may be other resources that also need cluster-scoped access permissions.
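As a rough illustration of that suggestion, the sketch below assembles a `velero backup create` invocation that restricts the backup to one namespace, turns off cluster-scoped resources, and excludes ServiceAccounts. The backup name `backup-test1` and namespace `app` are taken from the logs earlier in this thread; `backup_command` is a hypothetical helper, and nothing here confirms that this combination avoids the permission errors.

```python
import shlex


def backup_command(name, namespace, excluded=("serviceaccounts",)):
    """Build the argument list for a namespace-only backup (hypothetical helper)."""
    argv = [
        "velero", "backup", "create", name,
        "--include-namespaces", namespace,
        "--include-cluster-resources=false",
    ]
    if excluded:
        # Resources are passed as one comma-separated value.
        argv += ["--exclude-resources", ",".join(excluded)]
    return argv


# Print the command instead of running it, so it can be reviewed first.
print(shlex.join(backup_command("backup-test1", "app")))
```

Printing rather than executing keeps the sketch safe to run on a machine without cluster access.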
@fryz Could you give some details about your environment? Velero is designed to work with cluster administrator permissions. Is it possible to let Velero have administrator permissions, and only back up resources in a specific namespace?
@fryz One thing that confused me about your scenario is that a Role and RoleBinding are used to set permissions for the Velero server, and the Role also includes quite a few cluster-scoped resources in its permissions. Please be aware that a Role can only be used to grant permissions on namespace-scoped resources.
I also saw that ClusterRole and ClusterRoleBinding are included in the Role permissions, so you are just trying to back up namespace-scoped resources, right?
Hi @blackpiglet, I work closely with @fryz and can try to provide some context on our environment. We are deploying an application in an enterprise Kubernetes environment in which we only have access to a single namespace as the application vendor. The enterprise controls do not allow us to use any ClusterRoles or ClusterRoleBindings, especially ClusterAdministrator. We are hoping to be able to use Velero in a namespace-scoped fashion so we can back up and restore our application.
Thanks for the tip about excluding the service accounts. I will try to get a test setup in the next few days to try your suggestion. @fryz is currently out but should be back next week as well with more information.
Thanks for the help!
I did some tests to install Velero without any cluster-scoped resource permissions, but the Velero server failed to start. The failure is:
time="2023-10-08T09:10:44Z" level=info msg="Checking existence of namespace." logSource="pkg/cmd/server/server.go:445" namespace=velero
An error occurred: namespaces "velero" is forbidden: User "system:serviceaccount:velero:velero-server" cannot get resource "namespaces" in API group "" in the namespace "velero"
I think this is due to the Velero server needing to confirm the namespace it runs in exists before spinning up the controllers.
It seems that error didn't happen in your cluster. Could you confirm the Velero permission settings in your environment? It looks like some cluster-scoped resource access is unavoidable.
Hey @blackpiglet - thanks for the work and attempts to help us get this working.
Quick note - when we originally ran into this issue, we were using Velero 1.10 and the 2.32 version of the helm chart. It looks like things have changed a bit (esp. on the helm chart) since that release, so the specific errors seen above might only impact the version that we're deploying.
My plan was to take the first step and confirm if this works on the latest version using your workaround re: ServiceAccounts above (sounds like it doesn't work based on your latest comment?). Then, if it doesn't work, I was going to try on the versions that we are using to see if the workaround works there.
Hey @blackpiglet
Just quickly wanted to let you know that when I tried installing/configuring using the setup above only using Role/RoleBindings (on the 1.10 version of velero with the 2.32 version of the helm-chart), I ran into issues with Velero during server initialization because it's looking for the Velero Custom Resources at cluster scope:
time="2023-10-09T23:56:44Z" level=error msg="failed to list backups" error="backups.velero.io is forbidden: User \"system:serviceaccount:arthur:velero-server\" cannot list resource \"backups\" in API group \"velero.io\" at the cluster scope" error.file="/go/src/github.com/vmware-tanzu/velero/pkg/cmd/server/server.go:954" error.function=github.com/vmware-tanzu/velero/pkg/cmd/server.markInProgressBackupsFailed logSource="pkg/cmd/server/server.go:954"
So I think from our end, I wasn't able to get to the point where I could take a backup without granting the velero-server ServiceAccount cluster-scoped permissions on the Velero Cluster Resources.
@fryz Thanks for reporting this issue. The code does look a little different there.
I think this piece of code is used to mark InProgress backups as failed when the Velero server starts. Could you try deleting the InProgress backups before starting the Velero server? This is a temporary workaround. It should let startup get further, so we can see whether there are other obstacles. I will try to resolve this issue in the main branch.
@fryz I tested with the PR. The error about failing to read the Velero CRs at cluster scope is gone, but when creating a backup, there are still many errors because the server has no permission to read k8s resources.
Errors:
Velero: error: /backendconfigs.cloud.google.com is forbidden: User "system:serviceaccount:velero:velero-server" cannot list resource "backendconfigs" in API group "cloud.google.com" in the namespace "default"
error: /capacityrequests.internal.autoscaling.gke.io is forbidden: User "system:serviceaccount:velero:velero-server" cannot list resource "capacityrequests" in API group "internal.autoscaling.gke.io" in the namespace "default"
error: /managedcertificates.networking.gke.io is forbidden: User "system:serviceaccount:velero:velero-server" cannot list resource "managedcertificates" in API group "networking.gke.io" in the namespace "default"
error: /serviceattachments.networking.gke.io is forbidden: User "system:serviceaccount:velero:velero-server" cannot list resource "serviceattachments" in API group "networking.gke.io" in the namespace "default"
error: /servicenetworkendpointgroups.networking.gke.io is forbidden: User "system:serviceaccount:velero:velero-server" cannot list resource "servicenetworkendpointgroups" in API group "networking.gke.io" in the namespace "default"
error: /frontendconfigs.networking.gke.io is forbidden: User "system:serviceaccount:velero:velero-server" cannot list resource "frontendconfigs" in API group "networking.gke.io" in the namespace "default"
error: /volumesnapshots.snapshot.storage.k8s.io is forbidden: User "system:serviceaccount:velero:velero-server" cannot list resource "volumesnapshots" in API group "snapshot.storage.k8s.io" in the namespace "default"
error: /deletebackuprequests.velero.io is forbidden: User "system:serviceaccount:velero:velero-server" cannot list resource "deletebackuprequests" in API group "velero.io" in the namespace "default"
error: /backups.velero.io is forbidden: User "system:serviceaccount:velero:velero-server" cannot list resource "backups" in API group "velero.io" in the namespace "default"
error: /downloadrequests.velero.io is forbidden: User "system:serviceaccount:velero:velero-server" cannot list resource "downloadrequests" in API group "velero.io" in the namespace "default"
error: /volumesnapshotlocations.velero.io is forbidden: User "system:serviceaccount:velero:velero-server" cannot list resource "volumesnapshotlocations" in API group "velero.io" in the namespace "default"
error: /serverstatusrequests.velero.io is forbidden: User "system:serviceaccount:velero:velero-server" cannot list resource "serverstatusrequests" in API group "velero.io" in the namespace "default"
error: /backuprepositories.velero.io is forbidden: User "system:serviceaccount:velero:velero-server" cannot list resource "backuprepositories" in API group "velero.io" in the namespace "default"
error: /restores.velero.io is forbidden: User "system:serviceaccount:velero:velero-server" cannot list resource "restores" in API group "velero.io" in the namespace "default"
error: /backupstoragelocations.velero.io is forbidden: User "system:serviceaccount:velero:velero-server" cannot list resource "backupstoragelocations" in API group "velero.io" in the namespace "default"
error: /podvolumebackups.velero.io is forbidden: User "system:serviceaccount:velero:velero-server" cannot list resource "podvolumebackups" in API group "velero.io" in the namespace "default"
error: /schedules.velero.io is forbidden: User "system:serviceaccount:velero:velero-server" cannot list resource "schedules" in API group "velero.io" in the namespace "default"
error: /podvolumerestores.velero.io is forbidden: User "system:serviceaccount:velero:velero-server" cannot list resource "podvolumerestores" in API group "velero.io" in the namespace "default"
error: /datadownloads.velero.io is forbidden: User "system:serviceaccount:velero:velero-server" cannot list resource "datadownloads" in API group "velero.io" in the namespace "default"
error: /datauploads.velero.io is forbidden: User "system:serviceaccount:velero:velero-server" cannot list resource "datauploads" in API group "velero.io" in the namespace "default"
error: /updateinfos.nodemanagement.gke.io is forbidden: User "system:serviceaccount:velero:velero-server" cannot list resource "updateinfos" in API group "nodemanagement.gke.io" in the namespace "default"
Cluster: <none>
Namespaces:
default: resource: /backendconfigs error: /backendconfigs.cloud.google.com is forbidden: User "system:serviceaccount:velero:velero-server" cannot list resource "backendconfigs" in API group "cloud.google.com" in the namespace "default"
resource: /capacityrequests error: /capacityrequests.internal.autoscaling.gke.io is forbidden: User "system:serviceaccount:velero:velero-server" cannot list resource "capacityrequests" in API group "internal.autoscaling.gke.io" in the namespace "default"
resource: /managedcertificates error: /managedcertificates.networking.gke.io is forbidden: User "system:serviceaccount:velero:velero-server" cannot list resource "managedcertificates" in API group "networking.gke.io" in the namespace "default"
resource: /serviceattachments error: /serviceattachments.networking.gke.io is forbidden: User "system:serviceaccount:velero:velero-server" cannot list resource "serviceattachments" in API group "networking.gke.io" in the namespace "default"
resource: /servicenetworkendpointgroups error: /servicenetworkendpointgroups.networking.gke.io is forbidden: User "system:serviceaccount:velero:velero-server" cannot list resource "servicenetworkendpointgroups" in API group "networking.gke.io" in the namespace "default"
resource: /frontendconfigs error: /frontendconfigs.networking.gke.io is forbidden: User "system:serviceaccount:velero:velero-server" cannot list resource "frontendconfigs" in API group "networking.gke.io" in the namespace "default"
resource: /volumesnapshots error: /volumesnapshots.snapshot.storage.k8s.io is forbidden: User "system:serviceaccount:velero:velero-server" cannot list resource "volumesnapshots" in API group "snapshot.storage.k8s.io" in the namespace "default"
resource: /deletebackuprequests error: /deletebackuprequests.velero.io is forbidden: User "system:serviceaccount:velero:velero-server" cannot list resource "deletebackuprequests" in API group "velero.io" in the namespace "default"
resource: /backups error: /backups.velero.io is forbidden: User "system:serviceaccount:velero:velero-server" cannot list resource "backups" in API group "velero.io" in the namespace "default"
resource: /downloadrequests error: /downloadrequests.velero.io is forbidden: User "system:serviceaccount:velero:velero-server" cannot list resource "downloadrequests" in API group "velero.io" in the namespace "default"
resource: /volumesnapshotlocations error: /volumesnapshotlocations.velero.io is forbidden: User "system:serviceaccount:velero:velero-server" cannot list resource "volumesnapshotlocations" in API group "velero.io" in the namespace "default"
resource: /serverstatusrequests error: /serverstatusrequests.velero.io is forbidden: User "system:serviceaccount:velero:velero-server" cannot list resource "serverstatusrequests" in API group "velero.io" in the namespace "default"
resource: /backuprepositories error: /backuprepositories.velero.io is forbidden: User "system:serviceaccount:velero:velero-server" cannot list resource "backuprepositories" in API group "velero.io" in the namespace "default"
resource: /restores error: /restores.velero.io is forbidden: User "system:serviceaccount:velero:velero-server" cannot list resource "restores" in API group "velero.io" in the namespace "default"
resource: /backupstoragelocations error: /backupstoragelocations.velero.io is forbidden: User "system:serviceaccount:velero:velero-server" cannot list resource "backupstoragelocations" in API group "velero.io" in the namespace "default"
resource: /podvolumebackups error: /podvolumebackups.velero.io is forbidden: User "system:serviceaccount:velero:velero-server" cannot list resource "podvolumebackups" in API group "velero.io" in the namespace "default"
resource: /schedules error: /schedules.velero.io is forbidden: User "system:serviceaccount:velero:velero-server" cannot list resource "schedules" in API group "velero.io" in the namespace "default"
resource: /podvolumerestores error: /podvolumerestores.velero.io is forbidden: User "system:serviceaccount:velero:velero-server" cannot list resource "podvolumerestores" in API group "velero.io" in the namespace "default"
resource: /datadownloads error: /datadownloads.velero.io is forbidden: User "system:serviceaccount:velero:velero-server" cannot list resource "datadownloads" in API group "velero.io" in the namespace "default"
resource: /datauploads error: /datauploads.velero.io is forbidden: User "system:serviceaccount:velero:velero-server" cannot list resource "datauploads" in API group "velero.io" in the namespace "default"
resource: /updateinfos error: /updateinfos.nodemanagement.gke.io is forbidden: User "system:serviceaccount:velero:velero-server" cannot list resource "updateinfos" in API group "nodemanagement.gke.io" in the namespace "default"
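To make a report like this easier to act on, here is a minimal Python sketch that groups the denied resources by API group and prints them in the shape of Role rules. It is only a drafting aid: `draft_role_rules` is a hypothetical helper, the regular expression assumes the message format shown above, and whether granting these permissions is acceptable depends on the environment.

```python
import re
from collections import defaultdict

# Assumed shape of a namespace-scoped denial, as printed in the report above.
DENIED = re.compile(
    r'cannot (?P<verb>\w+) resource "(?P<resource>[^"]+)" '
    r'in API group "(?P<group>[^"]*)" in the namespace "(?P<namespace>[^"]+)"'
)


def draft_role_rules(report: str, namespace: str):
    """Group denied (verb, resource) pairs by API group into Role-style rules."""
    by_group = defaultdict(lambda: {"resources": set(), "verbs": set()})
    for m in DENIED.finditer(report):
        if m["namespace"] != namespace:
            continue
        # Sets remove the duplicates between the "Velero" and "Namespaces" sections.
        by_group[m["group"]]["resources"].add(m["resource"])
        by_group[m["group"]]["verbs"].add(m["verb"])
    return [
        {"apiGroups": [g], "resources": sorted(v["resources"]), "verbs": sorted(v["verbs"])}
        for g, v in sorted(by_group.items())
    ]


sample = '''
error: /backups.velero.io is forbidden: User "system:serviceaccount:velero:velero-server" cannot list resource "backups" in API group "velero.io" in the namespace "default"
error: /restores.velero.io is forbidden: User "system:serviceaccount:velero:velero-server" cannot list resource "restores" in API group "velero.io" in the namespace "default"
error: /frontendconfigs.networking.gke.io is forbidden: User "system:serviceaccount:velero:velero-server" cannot list resource "frontendconfigs" in API group "networking.gke.io" in the namespace "default"
'''
for rule in draft_role_rules(sample, "default"):
    print(rule)
```

Note that a Role built from these rules would only apply inside the namespace it is created in, which matches the namespace-scoped constraint discussed in this thread.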
What steps did you take and what happened:
Due to restrictions on the k8s environment I am running in, I cannot use ClusterRoleBindings because they grant cluster scope. To address this, I have installed Velero (using the helm chart, values.yaml file provided below) with the following configuration:
- rbac.clusterAdministrator is disabled, and therefore the ClusterRoleBinding is not generated and not applied
- Velero is installed in a single namespace (app), and we expect to only back up resources within that namespace (e.g., not CRDs, PVs, etc.)

With the aforementioned configuration, I try to take a backup:
And I get the following in the backup-controller logs:
With the specific failure being:
Even when I attempt to exclude CRBs as a resource from the backup, I get the same error:
What did you expect to happen:
Backups should not require cluster scope to back up resources when cluster-scoped access is disabled, or I should be able to exclude resources that require cluster scope from the backup so that it succeeds.
The following information will help us better understand what's going on:
I'm not going to attach the debug output because it contains the logs and state of all pods/resources in the cluster. Let me know if you need specific information and I can provide it.
Anything else you would like to add:
Helm Values File:
Environment:
- Velero version (use velero version): v1.9.0
- Velero features (use velero client config get features): NOT SET
- Kubernetes version (use kubectl version): Client Version: version.Info{Major:"1", Minor:"24", GitVersion:"v1.24.3", GitCommit:"aef86a93758dc3cb2c658dd9657ab4ad4afc21cb", GitTreeState:"clean", BuildDate:"2022-07-13T14:21:56Z", GoVersion:"go1.18.4", Compiler:"gc", Platform:"darwin/arm64"} Kustomize Version: v4.5.4 Server Version: version.Info{Major:"1", Minor:"21+", GitVersion:"v1.21.12-eks-a64ea69", GitCommit:"d4336843ba36120e9ed1491fddff5f2fec33eb77", GitTreeState:"clean", BuildDate:"2022-05-12T18:29:27Z", GoVersion:"go1.16.15", Compiler:"gc", Platform:"linux/amd64"}
- OS (e.g. from /etc/os-release):

Vote on this issue!
This is an invitation to the Velero community to vote on issues, you can see the project's top voted issues listed here.
Use the "reaction smiley face" up to the right of this comment to vote.