jguipi closed this issue 9 months ago
I ended up finding another way to do the backup, but now I'm unable to create a successful backup because the volume is in-use:
Command: velero backup create test5 --include-namespaces=default
time="2024-02-18T05:02:15Z" level=info msg="Waiting for volumesnapshotcontents snapcontent-70c18c78-6e80-443d-a967-ab24bc97c370 to have snapshot handle. Retrying in 5s" backup=velero/test5 cmd=/plugins/velero-plugin-for-csi logSource="/go/src/velero-plugin-for-csi/internal/util/util.go:259" pluginName=velero-plugin-for-csi
time="2024-02-18T05:02:15Z" level=warning msg="Volumesnapshotcontent snapcontent-70c18c78-6e80-443d-a967-ab24bc97c370 has error: Failed to check and update snapshot content: failed to take snapshot of the volume c91638ec-6362-4e15-b3cd-e2f957e3e6b5: \"rpc error: code = Internal desc = CreateSnapshot failed with error Bad request with: [POST https://volume-3.eu-nl-1.cloud.sap:443/v3/ddcde3b2cae24ea0a85ad5b608b7ac97/snapshots], error message: {\\\"badRequest\\\": {\\\"code\\\": 400, \\\"message\\\": \\\"Invalid volume: Volume c91638ec-6362-4e15-b3cd-e2f957e3e6b5 status must be available, but current status is: in-use.\\\"}}\"" backup=velero/test5 cmd=/plugins/velero-plugin-for-csi logSource="/go/src/velero-plugin-for-csi/internal/util/util.go:261" pluginName=velero-plugin-for-csi
time="2024-02-18T05:02:20Z" level=info msg="Waiting for volumesnapshotcontents snapcontent-70c18c78-6e80-443d-a967-ab24bc97c370 to have snapshot handle. Retrying in 5s" backup=velero/test5 cmd=/plugins/velero-plugin-for-csi logSource="/go/src/velero-plugin-for-csi/internal/util/util.go:259" pluginName=velero-plugin-for-csi
time="2024-02-18T05:02:20Z" level=warning msg="Volumesnapshotcontent snapcontent-70c18c78-6e80-443d-a967-ab24bc97c370 has error: Failed to check and update snapshot content: failed to take snapshot of the volume c91638ec-6362-4e15-b3cd-e2f957e3e6b5: \"rpc error: code = Internal desc = CreateSnapshot failed with error Bad request with: [POST https://volume-3.eu-nl-1.cloud.sap:443/v3/ddcde3b2cae24ea0a85ad5b608b7ac97/snapshots], error message: {\\\"badRequest\\\": {\\\"code\\\": 400, \\\"message\\\": \\\"Invalid volume: Volume c91638ec-6362-4e15-b3cd-e2f957e3e6b5 status must be available, but current status is: in-use.\\\"}}\"" backup=velero/test5 cmd=/plugins/velero-plugin-for-csi logSource="/go/src/velero-plugin-for-csi/internal/util/util.go:261" pluginName=velero-plugin-for-csi
Hello,
Thank you for opening the issue.
Regarding the first error, you need to follow the documentation for the Swift container setup so that the plugin can successfully create a temporary URL:
time="2024-02-18T03:28:42Z" level=warning msg="fail to get Backup metadata file's download URL {BackupResults test4}, retry later: rpc error: code = Unknown desc = failed to create temporary URL for \"backups/test4/test4-results.gz\" object in \"velero-backups-can\" container: Unable to obtain the Temp URL key." controller=download-request downloadRequest=velero/test4-4036bc95-faac-472e-ac17-f16b167c98cc logSource="pkg/controller/download_request_controller.go:206"
Those swift commands don't need to be executed from inside a container. You can run them locally, but you need to download your OpenStack/Swift credentials, source them, and then run the commands. Of course, you need to install python-swiftclient using pip to run the swift command.
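For reference, installing the client locally looks roughly like this (package names as published on PyPI; a virtualenv is optional but keeps things tidy):

```shell
# Install the Swift CLI and the Keystone client it needs for v3 auth.
pip install python-swiftclient python-keystoneclient

# Then source your OpenStack RC file (downloaded from Horizon) so the
# swift command can authenticate, e.g.:
# source ./openrc.sh
```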
As for the volume being in-use, the snapshot should still work, because the plugin uses the --force flag, as mentioned in the docs:
"The snapshots are done using flag --force. The reason is that volumes in state in-use cannot be snapshotted without it (they would need to be detached in advance). In some cases this can make snapshot contents inconsistent!"
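For illustration, a manual equivalent of what the plugin does (a sketch using the OpenStack CLI, with the volume ID taken from the log above and a made-up snapshot name) shows why --force matters:

```shell
# Without --force, Cinder rejects snapshots of attached volumes with
# "status must be available, but current status is: in-use".
openstack volume snapshot create \
  --volume c91638ec-6362-4e15-b3cd-e2f957e3e6b5 \
  --force \
  manual-test-snapshot
```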
Can you send me the definition of your BackupStorageLocation? From the log message you pasted, it seems like you are taking the snapshot using the CSI driver, not this plugin:
backup=velero/test5 cmd=/plugins/velero-plugin-for-csi
logSource="/go/src/velero-plugin-for-csi/internal/util/util.go:261"
pluginName=velero-plugin-for-csi
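One way to double-check which path is taking the snapshots (assuming Velero is installed in the velero namespace):

```shell
# The BackupStorageLocation / VolumeSnapshotLocation definitions show
# which provider each location uses:
kubectl -n velero get backupstoragelocation -o yaml
kubectl -n velero get volumesnapshotlocation -o yaml

# CSI-driven snapshots leave VolumeSnapshotContent objects behind, like
# the snapcontent-... name in the log above:
kubectl get volumesnapshotcontents.snapshot.storage.k8s.io
```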
Thank you for the fast response.
Here's the full values.yaml file that I applied:
##
## Configuration settings related to Velero installation namespace
##
# Labels settings in namespace
namespace:
labels: {}
# Enforce Pod Security Standards with Namespace Labels
# https://kubernetes.io/docs/tasks/configure-pod-container/enforce-standards-namespace-labels/
# - key: pod-security.kubernetes.io/enforce
# value: privileged
# - key: pod-security.kubernetes.io/enforce-version
# value: latest
# - key: pod-security.kubernetes.io/audit
# value: privileged
# - key: pod-security.kubernetes.io/audit-version
# value: latest
# - key: pod-security.kubernetes.io/warn
# value: privileged
# - key: pod-security.kubernetes.io/warn-version
# value: latest
##
## End of namespace-related settings.
##
##
## Configuration settings that directly affect the Velero deployment YAML.
##
# Details of the container image to use in the Velero deployment & daemonset (if
# enabling node-agent). Required.
image:
repository: velero/velero
tag: v1.13.0
# Digest value example: sha256:d238835e151cec91c6a811fe3a89a66d3231d9f64d09e5f3c49552672d271f38.
# If used, it will take precedence over the image.tag.
# digest:
pullPolicy: IfNotPresent
# One or more secrets to be used when pulling images
imagePullSecrets: []
# - registrySecretName
nameOverride: ""
fullnameOverride: ""
# Annotations to add to the Velero deployment's. Optional.
#
# If you are using reloader use the following annotation with your VELERO_SECRET_NAME
annotations: {}
# secret.reloader.stakater.com/reload: "<VELERO_SECRET_NAME>"
# Annotations to add to secret
secretAnnotations: {}
# Labels to add to the Velero deployment's. Optional.
labels: {}
# Annotations to add to the Velero deployment's pod template. Optional.
#
# If using kube2iam or kiam, use the following annotation with your AWS_ACCOUNT_ID
# and VELERO_ROLE_NAME filled in:
podAnnotations: {}
# iam.amazonaws.com/role: "arn:aws:iam::<AWS_ACCOUNT_ID>:role/<VELERO_ROLE_NAME>"
# Additional pod labels for Velero deployment's template. Optional
# ref: https://kubernetes.io/docs/concepts/overview/working-with-objects/labels/
podLabels: {}
# Number of old history to retain to allow rollback (If not set, default Kubernetes value is set to 10)
# revisionHistoryLimit: 1
# Resource requests/limits to specify for the Velero deployment.
# https://velero.io/docs/v1.6/customize-installation/#customize-resource-requests-and-limits
resources:
requests:
cpu: 500m
memory: 128Mi
limits:
cpu: 1000m
memory: 512Mi
# Resource requests/limits to specify for the upgradeCRDs job pod. Need to be adjusted by user accordingly.
upgradeJobResources: {}
# requests:
# cpu: 50m
# memory: 128Mi
# limits:
# cpu: 100m
# memory: 256Mi
upgradeCRDsJob:
# Extra volumes for the Upgrade CRDs Job. Optional.
extraVolumes: []
# Extra volumeMounts for the Upgrade CRDs Job. Optional.
extraVolumeMounts: []
# Extra key/value pairs to be used as environment variables. Optional.
extraEnvVars: {}
# Configure the dnsPolicy of the Velero deployment
# See: https://kubernetes.io/docs/concepts/services-networking/dns-pod-service/#pod-s-dns-policy
dnsPolicy: ClusterFirst
# Init containers to add to the Velero deployment's pod spec. At least one plugin provider image is required.
# If the value is a string then it is evaluated as a template.
initContainers:
- name: velero-plugin-for-csi
image: velero/velero-plugin-for-csi:v0.7.0
imagePullPolicy: IfNotPresent
volumeMounts:
- mountPath: /target
name: plugins
- name: velero-plugin-for-openstack
image: lirt/velero-plugin-for-openstack:v0.6.0
imagePullPolicy: IfNotPresent
volumeMounts:
- mountPath: /target
name: plugins
# SecurityContext to use for the Velero deployment. Optional.
# Set fsGroup for `AWS IAM Roles for Service Accounts`
# see more information at: https://docs.aws.amazon.com/eks/latest/userguide/iam-roles-for-service-accounts.html
podSecurityContext: {}
# fsGroup: 1337
# Container Level Security Context for the 'velero' container of the Velero deployment. Optional.
# See: https://kubernetes.io/docs/tasks/configure-pod-container/security-context/#set-the-security-context-for-a-container
containerSecurityContext: {}
# allowPrivilegeEscalation: false
# capabilities:
# drop: ["ALL"]
# add: []
# readOnlyRootFilesystem: true
# Container Lifecycle Hooks to use for the Velero deployment. Optional.
lifecycle: {}
# Pod priority class name to use for the Velero deployment. Optional.
priorityClassName: ""
# The number of seconds to allow for graceful termination of the pod. Optional.
terminationGracePeriodSeconds: 3600
# Liveness probe of the pod
livenessProbe:
httpGet:
path: /metrics
port: http-monitoring
scheme: HTTP
initialDelaySeconds: 10
periodSeconds: 30
timeoutSeconds: 5
successThreshold: 1
failureThreshold: 5
# Readiness probe of the pod
readinessProbe:
httpGet:
path: /metrics
port: http-monitoring
scheme: HTTP
initialDelaySeconds: 10
periodSeconds: 30
timeoutSeconds: 5
successThreshold: 1
failureThreshold: 5
# Tolerations to use for the Velero deployment. Optional.
tolerations: []
# Affinity to use for the Velero deployment. Optional.
affinity: {}
# Node selector to use for the Velero deployment. Optional.
nodeSelector: {}
# DNS configuration to use for the Velero deployment. Optional.
dnsConfig: {}
# Extra volumes for the Velero deployment. Optional.
extraVolumes:
- name: cloud-config-velero
secret:
secretName: velero-credentials
items:
- key: clouds.yaml
path: clouds.yaml
# Extra volumeMounts for the Velero deployment. Optional.
extraVolumeMounts:
- name: cloud-config-velero
mountPath: /etc/openstack/clouds.yaml
readOnly: true
subPath: clouds.yaml
# Extra K8s manifests to deploy
extraObjects: []
# - apiVersion: secrets-store.csi.x-k8s.io/v1
# kind: SecretProviderClass
# metadata:
# name: velero-secrets-store
# spec:
# provider: aws
# parameters:
# objects: |
# - objectName: "velero"
# objectType: "secretsmanager"
# jmesPath:
# - path: "access_key"
# objectAlias: "access_key"
# - path: "secret_key"
# objectAlias: "secret_key"
# - path: "region"
# objectAlias: "region"
# secretObjects:
# - data:
# - key: access_key
# objectName: client-id
# - key: client-secret
# objectName: client-secret
# - key: region
# objectName: client-secret
# secretName: velero-secrets-store
# type: Opaque
# Settings for Velero's prometheus metrics. Enabled by default.
metrics:
enabled: true
scrapeInterval: 30s
scrapeTimeout: 10s
# service metadata if metrics are enabled
service:
annotations: {}
labels: {}
# Pod annotations for Prometheus
podAnnotations:
prometheus.io/scrape: "true"
prometheus.io/port: "8085"
prometheus.io/path: "/metrics"
serviceMonitor:
autodetect: true
enabled: false
annotations: {}
additionalLabels: {}
# metrics.serviceMonitor.metricRelabelings Specify Metric Relabelings to add to the scrape endpoint
# ref: https://github.com/coreos/prometheus-operator/blob/master/Documentation/api.md#relabelconfig
# metricRelabelings: []
# metrics.serviceMonitor.relabelings [array] Prometheus relabeling rules
# relabelings: []
# ServiceMonitor namespace. Default to Velero namespace.
# namespace:
# ServiceMonitor connection scheme. Defaults to HTTP.
# scheme: ""
# ServiceMonitor connection tlsConfig. Defaults to {}.
# tlsConfig: {}
nodeAgentPodMonitor:
autodetect: true
enabled: false
annotations: {}
additionalLabels: {}
# ServiceMonitor namespace. Default to Velero namespace.
# namespace:
# ServiceMonitor connection scheme. Defaults to HTTP.
# scheme: ""
# ServiceMonitor connection tlsConfig. Defaults to {}.
# tlsConfig: {}
prometheusRule:
autodetect: true
enabled: false
# Additional labels to add to deployed PrometheusRule
additionalLabels: {}
# PrometheusRule namespace. Defaults to Velero namespace.
# namespace: ""
# Rules to be deployed
spec: []
# - alert: VeleroBackupPartialFailures
# annotations:
# message: Velero backup {{ $labels.schedule }} has {{ $value | humanizePercentage }} partially failed backups.
# expr: |-
# velero_backup_partial_failure_total{schedule!=""} / velero_backup_attempt_total{schedule!=""} > 0.25
# for: 15m
# labels:
# severity: warning
# - alert: VeleroBackupFailures
# annotations:
# message: Velero backup {{ $labels.schedule }} has {{ $value | humanizePercentage }} failed backups.
# expr: |-
# velero_backup_failure_total{schedule!=""} / velero_backup_attempt_total{schedule!=""} > 0.25
# for: 15m
# labels:
# severity: warning
kubectl:
image:
repository: docker.io/bitnami/kubectl
# Digest value example: sha256:d238835e151cec91c6a811fe3a89a66d3231d9f64d09e5f3c49552672d271f38.
# If used, it will take precedence over the kubectl.image.tag.
# digest:
# kubectl image tag. If used, it will take precedence over the cluster Kubernetes version.
# tag: 1.16.15
# Container Level Security Context for the 'kubectl' container of the crd jobs. Optional.
# See: https://kubernetes.io/docs/tasks/configure-pod-container/security-context/#set-the-security-context-for-a-container
containerSecurityContext: {}
# Resource requests/limits to specify for the upgrade/cleanup job. Optional
resources: {}
# Annotations to set for the upgrade/cleanup job. Optional.
annotations: {}
# Labels to set for the upgrade/cleanup job. Optional.
labels: {}
# This job upgrades the CRDs.
upgradeCRDs: true
# This job is meant primarily for cleaning up CRDs on CI systems.
# Using this on production systems, especially those that have multiple releases of Velero, will be destructive.
cleanUpCRDs: false
##
## End of deployment-related settings.
##
##
## Parameters for the `default` BackupStorageLocation and VolumeSnapshotLocation,
## and additional server settings.
##
configuration:
# Parameters for the BackupStorageLocation(s). Configure multiple by adding other element(s) to the backupStorageLocation slice.
# See https://velero.io/docs/v1.6/api-types/backupstoragelocation/
backupStorageLocation:
# name is the name of the backup storage location where backups should be stored. If a name is not provided,
# a backup storage location will be created with the name "default". Optional.
# provider is the name for the backup storage location provider.
- name:
provider: community.openstack.org/openstack
# bucket is the name of the bucket to store backups in. Required.
bucket: velero-backups-can
# caCert defines a base64 encoded CA bundle to use when verifying TLS connections to the provider. Optional.
caCert:
# prefix is the directory under which all Velero data should be stored within the bucket. Optional.
prefix:
# default indicates this location is the default backup storage location. Optional.
default: true
# validationFrequency defines how frequently Velero should validate the object storage. Optional.
validationFrequency:
# accessMode determines if velero can write to this backup storage location. Optional.
# default to ReadWrite, ReadOnly is used during migrations and restores.
accessMode: ReadWrite
credential:
# name of the secret used by this backupStorageLocation.
# name: velero-credentials
# name of key that contains the secret data to be used.
# key: clouds
# auth_url: "https://identity.com/v3"
# Additional provider-specific configuration. See link above
# for details of required/optional fields for your provider.
config:
cloud: main
auth_url: "https://identity.com/v3"
region: "eu-nl-1"
# resticRepoPrefix: swift:containers:/restic
# domain-name: "cp"
# tenant-name: "gardener-canary-team"
# s3ForcePathStyle:
# s3Url:
# kmsKeyId:
# resourceGroup:
# The ID of the subscription containing the storage account, if different from the cluster’s subscription. (Azure only)
# subscriptionId:
# storageAccount:
# publicUrl:
# Name of the GCP service account to use for this backup storage location. Specify the
# service account here if you want to use workload identity instead of providing the key file.(GCP only)
# serviceAccount:
# Option to skip certificate validation or not if insecureSkipTLSVerify is set to be true, the client side should set the
# flag. For Velero client Command like velero backup describe, velero backup logs needs to add the flag --insecure-skip-tls-verify
# insecureSkipTLSVerify:
# Parameters for the VolumeSnapshotLocation(s). Configure multiple by adding other element(s) to the volumeSnapshotLocation slice.
# See https://velero.io/docs/v1.6/api-types/volumesnapshotlocation/
volumeSnapshotLocation:
# name is the name of the volume snapshot location where snapshots are being taken. Required.
- name: cinder
# optional snapshot method:
# * "snapshot" is a default cinder snapshot method
# * "clone" is for a full volume clone instead of a snapshot allowing the
# source volume to be deleted
# * "backup" is for a full volume backup uploaded to a Cinder backup
# allowing the source volume to be deleted (EXPERIMENTAL)
# * "image" is for a full volume backup uploaded to a Glance image
# allowing the source volume to be deleted (EXPERIMENTAL)
# requires the "enable_force_upload" Cinder option to be enabled on the server
method: snapshot
volumeTimeout: 5m
snapshotTimeout: 5m
cloneTimeout: 5m
backupTimeout: 5m
imageTimeout: 5m
# provider is the name for the volume snapshot provider.
provider: community.openstack.org/openstack-cinder
credential:
# name of the secret used by this volumeSnapshotLocation.
name: velero-credentials2
# name of key that contains the secret data to be used.
key: clouds
# Additional provider-specific configuration. See link above
# for details of required/optional fields for your provider.
config:
auth_url: "https://identity.com/v3"
region: eu-nl-1
domain-name: "cp"
tenant-name: "gardener-canary-team"
# apiTimeout:
# resourceGroup:
# The ID of the subscription where volume snapshots should be stored, if different from the cluster’s subscription. If specified, also requires `configuration.volumeSnapshotLocation.config.resourceGroup`to be set. (Azure only)
# subscriptionId:
# incremental:
# snapshotLocation:
# project:
- name: manila
provider: community.openstack.org/openstack-manila
credential:
# name of the secret used by this volumeSnapshotLocation.
name: velero-credentials2
# name of key that contains the secret data to be used.
key: clouds
# Additional provider-specific configuration. See link above
# for details of required/optional fields for your provider.
config:
# optional snapshot method:
# * "snapshot" is a default manila snapshot method
# * "clone" is for a full share clone instead of a snapshot allowing the
# source share to be deleted
method: snapshot
region: eu-nl-1
auth_url: "https://identity.com/v3"
domain-name: "cp"
tenant-name: "gardener-canary-team"
# optional Manila CSI driver name (default: nfs.manila.csi.openstack.org)
driver: ceph.manila.csi.openstack.org
# optional resource readiness timeouts in Golang time format: https://pkg.go.dev/time#ParseDuration
# (default: 5m)
shareTimeout: 5m
snapshotTimeout: 5m
cloneTimeout: 5m
replicaTimeout: 5m
# ensures that the Manila share/snapshot/replica is removed
# this is a workaround to the https://bugs.launchpad.net/manila/+bug/2025641 and
# https://bugs.launchpad.net/manila/+bug/1960239 bugs
# if the share/snapshot/replica is in "error_deleting" status, the plugin will try
# to reset its status (usually extra admin permissions are required) and delete it
# again within the defined "cloneTimeout", "snapshotTimeout" or "replicaTimeout"
ensureDeleted: "true"
# a delay to wait between delete/reset actions when "ensureDeleted" is enabled
ensureDeletedDelay: 10s
# deletes all dependent share resources (i.e. snapshots, replicas) before deleting
# the clone share (works only, when a snapshot method is set to clone)
cascadeDelete: "true"
# enforces availability zone checks when the availability zone of a
# snapshot/share differs from the Velero metadata
enforceAZ: "true"
# These are server-level settings passed as CLI flags to the `velero server` command. Velero
# uses default values if they're not passed in, so they only need to be explicitly specified
# here if using a non-default value. The `velero server` default values are shown in the
# comments below.
# --------------------
# `velero server` default: restic
uploaderType:
# `velero server` default: 1m
backupSyncPeriod:
# `velero server` default: 4h
fsBackupTimeout:
# `velero server` default: 30
clientBurst:
# `velero server` default: 500
clientPageSize:
# `velero server` default: 20.0
clientQPS:
# Name of the default backup storage location. Default: default
defaultBackupStorageLocation:
# How long to wait by default before backups can be garbage collected. Default: 72h
defaultBackupTTL:
# Name of the default volume snapshot location.
defaultVolumeSnapshotLocations:
# `velero server` default: empty
disableControllers:
# `velero server` default: 1h
garbageCollectionFrequency:
# Set log-format for Velero pod. Default: text. Other option: json.
logFormat:
# Set log-level for Velero pod. Default: info. Other options: debug, warning, error, fatal, panic.
logLevel:
# The address to expose prometheus metrics. Default: :8085
metricsAddress:
# Directory containing Velero plugins. Default: /plugins
pluginDir:
# The address to expose the pprof profiler. Default: localhost:6060
profilerAddress:
# `velero server` default: false
restoreOnlyMode:
# `velero server` default: customresourcedefinitions,namespaces,storageclasses,volumesnapshotclass.snapshot.storage.k8s.io,volumesnapshotcontents.snapshot.storage.k8s.io,volumesnapshots.snapshot.storage.k8s.io,persistentvolumes,persistentvolumeclaims,secrets,configmaps,serviceaccounts,limitranges,pods,replicasets.apps,clusterclasses.cluster.x-k8s.io,clusters.cluster.x-k8s.io,clusterresourcesets.addons.cluster.x-k8s.io
restoreResourcePriorities:
# `velero server` default: 1m
storeValidationFrequency:
# How long to wait on persistent volumes and namespaces to terminate during a restore before timing out. Default: 10m
terminatingResourceTimeout:
# Bool flag to configure Velero server to move data by default for all snapshots supporting data movement. Default: false
defaultSnapshotMoveData:
# Comma separated list of velero feature flags. default: empty
# features: EnableCSI
features: EnableCSI
# `velero server` default: velero
namespace:
# additional key/value pairs to be used as environment variables such as "AWS_CLUSTER_NAME: 'yourcluster.domain.tld'"
extraEnvVars: {}
# Set true for backup all pod volumes without having to apply annotation on the pod when used file system backup Default: false.
defaultVolumesToFsBackup:
# How often repository maintain is run for repositories by default.
defaultRepoMaintainFrequency:
##
## End of backup/snapshot location settings.
##
##
## Settings for additional Velero resources.
##
rbac:
# Whether to create the Velero role and role binding to give all permissions to the namespace to Velero.
create: true
# Whether to create the cluster role binding to give administrator permissions to Velero
clusterAdministrator: true
# Name of the ClusterRole.
clusterAdministratorName: cluster-admin
# Information about the Kubernetes service account Velero uses.
serviceAccount:
server:
create: true
name:
annotations:
labels:
# Info about the secret to be used by the Velero deployment, which
# should contain credentials for the cloud provider IAM account you've
# set up for Velero.
credentials:
# Whether a secret should be used. Set to false if, for examples:
# - using kube2iam or kiam to provide AWS IAM credentials instead of providing the key file. (AWS only)
# - using workload identity instead of providing the key file. (Azure/GCP only)
useSecret: true
# Name of the secret to create if `useSecret` is true and `existingSecret` is empty
name:
# Name of a pre-existing secret (if any) in the Velero namespace
# that should be used to get IAM account credentials. Optional.
existingSecret: velero-credentials
# Data to be stored in the Velero secret, if `useSecret` is true and `existingSecret` is empty.
# As of the current Velero release, Velero only uses one secret key/value at a time.
# The key must be named `cloud`, and the value corresponds to the entire content of your IAM credentials file.
# Note that the format will be different for different providers, please check their documentation.
# Here is a list of documentation for plugins maintained by the Velero team:
# [AWS] https://github.com/vmware-tanzu/velero-plugin-for-aws/blob/main/README.md
# [GCP] https://github.com/vmware-tanzu/velero-plugin-for-gcp/blob/main/README.md
# [Azure] https://github.com/vmware-tanzu/velero-plugin-for-microsoft-azure/blob/main/README.md
secretContents:
# additional key/value pairs to be used as environment variables such as "DIGITALOCEAN_TOKEN: <your-key>". Values will be stored in the secret.
extraEnvVars: {}
# Name of a pre-existing secret (if any) in the Velero namespace
# that will be used to load environment variables into velero and node-agent.
# Secret should be in format - https://kubernetes.io/docs/concepts/configuration/secret/#use-case-as-container-environment-variables
extraSecretRef: ""
# Whether to create backupstoragelocation crd, if false => do not create a default backup location
backupsEnabled: true
# Whether to create volumesnapshotlocation crd, if false => disable snapshot feature
snapshotsEnabled: true
# Whether to deploy the node-agent daemonset.
deployNodeAgent: true
nodeAgent:
podVolumePath: /var/lib/kubelet/pods
privileged: false
# Pod priority class name to use for the node-agent daemonset. Optional.
priorityClassName: ""
# Resource requests/limits to specify for the node-agent daemonset deployment. Optional.
# https://velero.io/docs/v1.6/customize-installation/#customize-resource-requests-and-limits
resources:
requests:
cpu: 500m
memory: 512Mi
limits:
cpu: 1000m
memory: 1024Mi
# Tolerations to use for the node-agent daemonset. Optional.
tolerations: []
# Annotations to set for the node-agent daemonset. Optional.
annotations: {}
# labels to set for the node-agent daemonset. Optional.
labels: {}
# will map /scratch to emptyDir. Set to false and specify your own volume
# via extraVolumes and extraVolumeMounts that maps to /scratch
# if you don't want to use emptyDir.
useScratchEmptyDir: true
# Extra volumes for the node-agent daemonset. Optional.
extraVolumes: []
# Extra volumeMounts for the node-agent daemonset. Optional.
extraVolumeMounts: []
# Key/value pairs to be used as environment variables for the node-agent daemonset. Optional.
extraEnvVars: {}
# Configure the dnsPolicy of the node-agent daemonset
# See: https://kubernetes.io/docs/concepts/services-networking/dns-pod-service/#pod-s-dns-policy
dnsPolicy: ClusterFirst
# SecurityContext to use for the Velero deployment. Optional.
# Set fsGroup for `AWS IAM Roles for Service Accounts`
# see more information at: https://docs.aws.amazon.com/eks/latest/userguide/iam-roles-for-service-accounts.html
podSecurityContext:
runAsUser: 0
# fsGroup: 1337
# Container Level Security Context for the 'node-agent' container of the node-agent daemonset. Optional.
# See: https://kubernetes.io/docs/tasks/configure-pod-container/security-context/#set-the-security-context-for-a-container
containerSecurityContext: {}
# Container Lifecycle Hooks to use for the node-agent daemonset. Optional.
lifecycle: {}
# Node selector to use for the node-agent daemonset. Optional.
nodeSelector: {}
# Affinity to use with node-agent daemonset. Optional.
affinity: {}
# DNS configuration to use for the node-agent daemonset. Optional.
dnsConfig: {}
# Backup schedules to create.
# Eg:
# schedules:
# mybackup:
# disabled: false
# labels:
# myenv: foo
# annotations:
# myenv: foo
# schedule: "0 0 * * *"
# useOwnerReferencesInBackup: false
# template:
# ttl: "240h"
# storageLocation: default
# includedNamespaces:
# - foo
schedules: {}
# Velero ConfigMaps.
# Eg:
# configMaps:
# See: https://velero.io/docs/v1.11/file-system-backup/
# fs-restore-action-config:
# labels:
# velero.io/plugin-config: ""
# velero.io/pod-volume-restore: RestoreItemAction
# data:
# image: velero/velero-restore-helper:v1.10.2
# cpuRequest: 200m
# memRequest: 128Mi
# cpuLimit: 200m
# memLimit: 128Mi
# secCtx: |
# capabilities:
# drop:
# - ALL
# add: []
# allowPrivilegeEscalation: false
# readOnlyRootFilesystem: true
# runAsUser: 1001
# runAsGroup: 999
configMaps: {}
##
## End of additional Velero resource settings.
##
Install command:
helm install velero vmware-tanzu/velero -f values.yaml
Extra information:
Hello,
The config looks correct. You first need to apply the Temp URL configuration. Please get back to me after you've done that, and we'll check whether the issue is solved.
Second, the Backup or Schedule has to reference storageLocation: default (or the name of your BackupStorageLocation).
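For example, a one-off backup can point at the location explicitly ("default" is the name the chart uses when none is given):

```shell
velero backup create test5 \
  --include-namespaces default \
  --storage-location default
```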
You might need to update your swift command: this one returns an error on macOS:
SWIFT_TMP_URL_KEY=$(dd if=/dev/urandom | tr -dc A-Za-z0-9 | head -c 40)
tr: Illegal byte sequence
An alternative that I found was:
SWIFT_TMP_URL_KEY=$(dd if=/dev/urandom | LC_ALL=C tr -dc A-Za-z0-9 | head -c 40)
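Putting the macOS-safe variant together with the registration step from the plugin's Swift setup docs (run with your OpenStack credentials sourced; swift post -m sets account metadata):

```shell
# Generate a random 40-character Temp URL key. LC_ALL=C keeps tr from
# failing on non-UTF-8 bytes from /dev/urandom on macOS.
SWIFT_TMP_URL_KEY=$(dd if=/dev/urandom 2>/dev/null | LC_ALL=C tr -dc 'A-Za-z0-9' | head -c 40)
echo "key length: ${#SWIFT_TMP_URL_KEY}"

# Register the key on the Swift account so temporary URLs can be signed
# (requires sourced OpenStack credentials):
# swift post -m "Temp-URL-Key:${SWIFT_TMP_URL_KEY}"
```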
I ran the commands for the Swift container setup and it went well. I set up the BackupStorageLocation like you mentioned, but now it seems like the error I'm getting comes from backuprepositories.velero.io.
Logs from the Velero pod:
time="2024-02-19T15:18:45Z" level=info msg="pod default/my-pod has volumes to backup: [my-volume]" backup=velero/test223 logSource="pkg/podvolume/backupper.go:172" name=my-pod namespace=default resource=pods
time="2024-02-19T15:18:45Z" level=info msg="No repository found, creating one" backupLocation=default logSource="pkg/repository/ensurer.go:89" repositoryType=restic volumeNamespace=default
time="2024-02-19T15:18:45Z" level=info msg="Initializing backup repository" backupRepo=velero/default-default-restic-wlv7b logSource="pkg/controller/backup_repository_controller.go:216"
time="2024-02-19T15:18:45Z" level=info msg="Set matainenance according to repository suggestion" frequency=168h0m0s logSource="pkg/controller/backup_repository_controller.go:263"
time="2024-02-19T15:18:45Z" level=error msg="Restic command fail with ExitCode: 1. Process ID is 297, Exit error is: exit status 1" logSource="pkg/util/exec/exec.go:66"
time="2024-02-19T15:18:45Z" level=error msg="Error checking repository for stale locks" backupRepo=velero/default-default-restic-wlv7b error="error running command=restic unlock --repo= --password-file=/tmp/credentials/velero/velero-repo-credentials-repository-password --cache-dir=/scratch/.cache/restic, stdout=, stderr=Fatal: Please specify repository location (-r or --repository-file)\n: exit status 1" error.file="/go/src/github.com/vmware-tanzu/velero/pkg/repository/restic/repository.go:123" error.function="github.com/vmware-tanzu/velero/pkg/repository/restic.(*RepositoryService).exec" logSource="pkg/controller/backup_repository_controller.go:184"
time="2024-02-19T15:18:45Z" level=info msg="Checking backup repository for readiness" backupRepo=velero/default-default-restic-wlv7b logSource="pkg/controller/backup_repository_controller.go:307"
time="2024-02-19T15:18:46Z" level=info msg="1 errors encountered backup up item" backup=velero/test223 logSource="pkg/backup/backup.go:457" name=my-pod
time="2024-02-19T15:18:46Z" level=error msg="Error backing up item" backup=velero/test223 error="failed to wait BackupRepository: backup repository is not ready: error to get identifier for repo default-default-restic-wlv7b: invalid backend type community.openstack.org/openstack, provider community.openstack.org/openstack" error.file="/go/src/github.com/vmware-tanzu/velero/pkg/repository/backup_repo_op.go:83" error.function=github.com/vmware-tanzu/velero/pkg/repository.GetBackupRepository logSource="pkg/backup/backup.go:461" name=my-pod
Thank you again for your help!
Regarding this error, it looks like the Velero code only allows certain providers to do file-system backup. It's configured here in the code. For this I have to redirect you to the Velero FSB documentation, as restic and other file-system-backup providers are out of scope for this plugin.
time="2024-02-19T15:18:46Z"
level=error
msg="Error backing up item" backup=velero/test223
error="failed to wait BackupRepository: backup repository is not ready:
error to get identifier for repo default-default-restic-wlv7b:
invalid backend type community.openstack.org/openstack,
provider community.openstack.org/openstack"
error.file="/go/src/github.com/vmware-tanzu/velero/pkg/repository/backup_repo_op.go:83"
error.function=github.com/vmware-tanzu/velero/pkg/repository.GetBackupRepository
logSource="pkg/backup/backup.go:461" name=my-pod
Besides that, there is a restic error that you need to solve, but I'm not sure whether solving it will fix the error above.
time="2024-02-19T15:18:45Z"
level=error
msg="Error checking repository for stale locks"
backupRepo=velero/default-default-restic-wlv7b
error="error running
command=restic unlock \
--repo= \
--password-file=/tmp/credentials/velero/velero-repo-credentials-repository-password \
--cache-dir=/scratch/.cache/restic,
stdout=,
stderr=Fatal: Please specify repository location (-r or --repository-file)\n:
exit status 1"
error.file="/go/src/github.com/vmware-tanzu/velero/pkg/repository/restic/repository.go:123"
error.function="github.com/vmware-tanzu/velero/pkg/repository/restic.(*RepositoryService).exec" logSource="pkg/controller/backup_repository_controller.go:184"
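Note that `--repo=` is empty in that command — Velero could not build a repository identifier for this provider, which matches the "invalid backend type" error above. One option (an assumption on my side, not something this plugin controls) is to delete the stale `BackupRepository` object left over from the earlier restic attempt so the controller stops retrying it:

```shell
# Remove the stale BackupRepository (name taken from the log above;
# the velero namespace is an assumption)
kubectl -n velero delete backuprepository default-default-restic-wlv7b
```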
If you don't want to do file-system-backup, you can disable restic and snapshots will work fine.
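With the Helm chart, disabling restic is roughly the following values fragment — a sketch from memory, and the exact key names depend on your chart version (`deployNodeAgent` in recent charts, `deployRestic` in older ones):

```yaml
# values.yaml fragment (key names are chart-version dependent)
deployNodeAgent: false   # newer chart versions
# deployRestic: false    # older chart versions
snapshotsEnabled: true   # keep snapshot-based backups enabled
```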
looks like Velero code only allows certain providers to do filesystem backup
Nice catch.
Unfortunately, even when I disable restic
and run a normal `velero backup create test3 --include-namespaces=default`,
I'm still getting the issue I previously mentioned: the volume is in-use.
time="2024-02-19T17:22:39Z" level=info msg="Waiting for volumesnapshotcontents snapcontent-c9c3c980-d38b-46c5-91ea-be3976aa035b to have snapshot handle. Retrying in 5s" backup=velero/test3 cmd=/plugins/velero-plugin-for-csi logSource="/go/src/velero-plugin-for-csi/internal/util/util.go:259" pluginName=velero-plugin-for-csi
time="2024-02-19T17:22:39Z" level=warning msg="Volumesnapshotcontent snapcontent-c9c3c980-d38b-46c5-91ea-be3976aa035b has error: Failed to check and update snapshot content: failed to take snapshot of the volume c91638ec-6362-4e15-b3cd-e2f957e3e6b5: \"rpc error: code = Internal desc = CreateSnapshot failed with error Bad request with: [POST https://volume-3.eu-nl-1.cloud.sap:443/v3/ddcde3b2cae24ea0a85ad5b608b7ac97/snapshots], error message: {\\\"badRequest\\\": {\\\"code\\\": 400, \\\"message\\\": \\\"Invalid volume: Volume c91638ec-6362-4e15-b3cd-e2f957e3e6b5 status must be available, but current status is: in-use.\\\"}}\"" backup=velero/test3 cmd=/plugins/velero-plugin-for-csi logSource="/go/src/velero-plugin-for-csi/internal/util/util.go:261" pluginName=velero-plugin-for-csi
I previously had success backing up everything to AWS S3, so I think returning to that option would cover our backup requirements.
Thank you for your assistance. If no other solution addresses the ongoing "in-use" issue, I suggest closing the ticket.
:smile: This is still not a log from this plugin - you can see that in the log field `pluginName=velero-plugin-for-csi`. Please read the error logs carefully, or you will be deceiving yourself!
In a simple standard setup you should have 1 BSL and 1 VSL.
```yaml
---
apiVersion: velero.io/v1
kind: BackupStorageLocation
metadata:
  name: my-awesome-bsl
  namespace: <NAMESPACE>
spec:
  objectStorage:
    bucket: <CONTAINER_NAME>
  provider: community.openstack.org/openstack
---
apiVersion: velero.io/v1
kind: VolumeSnapshotLocation
metadata:
  name: my-awesome-vsl
  namespace: <NAMESPACE>
spec:
  provider: community.openstack.org/openstack-cinder
  config:
    volumeTimeout: 5m
    snapshotTimeout: 5m
    cloneTimeout: 5m
    backupTimeout: 5m
    imageTimeout: 5m
    ensureDeleted: "true"
    ensureDeletedDelay: 10s
    cascadeDelete: "true"
```
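After applying the manifests, you can confirm that both locations exist and report `Available` before running a backup (assuming the `velero` namespace):

```shell
# Both should show phase/status Available
velero backup-location get -n velero
velero snapshot-location get -n velero
```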
Then run velero backup with an explicitly defined storage location and volume snapshot location. This will ensure that you don't back up using a different configuration (or provider):

```shell
velero backup create \
  --namespace velero \
  --include-namespaces default \
  --snapshot-volumes true \
  --storage-location my-awesome-bsl \
  --volume-snapshot-locations my-awesome-vsl \
  --wait
```
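Once the backup completes, you can check which snapshot location was actually used and why any items failed (`<BACKUP_NAME>` below is a placeholder):

```shell
# Per-item results, including volume snapshot status
velero backup describe <BACKUP_NAME> --details -n velero

# Full server-side log for this backup
velero backup logs <BACKUP_NAME> -n velero
```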
Can you try it this way?
(This example might miss something, I was writing it from my memory)
Ok great, I was able to create a few successful backups by:

1. Running the backup with explicitly defined locations:

```shell
velero backup create test2345678 --include-namespaces=default --snapshot-volumes=true --storage-location=default --volume-snapshot-locations=manila --wait
velero backup create test2345678 --include-namespaces=default --snapshot-volumes=true --storage-location=default --volume-snapshot-locations=cinder --wait
```
2. Disabling the initContainers `velero-plugin-for-csi` and only keep the `velero-plugin-for-openstack`
```yaml
initContainers:
  # - name: velero-plugin-for-csi
  #   image: velero/velero-plugin-for-csi:v0.7.0
  #   imagePullPolicy: IfNotPresent
  #   volumeMounts:
  #     - mountPath: /target
  #       name: plugins
  - name: velero-plugin-for-openstack
    image: lirt/velero-plugin-for-openstack:v0.6.0
    imagePullPolicy: IfNotPresent
    volumeMounts:
      - mountPath: /target
        name: plugins
```
3. Keeping `configuration.features` (it will not work without it. On my side, it was throwing an authentication error.)
Thanks a lot for your help! Hopefully it will help other people as well.
Great to hear that.
Describe the bug
I'm deploying Velero using the Helm chart, and I'm not able to successfully create a backup. The backups always end in the PartiallyFailed state.

Steps to reproduce the behavior

Expected behavior
A successful backup is created and uploaded to OpenStack Shared Object Storage.
Used versions
- Velero server version (`velero version`): velero/velero:v1.13.0
- Velero plugins (`kubectl describe pod velero-...`): lirt/velero-plugin-for-openstack:v0.6.0, velero/velero-plugin-for-csi:v0.7.0
- Kubernetes version (`kubectl version`): 1.27.8

Link to velero or backup log
My guess is that it's related to that step, but I'm not able to exec into the container to run those commands.
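If `kubectl exec` fails because the image has no shell (the Velero image is distroless), an ephemeral debug container is one way in — a sketch, assuming a recent Kubernetes version with `kubectl debug` support, where `<VELERO_POD_NAME>` is a placeholder:

```shell
# Attach a busybox ephemeral container that shares the velero
# container's process namespace, so its filesystem is reachable
kubectl -n velero debug -it <VELERO_POD_NAME> --image=busybox --target=velero
```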