Closed: @archmangler closed this issue 4 years ago.
@archmangler based on the following log line:
time="2020-03-06T05:00:31Z" level=info msg="Skipping persistent volume snapshot because volume has already been backed up with restic." backup=velero/backup-schedule-frequent-snapshot-test-20200306050028 group=v1 logSource="pkg/backup/item_backupper.go:393" name=pvc-75f44b4c-5f64-11ea-9866-929bf458f004 namespace=default persistentVolume=pvc-75f44b4c-5f64-11ea-9866-929bf458f004 resource=pods
It doesn't sound like you're taking managed snapshots; you're using restic, in which case the VolumeSnapshotLocation is irrelevant. Please clarify.
Hi @skriss - You are correct, I had installed with --restic, and had the following annotation:
backup.velero.io/backup-volumes: <volumename>
However, even in this case, how can I tell restic to use a designated resource group as the location to store its snapshots?
Next, I have now reinstalled without the restic plugin, and my intention is to have "Velero Managed Disk Snapshots" without restic. Now I am not getting any snapshots, restic or otherwise:
/usr/local/bin/velero install --provider azure \
  --bucket "lolcorpaz1aksbkp" \
  --secret-file "velero-credentials" \
  --image "velero/velero:v1.1.0" \
  --backup-location-config resourceGroup="rsg-lolcorp-uat-az1-aksbkp",storageAccount="stalolcorpuataz1aksbkp" \
  --snapshot-location-config apiTimeout="1m",resourceGroup="rsg-lolcorp-uat-az1-aksbkp" \
  --velero-pod-cpu-limit "0" \
  --velero-pod-cpu-request "0" \
  --velero-pod-mem-limit "0" \
  --velero-pod-mem-request "0" \
  --wait
apiVersion: v1
kind: Pod
metadata:
  name: mdvelerotest3
  namespace: default
spec:
  containers:
  - args:
    - "10000"
    command:
    - sleep
    image: velero/velero:v1.1.0
    imagePullPolicy: IfNotPresent
    name: testmd2
    volumeMounts:
    - mountPath: "/mnt/"
      name: mdstorage3
  volumes:
  - name: mdstorage3
    persistentVolumeClaim:
      claimName: mdsnapshotest3
  imagePullSecrets:
  - name: docker-release.lolcorp
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: mdsnapshotest3
  namespace: default
spec:
  accessModes:
  - ReadWriteOnce
  storageClassName: "lolcorp-aks-managed-disk"
  resources:
    requests:
      storage: 100G
NOTE: I'm running pods with both Azure Files and Managed Disk PVs and need snapshots for both. My understanding is that I can only get Azure Files snapshots via restic, but can get managed disk snapshots without it.
However, this raises new questions:
a) If I install the restic plugin for Azure Files snapshots, how do I exclude managed disks from restic snapshotting but include them in Velero's normal snapshotting?
b) How do I tell restic to use the storage account I specify?
c) The managed disk snapshots are not happening without restic installed; how do I enable these (without restic)?
Ideally, I'd like a configuration that allows:
a) Azure Files snapshots (with restic, as I understand this is the only way to get them)
b) Managed disk snapshots (with any other Velero mechanism)
c) All snapshots going to a resource group I specify, not the MC_ AKS resource group.
My understanding is: I can only get snapshots for azure files with the restic plugin (hence --restic option at install time), but I can get snapshots for managed disk without installing restic
That is correct.
a) If I install the restic plugin for Azure Files snapshots, how do I exclude managed disks from restic snapshotting but include them in Velero's normal snapshotting?
You will get your desired behavior by default, assuming Velero is configured correctly. Specifically, as long as you don't add the restic annotation (backup.velero.io/backup-volumes) to the pod, the PV will be snapshotted by default (again, assuming things are configured correctly).
b) How do I tell restic to use the storage account I specify?
All restic data is stored in the same bucket/blob container as the rest of the main Velero backup data/metadata. That can be in whatever storage account you want. You specify it via the config.storageAccount field on the BackupStorageLocation -- see https://github.com/vmware-tanzu/velero-plugin-for-microsoft-azure/blob/master/backupstoragelocation.md.
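As a sketch (reusing the bucket, storage account, and resource group names from the install command earlier in this thread; field layout per the Azure plugin's BackupStorageLocation schema), the resulting resource looks roughly like:

```yaml
apiVersion: velero.io/v1
kind: BackupStorageLocation
metadata:
  name: default
  namespace: velero
spec:
  provider: azure
  objectStorage:
    # "bucket" maps to an Azure blob container
    bucket: lolcorpaz1aksbkp
  config:
    # storage account holding the container -- restic data lands here too
    storageAccount: stalolcorpuataz1aksbkp
    resourceGroup: rsg-lolcorp-uat-az1-aksbkp
```

Restic repositories then live alongside the backup metadata in that same container, under a restic/ prefix.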
c) The managed disk snapshots are not happening without restic installed, how to enable these (without restic)?
Something is likely misconfigured. Can you provide (preferably in a gist):
- velero snapshot-location get -o yaml
- velero plugin get
- velero backup get <BACKUP-NAME> -o yaml for the backup where you expected a snapshot, but did not get it
- velero backup logs <BACKUP-NAME> for the same backup
- velero backup describe <BACKUP-NAME> --details for the same backup
That should help us start debugging. Thanks!
Hi @skriss - I've pasted the information you requested here:
https://gist.github.com/archmangler/31397f8f56728d1880ad9ad526010d84
@archmangler thanks for the info.
I see that your Azure managed disk PV is named pvc-0f862e2f-53bd-4500-b563-23548c935fd5.
In your logs, I see:
time="2020-03-09T02:51:41Z" level=info msg="Skipping persistent volume snapshot because volume has already been backed up with restic." backup=velero/test-backup group=v1 logSource="pkg/backup/item_backupper.go:393" name=pvc-0f862e2f-53bd-4500-b563-23548c935fd5 namespace=backuptests persistentVolume=pvc-0f862e2f-53bd-4500-b563-23548c935fd5 resource=pods
This tells me that in the pod that uses this PV, you still have an annotation indicating that this volume should be backed up with restic, i.e. backup.velero.io/backup-volumes: <VOLUME-NAME>. You need to remove the managed disk volume's name from this annotation in order for it to be snapshotted natively.
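For instance, a hedged sketch using the pod and namespace names that appear later in this thread (the trailing '-' in kubectl annotate removes the key):

```shell
# Show which volumes are currently marked for restic backup:
kubectl -n backuptests get pod mdvelerotest \
  -o jsonpath="{.metadata.annotations.backup\.velero\.io/backup-volumes}"

# If the managed-disk volume is the only entry, drop the annotation entirely:
kubectl -n backuptests annotate pod mdvelerotest backup.velero.io/backup-volumes-
```

If the annotation lists several volumes, re-apply it with only the ones restic should handle.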
Separately, I see that the Azure File volume is also annotated to be backed up with restic, but you don't have the restic daemonset installed, so it's never getting processed. You need to either remove the annotation (on the pod that uses the volume, backup.velero.io/backup-volumes: <VOLUME-NAME>) so Velero doesn't attempt to back up this volume with restic, OR install the restic daemonset (--use-restic flag to the velero install command).
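For completeness, enabling the daemonset at install time would just mean appending that flag to the install command quoted earlier in this thread, roughly:

```shell
velero install --provider azure \
  --bucket lolcorpaz1aksbkp \
  --secret-file velero-credentials \
  --backup-location-config resourceGroup=rsg-lolcorp-uat-az1-aksbkp,storageAccount=stalolcorpuataz1aksbkp \
  --snapshot-location-config apiTimeout=1m,resourceGroup=rsg-lolcorp-uat-az1-aksbkp \
  --use-restic \
  --wait
```

All flags other than --use-restic are taken verbatim from the command above.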
@archmangler were you able to resolve this?
Hi @skriss - I've been rebuilding my test setup to reproduce this issue on a clean cluster. I will paste the new results tomorrow.
Hi @skriss - I've pasted debug information here from a fresh cluster:
https://github.com/vmware-tanzu/velero/issues/2328
Notes:
17:42:18 backups/backup-schedule-daily7d168h-20200313191036/backup-schedule-daily7d168h-20200313191036-volumesnapshots.json.gz BlockBlob 29 application/octet-stream 2020-03-13T19:10:41+00:00
17:42:18 backups/backup-schedule-daily7d168h-20200318174118/backup-schedule-daily7d168h-20200318174118-volumesnapshots.json.gz BlockBlob 29 application/octet-stream 2020-03-18T17:41:24+00:00
17:42:18 backups/backup-schedule-daily7d168h-20200319010008/backup-schedule-daily7d168h-20200319010008-volumesnapshots.json.gz BlockBlob 29 application/octet-stream 2020-03-19T07:10:14+00:00
17:42:18 backups/backup-schedule-daily7d168h-20200319092353/backup-schedule-daily7d168h-20200319092353-volumesnapshots.json.gz BlockBlob 29 application/octet-stream 2020-03-19T09:23:58+00:00
17:42:18 backups/backup-schedule-sunday7d168h-20200318174121/backup-schedule-sunday7d168h-20200318174121-volumesnapshots.json.gz BlockBlob 29 application/octet-stream 2020-03-18T17:41:29+00:00
17:42:18 backups/backup-schedule-sunday7d168h-20200319092355/backup-schedule-sunday7d168h-20200319092355-volumesnapshots.json.gz BlockBlob 29 application/octet-stream 2020-03-19T09:24:04+00:00
17:42:18 backups/backup-schedule-wednesday7d168h-20200318174121/backup-schedule-wednesday7d168h-20200318174121-volumesnapshots.json.gz BlockBlob 29 application/octet-stream 2020-03-18T17:41:35+00:00
17:42:18 backups/backup-schedule-wednesday7d168h-20200319092355/backup-schedule-wednesday7d168h-20200319092355-volumesnapshots.json.gz BlockBlob 29 application/octet-stream 2020-03-19T09:24:09+00:00
/velero install --provider azure --bucket plnszcaz1aksbkp --secret-file velero-credentials --image docker-release.lolcorp.lolcorp.com:8443/velero/velero:v1.1.0 --backup-location-config resourceGroup=rsg-lolcorp-dev-az1-aksbkp,storageAccount=stalolcorpdevaz1aksbkp --snapshot-location-config apiTimeout=1m,resourceGroup=rsg-lolcorp-dev-az1-aksbkp --velero-pod-cpu-limit 0 --velero-pod-cpu-request 0 --velero-pod-mem-limit 0 --velero-pod-mem-request 0 --wait
time="2020-03-19T10:37:21Z" level=info msg="Adding pvc azurefilesnapshotest2 to additionalItems" backup=velero/backuptests-lite cmd=/velero logSource="pkg/backup/pod_action.go:67" pluginName=velero
time="2020-03-19T10:37:21Z" level=info msg="Backing up item" backup=velero/backuptests-lite group=v1 logSource="pkg/backup/item_backupper.go:162" name=azurefilesnapshotest2 namespace=backuptests resource=pods
time="2020-03-19T10:37:21Z" level=info msg="Executing custom action" backup=velero/backuptests-lite group=v1 logSource="pkg/backup/item_backupper.go:310" name=azurefilesnapshotest2 namespace=backuptests resource=pods
time="2020-03-19T10:37:21Z" level=info msg="Executing takePVSnapshot" backup=velero/backuptests-lite group=v1 logSource="pkg/backup/item_backupper.go:375" name=pvc-d1c1a9d2-69cb-11ea-94a1-5ebe98412ed4 namespace=backuptests resource=pods
time="2020-03-19T10:37:21Z" level=info msg="Skipping persistent volume snapshot because volume has already been backed up with restic." backup=velero/backuptests-lite group=v1 logSource="pkg/backup/item_backupper.go:393" name=pvc-d1c1a9d2-69cb-11ea-94a1-5ebe98412ed4 namespace=backuptests persistentVolume=pvc-d1c1a9d2-69cb-11ea-94a1-5ebe98412ed4 resource=pods
time="2020-03-19T11:37:21Z" level=info msg="Adding pvc mdsnapshotest to additionalItems" backup=velero/backuptests-lite cmd=/velero logSource="pkg/backup/pod_action.go:67" pluginName=velero
time="2020-03-19T11:37:22Z" level=info msg="Backing up item" backup=velero/backuptests-lite group=v1 logSource="pkg/backup/item_backupper.go:162" name=mdsnapshotest namespace=backuptests resource=pods
time="2020-03-19T11:37:22Z" level=info msg="Executing custom action" backup=velero/backuptests-lite group=v1 logSource="pkg/backup/item_backupper.go:310" name=mdsnapshotest namespace=backuptests resource=pods
time="2020-03-19T11:37:22Z" level=info msg="Executing takePVSnapshot" backup=velero/backuptests-lite group=v1 logSource="pkg/backup/item_backupper.go:375" name=pvc-c7d25291-69cb-11ea-94a1-5ebe98412ed4 namespace=backuptests resource=pods
time="2020-03-19T11:37:22Z" level=info msg="Skipping persistent volume snapshot because volume has already been backed up with restic." backup=velero/backuptests-lite group=v1 logSource="pkg/backup/item_backupper.go:393" name=pvc-c7d25291-69cb-11ea-94a1-5ebe98412ed4 namespace=backuptests persistentVolume=pvc-c7d25291-69cb-11ea-94a1-5ebe98412ed4 resource=pods
time="2020-03-19T11:37:22Z" level=info msg="Adding pvc mdsnapshotest2 to additionalItems" backup=velero/backuptests-lite cmd=/velero logSource="pkg/backup/pod_action.go:67" pluginName=velero
time="2020-03-19T11:37:22Z" level=info msg="Backing up item" backup=velero/backuptests-lite group=v1 logSource="pkg/backup/item_backupper.go:162" name=mdsnapshotest2 namespace=backuptests resource=pods
time="2020-03-19T11:37:22Z" level=info msg="Executing custom action" backup=velero/backuptests-lite group=v1 logSource="pkg/backup/item_backupper.go:310" name=mdsnapshotest2 namespace=backuptests resource=pods
time="2020-03-19T11:37:22Z" level=info msg="Executing takePVSnapshot" backup=velero/backuptests-lite group=v1 logSource="pkg/backup/item_backupper.go:375" name=pvc-ca06433c-69cb-11ea-94a1-5ebe98412ed4 namespace=backuptests resource=pods
time="2020-03-19T11:37:23Z" level=info msg="Got volume ID for persistent volume" backup=velero/backuptests-lite group=v1 logSource="pkg/backup/item_backupper.go:426" name=pvc-ca06433c-69cb-11ea-94a1-5ebe98412ed4 namespace=backuptests persistentVolume=pvc-ca06433c-69cb-11ea-94a1-5ebe98412ed4 resource=pods volumeSnapshotLocation=default
Two things:
time="2020-03-19T11:37:23Z" level=error msg="Error backing up item" backup=velero/backuptests-lite error="error getting volume info: rpc error: code = Unknown desc = compute.DisksClient#Get: Failure responding to request: StatusCode=404 -- Original Error: autorest/azure: Service returned an error. Status=404 Code=\"ResourceNotFound\" Message=\"The Resource 'Microsoft.Compute/disks/kubernetes-dynamic-pvc-ca06433c-69cb-11ea-94a1-5ebe98412ed4' under resource group 'rsg-lolcorp-dev-az1-aksbkp' was not found.\"" group=v1 logSource="pkg/backup/resource_backupper.go:264" name=mdvelerotest2 namespace=backuptests resource=pods
This implies that in the secret, you set AZURE_RESOURCE_GROUP to rsg-lolcorp-dev-az1-aksbkp, not the AKS auto-generated resource group where your disks actually are. This is incorrect.
Please see the documentation for setting this up correctly: https://github.com/vmware-tanzu/velero-plugin-for-microsoft-azure#get-resource-group-for-persistent-volume-snapshots
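For reference, a sketch of the credentials file the Azure plugin reads (all values are placeholders, not taken from this thread; the key point is the last line):

```shell
AZURE_SUBSCRIPTION_ID=<subscription-id>
AZURE_TENANT_ID=<tenant-id>
AZURE_CLIENT_ID=<client-id>
AZURE_CLIENT_SECRET=<client-secret>
# Must be the resource group containing the cluster's disks: on AKS that is
# the auto-generated MC_* group, NOT the group you back up into.
AZURE_RESOURCE_GROUP=MC_<cluster-resource-group>_<cluster-name>_<location>
```

The MC_* name is the convention AKS uses for its auto-generated node resource group; check the actual name in the Azure portal or with the Azure CLI.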
Hi @skriss - I've corrected the resource group in the Velero credential file and removed the annotation in the 2 pods I deployed (before running the backups). Here is the debug output from a fresh installation:
https://gist.github.com/archmangler/50e9b50ff212427f540e62b5b263ab66
apiVersion: v1
kind: Pod
metadata:
  name: afsvelerotest
  namespace: backuptests
spec:
  containers:
  - args:
    command:
    image: docker-release.lolcorp.lolcorp.coma:8443/velero/velero:v1.1.0
    imagePullPolicy: IfNotPresent
    name: test
    volumeMounts:
    - mountPath: "/mnt/"
      name: afsstorage
  volumes:
  - name: afsstorage
    persistentVolumeClaim:
      claimName: azurefilesnapshotest2
  imagePullSecrets:
  - name: docker-release.lolcorp.lolcorp.coma
Can you paste the full output of kubectl -n backuptests get pods -o yaml?
Hi @skriss - as below
Argh, I see it. Let me fix that.
apiVersion: v1
items:
- apiVersion: v1
  kind: Pod
  metadata:
    annotations:
      backup.velero.io/backup-volumes: afsstorage
      kubectl.kubernetes.io/last-applied-configuration: |
        {"apiVersion":"v1","kind":"Pod","metadata":{"annotations":{},"name":"afsvelerotest","namespace":"backuptests"},"spec":{"containers":[{"args":["100000"],"command":["sleep"],"image":"docker-release.lolcorp.lolcorp.com:8443/velero/velero:v1.1.0","imagePullPolicy":"IfNotPresent","name":"test","volumeMounts":[{"mountPath":"/mnt/","name":"afsstorage"}]}],"imagePullSecrets":[{"name":"docker-release.lolcorp.lolcorp.com"}],"volumes":[{"name":"afsstorage","persistentVolumeClaim":{"claimName":"azurefilesnapshotest2"}}]}}
      kubernetes.io/psp: privileged
    creationTimestamp: "2020-03-19T16:40:10Z"
    name: afsvelerotest
    namespace: backuptests
    resourceVersion: "5360"
    selfLink: /api/v1/namespaces/backuptests/pods/afsvelerotest
    uid: 4c50ffdf-6a00-11ea-a0f5-4681ac8d14e2
  spec:
    containers:
    - args:
      - "100000"
      command:
      - sleep
      image: docker-release.lolcorp.lolcorp.com:8443/velero/velero:v1.1.0
      imagePullPolicy: IfNotPresent
      name: test
      resources: {}
      terminationMessagePath: /dev/termination-log
      terminationMessagePolicy: File
      volumeMounts:
      - mountPath: /mnt/
        name: afsstorage
      - mountPath: /var/run/secrets/kubernetes.io/serviceaccount
        name: default-token-szkz2
        readOnly: true
    dnsPolicy: ClusterFirst
    enableServiceLinks: true
    imagePullSecrets:
    - name: docker-release.lolcorp.lolcorp.com
    nodeName: aks-aksaz1np0-22735939-vmss000000
    priority: 0
    restartPolicy: Always
    schedulerName: default-scheduler
    securityContext: {}
    serviceAccount: default
    serviceAccountName: default
    terminationGracePeriodSeconds: 30
    tolerations:
    - effect: NoExecute
      key: node.kubernetes.io/not-ready
      operator: Exists
      tolerationSeconds: 300
    - effect: NoExecute
      key: node.kubernetes.io/unreachable
      operator: Exists
      tolerationSeconds: 300
    volumes:
    - name: afsstorage
      persistentVolumeClaim:
        claimName: azurefilesnapshotest2
    - name: default-token-szkz2
      secret:
        defaultMode: 420
        secretName: default-token-szkz2
  status:
    conditions:
    - lastProbeTime: null
      lastTransitionTime: "2020-03-19T16:40:31Z"
      status: "True"
      type: Initialized
    - lastProbeTime: null
      lastTransitionTime: "2020-03-19T16:40:34Z"
      status: "True"
      type: Ready
    - lastProbeTime: null
      lastTransitionTime: "2020-03-19T16:40:34Z"
      status: "True"
      type: ContainersReady
    - lastProbeTime: null
      lastTransitionTime: "2020-03-19T16:40:31Z"
      status: "True"
      type: PodScheduled
    containerStatuses:
    - containerID: docker://ec220bf7b8b367f12b84cfd3e0258fbebcea7154d6121af0c5d329c167ea4cc9
      image: docker-release.lolcorp.lolcorp.com:8443/velero/velero:v1.1.0
      imageID: docker-pullable://docker-release.lolcorp.lolcorp.com:8443/velero/velero@sha256:e35ea9ebcaaa4c4d256a04698b2c337cf8f10d2cc359497468014e4a7e39ee19
      lastState: {}
      name: test
      ready: true
      restartCount: 0
      state:
        running:
          startedAt: "2020-03-19T16:40:33Z"
    hostIP: 10.155.240.4
    phase: Running
    podIP: 10.155.240.54
    qosClass: BestEffort
    startTime: "2020-03-19T16:40:31Z"
- apiVersion: v1
  kind: Pod
  metadata:
    annotations:
      backup.velero.io/backup-volumes: mdstorage
      kubectl.kubernetes.io/last-applied-configuration: |
        {"apiVersion":"v1","kind":"Pod","metadata":{"annotations":{},"name":"mdvelerotest","namespace":"backuptests"},"spec":{"containers":[{"args":["100000"],"command":["sleep"],"image":"docker-release.lolcorp.lolcorp.com:8443/velero/velero:v1.1.0","imagePullPolicy":"IfNotPresent","name":"testmd","volumeMounts":[{"mountPath":"/mnt/","name":"mdstorage"}]}],"imagePullSecrets":[{"name":"docker-release.lolcorp.lolcorp.com"}],"volumes":[{"name":"mdstorage","persistentVolumeClaim":{"claimName":"mdsnapshotest"}}]}}
      kubernetes.io/psp: privileged
    creationTimestamp: "2020-03-19T16:39:52Z"
    name: mdvelerotest
    namespace: backuptests
    resourceVersion: "5437"
    selfLink: /api/v1/namespaces/backuptests/pods/mdvelerotest
    uid: 41e8471b-6a00-11ea-a0f5-4681ac8d14e2
  spec:
    containers:
    - args:
      - "100000"
      command:
      - sleep
      image: docker-release.lolcorp.lolcorp.com:8443/velero/velero:v1.1.0
      imagePullPolicy: IfNotPresent
      name: testmd
      resources: {}
      terminationMessagePath: /dev/termination-log
      terminationMessagePolicy: File
      volumeMounts:
      - mountPath: /mnt/
        name: mdstorage
      - mountPath: /var/run/secrets/kubernetes.io/serviceaccount
        name: default-token-szkz2
        readOnly: true
    dnsPolicy: ClusterFirst
    enableServiceLinks: true
    imagePullSecrets:
    - name: docker-release.lolcorp.lolcorp.com
    nodeName: aks-aksaz1np0-22735939-vmss000001
    priority: 0
    restartPolicy: Always
    schedulerName: default-scheduler
    securityContext: {}
    serviceAccount: default
    serviceAccountName: default
    terminationGracePeriodSeconds: 30
    tolerations:
    - effect: NoExecute
      key: node.kubernetes.io/not-ready
      operator: Exists
      tolerationSeconds: 300
    - effect: NoExecute
      key: node.kubernetes.io/unreachable
      operator: Exists
      tolerationSeconds: 300
    volumes:
    - name: mdstorage
      persistentVolumeClaim:
        claimName: mdsnapshotest
    - name: default-token-szkz2
      secret:
        defaultMode: 420
        secretName: default-token-szkz2
  status:
    conditions:
    - lastProbeTime: null
      lastTransitionTime: "2020-03-19T16:40:26Z"
      status: "True"
      type: Initialized
    - lastProbeTime: null
      lastTransitionTime: "2020-03-19T16:41:46Z"
      status: "True"
      type: Ready
    - lastProbeTime: null
      lastTransitionTime: "2020-03-19T16:41:46Z"
      status: "True"
      type: ContainersReady
    - lastProbeTime: null
      lastTransitionTime: "2020-03-19T16:40:26Z"
      status: "True"
      type: PodScheduled
    containerStatuses:
    - containerID: docker://1ce5c86b5cf7b96e1d9f9d75a063c8ff4936aa87400c5ee4915d72369d4aa4b9
      image: docker-release.lolcorp.lolcorp.com:8443/velero/velero:v1.1.0
      imageID: docker-pullable://docker-release.lolcorp.lolcorp.com:8443/velero/velero@sha256:e35ea9ebcaaa4c4d256a04698b2c337cf8f10d2cc359497468014e4a7e39ee19
      lastState: {}
      name: testmd
      ready: true
      restartCount: 0
      state:
        running:
          startedAt: "2020-03-19T16:41:46Z"
    hostIP: 10.155.240.55
    phase: Running
    podIP: 10.155.240.70
    qosClass: BestEffort
    startTime: "2020-03-19T16:40:26Z"
kind: List
metadata:
  resourceVersion: ""
  selfLink: ""
Hi @skriss - new output, this time with annotations carefully removed from both test pods:
https://gist.github.com/archmangler/e881dcfb31841e1b31f2a75186acbfec
🎉 looks like you got a snapshot for your managed disk - if you run velero backup describe backuptests-lite --details, you'll be able to see the snapshot identifier and confirm which resource group it ended up in.
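To cross-check from the Azure side, something along these lines should list the snapshots in that group (a sketch assuming the Azure CLI is installed and logged in; the resource group name is the one from the install command in this thread):

```shell
az snapshot list --resource-group rsg-lolcorp-dev-az1-aksbkp \
  --query "[].{name:name, created:timeCreated}" --output table
```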
Brilliant! And in the right resource group! Many thanks @skriss this resolves my issue.
awesome, glad you got it working :) I'll close this out.
Does this work when the BackupStorageLocation is in a different region, i.e. can you back up the disk and then restore it, in order to support DR?
Yes, file system backups can go to a BSL in a different region and be used to perform DR.
What steps did you take and what happened:
Install and configuration:
What did you expect to happen:
The output of the following commands will help us better understand what's going on: (Pasting long output into a GitHub gist or other pastebin is fine.)
- kubectl logs deployment/velero -n velero
- velero backup describe <backupname> or kubectl get backup/<backupname> -n velero -o yaml
- velero backup logs <backupname>
- velero restore describe <restorename> or kubectl get restore/<restorename> -n velero -o yaml
- velero restore logs <restorename>
Anything else you would like to add: [Miscellaneous information that will assist in solving the issue.]
Environment:
- Velero version (use velero version):
- Velero features (use velero client config get features):
- Kubernetes version (use kubectl version):
- OS (e.g. from /etc/os-release):