longhorn / longhorn

Cloud-Native distributed storage built on and for Kubernetes
https://longhorn.io
Apache License 2.0

[FEATURE] Declarative Restore from Latest Backup #5787

Open steve-fraser opened 1 year ago

steve-fraser commented 1 year ago

Is your feature request related to a problem? Please describe (👍 if you like this request)

Today, it is difficult to use a RecurringJob and also restore from the latest backup automatically.

I want to automatically back up a StatefulSet PV, then restore from the latest backup on a fresh installation.

Today I would need to do something like:

---
apiVersion: longhorn.io/v1beta2
kind: Snapshot
metadata:
  labels:
    longhornvolume: statefulset-vol-0
  name: statefulset-vol-0
  namespace: longhorn-system
spec:
  createSnapshot: true
  volume: statefulset-vol-0

---
apiVersion: longhorn.io/v1beta2
kind: Backup
metadata:
  labels:
    backup-volume: statefulset-vol-0
  name: statefulset-vol-0
  namespace: longhorn-system
spec:
  snapshotName: statefulset-vol-0
---
apiVersion: longhorn.io/v1beta2
kind: Volume
metadata:
  labels:
    longhornvolume: statefulset-vol-0
    recurring-job-group.longhorn.io/default: enabled
    setting.longhorn.io/remove-snapshots-during-filesystem-trim: ignored
    setting.longhorn.io/replica-auto-balance: ignored
    setting.longhorn.io/snapshot-data-integrity: ignored
    recurring-job.longhorn.io/statefulset-vol-0: enabled
  name: statefulset-vol-0
  namespace: longhorn-system
spec:
  numberOfReplicas: 1
  disableFrontend: false
  engineImage: longhornio/longhorn-engine:v1.4.1
  fromBackup: "s3://longhorn-test-cluster@us-east-1/?backup=statefulset-vol-0&volume=statefulset-vol-0"
  frontend: blockdev
  size: "2147483648"
  staleReplicaTimeout: 20

But this is incompatible with RecurringJob.

Describe the solution you'd like

I would like to restore from the latest backup with a URL like "s3://longhorn-test-cluster@us-east-1/?volume=statefulset-vol-0", which would locate the latest backup taken for that volume in S3 and restore from it.
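To make the two URL forms concrete, here is a minimal sketch of how they differ once parsed. It uses only Python's standard library and is not Longhorn's own parser; the function name `parse_from_backup` is hypothetical. The request amounts to treating a missing `backup` parameter as "use the latest backup for this volume".

```python
from urllib.parse import parse_qs, urlsplit


def parse_from_backup(url):
    """Split a fromBackup URL into (volume, backup).

    `backup` is None when the URL names only the volume, which is the
    form this request proposes to interpret as "restore from latest".
    This is an illustrative helper, not Longhorn code.
    """
    query = parse_qs(urlsplit(url).query)
    volume = query.get("volume", [None])[0]
    backup = query.get("backup", [None])[0]
    return volume, backup


explicit = "s3://longhorn-test-cluster@us-east-1/?backup=statefulset-vol-0&volume=statefulset-vol-0"
latest = "s3://longhorn-test-cluster@us-east-1/?volume=statefulset-vol-0"

print(parse_from_backup(explicit))
print(parse_from_backup(latest))
```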

c3y1huang commented 1 year ago

Can you provide more information about the compatibility issue with the RecurringJob? Can you share the error message?

steve-fraser commented 1 year ago

If I use this configuration:

apiVersion: longhorn.io/v1beta2
kind: RecurringJob
metadata:
  generation: 1
  name: statefulset-vol-0
  namespace: longhorn-system
spec:
  concurrency: 1
  cron: '* * * * *'
  groups: []
  labels:
    longhornvolume: statefulset-vol-0
  name: statefulset-vol-0
  retain: 1
  task: backup
---
apiVersion: longhorn.io/v1beta2
kind: Volume
metadata:
  labels:
    longhornvolume: statefulset-vol-0
    recurring-job-group.longhorn.io/default: enabled
    setting.longhorn.io/remove-snapshots-during-filesystem-trim: ignored
    setting.longhorn.io/replica-auto-balance: ignored
    setting.longhorn.io/snapshot-data-integrity: ignored
    recurring-job.longhorn.io/statefulset-vol-0: enabled
  name: statefulset-vol-0
  namespace: longhorn-system
spec:
  numberOfReplicas: 1
  disableFrontend: false
  engineImage: longhornio/longhorn-engine:v1.4.1
  frontend: blockdev
  size: "2147483648"
  staleReplicaTimeout: 20
---
apiVersion: v1
kind: PersistentVolume
metadata:
  name: statefulset-vol-0
  namespace: default
spec:
  capacity:
    storage: 2Gi # must match size of Longhorn volume
  volumeMode: Filesystem
  accessModes:
    - ReadWriteOnce
  persistentVolumeReclaimPolicy: Delete
  csi:
    driver: driver.longhorn.io # driver must match this
    fsType: ext4
    volumeAttributes:
      numberOfReplicas: '1' # must match Longhorn volume value
      staleReplicaTimeout: '30' # in minutes
    volumeHandle: statefulset-vol-0 # must match volume name from Longhorn
  storageClassName: longhorn # must be same name that we will use later

it creates backups named backup-${ID}:

$ kubectl get backups -n longhorn-system
NAME                      SNAPSHOTNAME                                    SNAPSHOTSIZE   SNAPSHOTCREATEDAT      STATE       LASTSYNCEDAT
backup-1930e4b5b2f64b11   stateful-17096474-0103-4c15-92db-24fa3676ef3b   115343360      2023-04-20T11:35:01Z   Completed   2023-04-20T11:35:05Z

This backup-${ID} name is hard to recover from automatically when standing up a new cluster, because I have to locate the ID and then commit it to Git as part of Volume.spec.fromBackup: "s3://longhorn-test-cluster@us-east-1/?backup=backup-${ID}&volume=statefulset-vol-0"
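As a stopgap, the ID lookup described above could be scripted. The following is a rough sketch rather than a supported workflow. It assumes that each Backup resource carries a `backup-volume` label with the volume name (as in the manifest earlier in this issue) and that `status.state` and `status.snapshotCreatedAt` back the STATE and SNAPSHOTCREATEDAT columns shown above. The function names `latest_backup_url` and `list_backups` are hypothetical.

```python
import json
import subprocess
from datetime import datetime

TIME_FORMAT = "%Y-%m-%dT%H:%M:%SZ"  # matches timestamps like 2023-04-20T11:35:01Z


def latest_backup_url(backups, backup_target, volume):
    """Build a fromBackup URL for the newest completed backup of `volume`.

    `backups` is the "items" list from `kubectl get backups -o json`.
    Returns None when no completed backup exists for the volume.
    Field and label names are assumptions; check them against your cluster.
    """
    candidates = []
    for item in backups:
        labels = item.get("metadata", {}).get("labels") or {}
        status = item.get("status") or {}
        if (
            labels.get("backup-volume") == volume
            and status.get("state") == "Completed"
            and status.get("snapshotCreatedAt")
        ):
            candidates.append(item)
    if not candidates:
        return None
    newest = max(
        candidates,
        key=lambda b: datetime.strptime(b["status"]["snapshotCreatedAt"], TIME_FORMAT),
    )
    return f"{backup_target}?backup={newest['metadata']['name']}&volume={volume}"


def list_backups(namespace="longhorn-system"):
    """Fetch Backup resources through kubectl (requires cluster access)."""
    result = subprocess.run(
        ["kubectl", "get", "backups.longhorn.io", "-n", namespace, "-o", "json"],
        check=True,
        capture_output=True,
        text=True,
    )
    return json.loads(result.stdout)["items"]
```

A call such as `latest_backup_url(list_backups(), "s3://longhorn-test-cluster@us-east-1/", "statefulset-vol-0")` would produce the value to template into Volume.spec.fromBackup. This still has to run outside of Git, before the manifests are applied, which is the gap this request is about.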

I would like to be able to define a volume without that backup-${ID} and have it automatically restore from the latest backup:

fromBackup: "s3://longhorn-test-cluster@us-east-1/?volume=statefulset-vol-0"

Today the admission webhook rejects this form:

Volume/longhorn-system/test dry-run failed, reason: Invalid, error: admission webhook "mutator.longhorn.io" denied the request: failed to get backup : backup.longhorn.io "" not found

github-actions[bot] commented 9 months ago

This issue is stale because it has been open 30 days with no activity. Remove stale label or comment or this will be closed in 5 days.