wellcomecollection / goobi-infrastructure

Wellcome Collection digital workflow infrastructure
MIT License

Backup of cloud system #231

Closed: mgeerdsen closed this issue 4 years ago

mgeerdsen commented 4 years ago

The following components might be considered:

pollecuttn commented 4 years ago

@mgeerdsen we want to be reassured, and to have seen your plans for restoring, before we use AWS Goobi in production

mgeerdsen commented 4 years ago

Datastores involved

Resources to be included in the backup are in bold

mgeerdsen commented 4 years ago

For EFS it appears that AWS Backup can be a good solution. It is easy to set up, allows automatic migration to cold storage, and supports restoring into the same file system as well as restoring as a new file system. A regular command-line backup tool running as some kind of scheduled job could also do the job of backing up the files to S3.
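As a rough illustration of the AWS Backup option, the sketch below builds the plan document that the `CreateBackupPlan` API accepts, with a daily schedule and a cold-storage transition. The plan name, rule name, schedule, and day counts are assumptions chosen for illustration, not values taken from this project.

```python
def build_efs_backup_plan(vault_name, cold_after_days=30, delete_after_days=120):
    """Build a BackupPlan document for AWS Backup (illustrative values only)."""
    # AWS Backup requires DeleteAfterDays to be at least 90 days greater
    # than MoveToColdStorageAfterDays.
    if delete_after_days < cold_after_days + 90:
        raise ValueError("delete_after_days must be >= cold_after_days + 90")
    return {
        "BackupPlanName": "goobi-efs-daily",  # hypothetical name
        "Rules": [
            {
                "RuleName": "daily",  # hypothetical name
                "TargetBackupVaultName": vault_name,
                # AWS cron syntax, evaluated in UTC: every day at 03:00
                "ScheduleExpression": "cron(0 3 * * ? *)",
                "Lifecycle": {
                    "MoveToColdStorageAfterDays": cold_after_days,
                    "DeleteAfterDays": delete_after_days,
                },
            }
        ],
    }
```

With boto3 installed and credentials configured, the result could be passed as `boto3.client("backup").create_backup_plan(BackupPlan=plan)`. The EFS file system would then still need to be assigned to the plan, for example with `create_backup_selection`.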

RDS could be backed up using RDS snapshots (increasing the number of snapshots currently retained). A different, or even additional, way would be a scheduled job doing regular SQL dumps to S3, for example.
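Increasing the number of retained automated snapshots comes down to raising the backup retention period. A minimal sketch, assuming a regular (non-Aurora) RDS instance; the function name and the instance identifier in the usage note are hypothetical:

```python
def rds_retention_kwargs(instance_id, retention_days):
    """Build keyword arguments for the RDS ModifyDBInstance call."""
    # RDS accepts 0 to 35 days; 0 disables automated backups, so this
    # sketch only allows 1 to 35.
    if not 1 <= retention_days <= 35:
        raise ValueError("retention_days must be between 1 and 35")
    return {
        "DBInstanceIdentifier": instance_id,
        "BackupRetentionPeriod": retention_days,
        "ApplyImmediately": True,
    }
```

With boto3, this could be applied as `boto3.client("rds").modify_db_instance(**rds_retention_kwargs("goobi-db", 14))`.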

S3 is a bit harder. The highest risk here is probably accidental deletion by a user. This could be mitigated by enabling versioning and automatically dropping old versions after a defined time.
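A sketch of how versioning plus automatic expiry of old versions might look with boto3. The 30-day window and the rule ID are assumptions; `s3_client` is expected to be a `boto3.client("s3")` object supplied by the caller.

```python
def build_version_expiry_lifecycle(noncurrent_days=30):
    """Lifecycle configuration that drops old object versions (illustrative)."""
    if noncurrent_days < 1:
        raise ValueError("noncurrent_days must be at least 1")
    return {
        "Rules": [
            {
                "ID": "expire-noncurrent-versions",  # hypothetical rule ID
                "Status": "Enabled",
                "Filter": {"Prefix": ""},  # empty prefix: whole bucket
                # Delete versions this many days after they are superseded
                "NoncurrentVersionExpiration": {"NoncurrentDays": noncurrent_days},
                # Remove delete markers once no older versions remain
                "Expiration": {"ExpiredObjectDeleteMarker": True},
            }
        ]
    }


def enable_versioning_with_expiry(s3_client, bucket, noncurrent_days=30):
    """Turn on versioning, then attach the expiry rule to the bucket."""
    s3_client.put_bucket_versioning(
        Bucket=bucket,
        VersioningConfiguration={"Status": "Enabled"},
    )
    s3_client.put_bucket_lifecycle_configuration(
        Bucket=bucket,
        LifecycleConfiguration=build_version_expiry_lifecycle(noncurrent_days),
    )
```

Note that `put_bucket_lifecycle_configuration` replaces any lifecycle configuration already on the bucket, so existing rules would need to be merged in first.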

The above measures protect, to a certain extent, against accidental loss of data due to some kind of user or script error, for example. For better protection against infrastructure problems, replication of backups into a different region would have to be considered.
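If AWS Backup were used, one way to get backups into a second region is a copy action on each backup rule. The helper below is a sketch under that assumption: it takes any plan document in the shape `CreateBackupPlan` accepts and adds a copy to a vault in another region. The function name, vault ARN, and retention value are hypothetical.

```python
import copy


def with_cross_region_copy(plan, destination_vault_arn, delete_after_days=35):
    """Return a copy of the plan whose rules also copy to another vault."""
    new_plan = copy.deepcopy(plan)  # leave the caller's plan untouched
    for rule in new_plan["Rules"]:
        rule.setdefault("CopyActions", []).append(
            {
                "DestinationBackupVaultArn": destination_vault_arn,
                "Lifecycle": {"DeleteAfterDays": delete_after_days},
            }
        )
    return new_plan
```

The destination vault has to exist in the target region before the plan is created; the ARN's region component determines where the copy is stored.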

kenoir commented 4 years ago

For EFS it appears that AWS Backup can be a good solution 👍
RDS could be backed up using RDS snapshots 👍
Versioned S3 👍

Great - all sounds good.

kenoir commented 4 years ago

That all makes sense. Do we need to turn each of those mechanisms into a ticket to be worked on?

mgeerdsen commented 4 years ago

EFS and RDS are backed up daily... a restore to a new resource for testing purposes is currently running

mgeerdsen commented 4 years ago

Restoring RDS/EFS to new resources has been tested, and the content was checked successfully.