dokku / dokku

A docker-powered PaaS that helps you build and manage the lifecycle of applications
https://dokku.com
MIT License

Add a centralized backup and restore mechanism #5008

Open cweagans opened 2 years ago

cweagans commented 2 years ago

Description of problem

As a dokku admin, if I want to export a backup of an entire application and all of the dependencies, I need to know how that application works, whether or not the plugins that provide the backing services offer backup commands, etc. It would be much better if there were a mechanism by which an application and all of the attached services can be told to export whatever they need to export.

Possible Solution

I think a very lightweight solution could be put in place here: we just need six new plugn triggers:

The only things Dokku needs to handle are dispatching those triggers and providing a directory that each plugin writes to or reads from (as well as backing up/restoring core data, of course -- from what I can see, this is already handled in other commands, so it would just need to be wired up).

Example:

If I'm running an application that was pushed to Git with a MySQL backing service and a persistent storage directory, here's what I'm imagining would happen when I trigger a backup:

When I ask to restore something, the reverse should happen. If I have an S3 backups plugin or w/e, it should be responsible for downloading the backup that I've specified and extracting it into the location where Dokku expects the backup data to be available. Each plugin is responsible for reading its own data from the backup dir.
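A minimal sketch of the dispatch idea described above, assuming each plugin registers a hook and receives its own subdirectory of a shared backup directory. This is not Dokku's implementation: the function names, the hook signature, and the `mysql`/`storage` plugin names are all hypothetical and exist only to illustrate the flow.

```python
import tempfile
from pathlib import Path
from typing import Callable, Dict

# Hypothetical hook signature: a plugin gets the app name and a directory it owns.
BackupHook = Callable[[str, Path], None]


def dispatch_backup(app: str, hooks: Dict[str, BackupHook], backup_root: Path) -> Dict[str, Path]:
    """Create one subdirectory per plugin under backup_root and call each hook."""
    written = {}
    for plugin_name, hook in sorted(hooks.items()):
        plugin_dir = backup_root / plugin_name
        plugin_dir.mkdir(parents=True, exist_ok=True)
        hook(app, plugin_dir)  # the plugin alone decides what it exports
        written[plugin_name] = plugin_dir
    return written


def mysql_hook(app: str, out_dir: Path) -> None:
    # Placeholder only: a real datastore plugin would write a database dump here.
    (out_dir / "dump.sql").write_text(f"-- placeholder dump for {app}\n")


def storage_hook(app: str, out_dir: Path) -> None:
    # Placeholder only: a real plugin would copy the persistent storage directory.
    (out_dir / "storage.txt").write_text(f"placeholder storage listing for {app}\n")


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        hooks = {"mysql": mysql_hook, "storage": storage_hook}
        for name, path in dispatch_backup("myapp", hooks, Path(tmp)).items():
            print(name, sorted(p.name for p in path.iterdir()))
```

A restore would walk the same layout in reverse, with each plugin reading only from its own subdirectory.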

Aside

I feel like this existed at some point and that I'm reproducing it (or something like it) from memory, but maybe I'm remembering wrong? Is there some reason that the core product should not provide this kind of plumbing? Backup and restore seems really important for a PaaS.

josegonzalez commented 2 years ago

We had this feature a very long time ago, but it was quite broken - it expected every plugin to write its backup contents to stdout and then tar'd it all up, which ended up causing broken backups in some cases.

This is actually a bit more complex than you'd expect.

I'd previously tried to raise money via GitHub Sponsors for backup/restore functionality across the project, but didn't get to my goal of $500. That said, if you'd like to work on this functionality, by all means go ahead. I'd be happy to provide feedback once you have a skeleton working to get it to a place that is usable by core, datastore, and community plugins.

josegonzalez commented 2 years ago

Dokku is more or less at the $500 a month goal, so I'll start working on this in the next few releases. I think there is still some work to move a few files out of the git repositories (docker options, env vars, nginx configs) but those should be relatively straightforward to do.

Let me think about a good backup interface and then post my comments here. I'm guessing a few things will change about how datastores do backups, but I believe that will be a Good Thing™.

josegonzalez commented 1 year ago

Okay I think what I'll do is something like this:

backup:export [--app $APP] [--service $SERVICE] [--backup-dir TMP]

This new command will create a tar.gz file in the /tmp directory with the desired data. The /tmp dir may be overridden via the --backup-dir flag, in which case the file will be written there instead. Dokku will first attempt to write a temporary file to the --backup-dir, and if that fails, the backup will exit immediately.

The backup will be buffered to a directory in /tmp while being generated. Users are expected to have enough space in /tmp to hold backup data.

If no app/service is specified, the backup will contain all apps and all services.

The full path to the tarball will be written to stdout.

Log messages for backup:export should always appear on stderr.
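The export behavior described above could look roughly like the following sketch. It assumes the data has already been staged in a directory, and it only illustrates three points from the proposal: the write probe on the backup directory, the tarball path on stdout, and log messages on stderr. The function names and the file naming scheme are hypothetical; this is not the actual `backup:export` code.

```python
import sys
import tarfile
import tempfile
import time
from pathlib import Path


def log(message: str) -> None:
    # Logs go to stderr so that stdout carries only the tarball path.
    print(message, file=sys.stderr)


def check_writable(backup_dir: Path) -> None:
    """Raise OSError early if a temporary file cannot be created in backup_dir."""
    with tempfile.NamedTemporaryFile(dir=backup_dir):
        pass


def export_backup(staging_dir: Path, backup_dir: Path) -> Path:
    """Pack staging_dir into a tar.gz inside backup_dir and return its path."""
    check_writable(backup_dir)
    # Hypothetical naming scheme; the proposal does not specify one.
    target = backup_dir / f"backup-{time.strftime('%Y%m%d-%H%M%S')}.tar.gz"
    log(f"creating {target}")
    with tarfile.open(target, "w:gz") as archive:
        archive.add(staging_dir, arcname=".")
    return target


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as staging, tempfile.TemporaryDirectory() as out:
        (Path(staging) / "app.txt").write_text("placeholder app data\n")
        print(export_backup(Path(staging), Path(out)))  # only the path reaches stdout
```

Keeping stdout clean means a caller can capture the path directly and hand it to whatever upload or copy step comes next.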

backup:import <backup-file-path> [--app $APP] [--service $SERVICE]

This new command will take a backup file path and an optional app/service name. If no app/service is specified, it will attempt to import everything.

This command may be destructive, and users are expected to take caution when running it. Users will be asked to confirm before restoring on top of existing data.

When restoring, if an app or service is not specified, the entire backup is first extracted before being restored. Thus, users should have enough space for 2x the size of the extracted backup.

Restores aren't guaranteed to work in all cases - particularly if dependencies are missing or there are external issues like a docker bug. Users will be encouraged to regularly test their backups on new servers.
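As a rough sketch of the import steps above, the following checks for confirmation, verifies there is room for twice the extracted size, and then extracts the whole archive before any restore would run. All names here are hypothetical, the archive is assumed to be trusted, and the actual restore of apps and services is left out; this is not the real `backup:import` logic.

```python
import shutil
import tarfile
import tempfile
from pathlib import Path


def extracted_size(backup_file: Path) -> int:
    """Sum the uncompressed sizes of the regular files in the archive."""
    with tarfile.open(backup_file, "r:gz") as archive:
        return sum(m.size for m in archive.getmembers() if m.isfile())


def has_room(backup_file: Path, work_dir: Path, factor: int = 2) -> bool:
    """Check that work_dir has at least `factor` times the extracted size free."""
    return shutil.disk_usage(work_dir).free >= factor * extracted_size(backup_file)


def import_backup(backup_file: Path, work_dir: Path, confirmed: bool) -> Path:
    """Extract the full backup into a fresh directory under work_dir."""
    if not confirmed:
        raise PermissionError("restore not confirmed")
    if not has_room(backup_file, work_dir):
        raise OSError("not enough free space to extract and restore")
    extract_dir = Path(tempfile.mkdtemp(dir=work_dir))
    # Assumption: the archive comes from a trusted export, so no member filtering.
    with tarfile.open(backup_file, "r:gz") as archive:
        archive.extractall(extract_dir)
    return extract_dir


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        source = Path(tmp) / "app.txt"
        source.write_text("hello")
        backup = Path(tmp) / "backup.tar.gz"
        with tarfile.open(backup, "w:gz") as archive:
            archive.add(source, arcname="app.txt")
        restored = import_backup(backup, Path(tmp), confirmed=True)
        print(sorted(p.name for p in restored.iterdir()))
```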

cweagans commented 1 year ago

That sounds great! If I was going to extend this functionality to e.g. store backups on S3, would I use the post-backup and pre-import hooks? How would that plugin prune old backups? (Totally fine if the last bit is just a custom command or something)

josegonzalez commented 1 year ago

I think I would like to concentrate on this creating a tarball on disk. If someone wants to upload that somewhere, they can do so - similarly, if their backup process should prune old backups, they can handle that themselves.

To extend and do something custom with the file, you'd use a hook of some sort.
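For the pruning case, a user-side script along these lines would work on the tarballs without touching Dokku at all. It is only a sketch: it assumes backups are `*.tar.gz` files collected in one directory and that file modification time is a good enough proxy for backup age. The function name and the `keep` parameter are hypothetical.

```python
import os
import tempfile
from pathlib import Path
from typing import List


def prune_backups(backup_dir: Path, keep: int) -> List[Path]:
    """Delete all but the `keep` newest *.tar.gz files and return the removed paths."""
    if keep < 0:
        raise ValueError("keep must be zero or greater")
    tarballs = sorted(
        backup_dir.glob("*.tar.gz"),
        key=lambda p: p.stat().st_mtime,
        reverse=True,  # newest first
    )
    removed = tarballs[keep:]
    for path in removed:
        path.unlink()
    return removed


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        directory = Path(tmp)
        for i in range(4):
            tarball = directory / f"backup-{i}.tar.gz"
            tarball.write_bytes(b"")
            os.utime(tarball, (1000 + i, 1000 + i))  # give each file a distinct age
        prune_backups(directory, keep=2)
        print(sorted(p.name for p in directory.iterdir()))
```

Run from cron or a similar scheduler after each export, this keeps retention policy entirely outside the backup command.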

cweagans commented 1 year ago

Absolutely -- not suggesting that s3 storage be implemented in Dokku directly. Just thinking through the ways that this could be extended by other plugins -- if the answer is that s3 storage stuff should be done without interacting with the backup plugin directly, that's totally workable.

josegonzalez commented 1 year ago

I just want to avoid the issue we have with the datastore plugins, where each one implements S3. Folks use things other than S3, so it makes it seem like users can't use anything else.

I'd rather we document tools in the ecosystem to make those things possible, maybe even writing a tutorial for them.

cweagans commented 1 year ago

Totally understood! Appreciate the time and thought :)