deis / postgres

A PostgreSQL database used by Deis Workflow.
https://deis.com
MIT License
36 stars · 22 forks

consider removing wal-e #95

Closed bacongobbler closed 8 years ago

bacongobbler commented 8 years ago

Right now we are using wal-e to back up WAL logs and replay them. However, we are maintaining our own version of wal-e, and we have hit several issues that have been difficult to debug (#94, #97, #67, #56, and #51, to name a few). We should consider whether it would be worth using cloud storage clients directly (or object-storage-cli) to simplify usage.

arschles commented 8 years ago

@bacongobbler a few questions:

- Does wal-e stream the WAL to object storage, take snapshots, something else?
- Is it possible to use wal-e to replay the WAL but do our own thing to ship it to/from object storage?

Also, should this be in RC1?

bacongobbler commented 8 years ago

> Does wal-e stream the WAL to object storage, take snapshots, something else?

Yes and no. WAL-E has a command called wal-push that compresses the completed WAL segment as an lzop archive and ships it to s3/minio/gcs when the archive command is invoked. Postgres invokes the archive command as necessary, based on the size of the WAL log and on archive_timeout, which forces a WAL log to be shipped after X number of seconds.
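For context, the archiving side described above maps onto a handful of postgresql.conf settings. A sketch of what the container effectively runs (paths and values are illustrative, not our exact config):

```ini
# postgresql.conf (PostgreSQL 9.x era) -- illustrative values
wal_level = archive
archive_mode = on
# Postgres invokes this once per completed WAL segment; wal-e
# lzop-compresses the segment and ships it to s3/minio/gcs.
archive_command = 'envdir /etc/wal-e.d/env wal-e wal-push %p'
# Force a segment switch (and thus a ship) at least every 60 seconds.
archive_timeout = 60
```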

> Is it possible to use wal-e to replay the WAL but do our own thing to ship it to/from object storage?

Unless we ship the archive exactly the way WAL-E does, no. wal-e backup-fetch fetches the latest base backup from S3, applies it as the base, then replays WAL logs on top of that backup. We'd have to replicate that behaviour exactly.

> Also, should this be in RC1?

Agreed. #94 is also in RC1; I just forgot to add this one as well.

arschles commented 8 years ago

@bacongobbler thanks for all the info. sounds like it'll be involved to split the replay code from the storage code in wal-e. since wal-e includes the recovery logic and the cloud storage clients (or object-storage-cli) are missing that, do you have ideas for the way forward on that front?

bacongobbler commented 8 years ago

Changing archive_command to use s3cmd, and doing the same for restore_command, would achieve the same thing we do today. Most production databases have a disk mounted at /mnt/backup-server and simply use cp to archive to and restore from that persistent mount. I predict we can achieve the same on our end by replacing cp with s3cmd (famous last words).
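The swap described above would look roughly like this; the bucket name is a placeholder, and the cp lines show the conventional mounted-disk setup for comparison:

```ini
# postgresql.conf -- conventional mounted-disk archiving
archive_command = 'cp %p /mnt/backup-server/wal/%f'
# ...replaced with plain s3cmd (sketch; bucket name is hypothetical)
archive_command = 's3cmd put %p s3://deis-pg-backups/wal/%f'

# recovery.conf -- the matching restore side
restore_command = 's3cmd get s3://deis-pg-backups/wal/%f %p'
```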

The big downside is that we lose PITR (Point-In-Time Recovery), the main selling point of WAL-E and one of our beta1 requirements. But at least our databases would recover gracefully without corrupting our S3 buckets, and they'd be easier to debug when restoration fails.

arschles commented 8 years ago

@bacongobbler got it. if we copied the disk and the WAL using s3cmd (or similar), would that get us PITR?

From my cursory reading of the postgres docs on continuous archiving: we can combine a file-system-level backup with archiving of the WAL files.

Also, tangentially related: by doing the copy operations ourselves, we can control and debug everything from the log shipping to every point in the recovery process (fetch the base + logs, run the recovery command, maybe do consistency checks, etc.).

Not sure if I'm missing something here that wal-e does, as I'm just going on the postgres archiving & recovery docs (linked above)
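The recipe in the continuous-archiving docs is roughly: take a file-system-level base backup (e.g. with pg_basebackup), keep archiving WAL segments, and at restore time place a recovery.conf next to the restored base backup telling postgres how to fetch segments and how far to replay. A hypothetical sketch of that restore side (bucket name and timestamp are placeholders):

```ini
# recovery.conf (PostgreSQL 9.x) -- placed in the restored data directory
restore_command = 's3cmd get s3://deis-pg-backups/wal/%f %p'
# Replay archived WAL only up to this point; omit to recover to the
# end of the archived WAL stream.
recovery_target_time = '2016-05-01 12:00:00'
```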

bacongobbler commented 8 years ago

Nice! So it seems we could replicate wal-e: use pg_basebackup to take the base backup for PITR, then upload it with object-storage-cli. I'll play around with this in RC1.

bacongobbler commented 8 years ago

An excellent tutorial explaining the process is shown here.

slack commented 8 years ago

I still feel strongly that attempting to re-implement database backups and wal-shipping at this stage seems like a really bad idea.

For example, we aren't sure whether the missing apps are due to the database being killed after the data should have been shipped. In #97 it would be good to know whether archive_timeout or the size trigger had fired and shipped data to storage, and whether we were then unable to successfully recover that data.

My vote is for more structured testing over replacing everything from scratch.

bacongobbler commented 8 years ago

But... I like re-writing everything from scratch so close to a stable release! ;)

That's a fair opinion. I'll see if there's a way we can reliably reproduce this issue and start drilling down from there.

bacongobbler commented 8 years ago

I was not able to reproduce the failure case as seen in https://github.com/deis/postgres/issues/94#issuecomment-219152845. At this point I'd rather not re-write the world until we've identified the issue.

bacongobbler commented 8 years ago

We're not going to remove wal-e, but let's continue the debugging in #94.