matrix-org / synapse

Synapse: Matrix homeserver written in Python/Twisted.
https://matrix-org.github.io/synapse
Apache License 2.0

Document how to back up a synapse server #2046

Open richvdh opened 7 years ago

richvdh commented 7 years ago

We should give users some guidance on what they need to do to effectively back up and restore a synapse.

Off the top of my head:

seanenck commented 7 years ago

definitely interested in this, we're currently doing the things you mention (well, a little more 'verbose' in that I'm pulling /etc/synapse/*)

nordurljosahvida commented 6 years ago

Absolutely agree. Also interesting is Discourse's self-backup function, which simply asks for your S3 credentials and does everything by itself. That would be perfect. Thanks for the great work.

ghost commented 4 years ago

@richvdh any updates?

richvdh commented 4 years ago

PRs welcome...

kpfleming commented 4 years ago

I'm about to do this; moving a Synapse installation from a FreeBSD jail to a Linux container (same CPU architecture, so the data should be compatible). The app configuration and logging configuration is already managed by Ansible so that part is easy, as is the NGINX proxy in front of it and the TLS configuration.

That leaves the database, media repository, and any keys for the server itself. Has anyone done this?
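
Roughly, I'm expecting that last part to boil down to something like the sketch below (paths and names are guesses for a typical package-based install, not a recipe):

```bash
# hypothetical sketch: the Synapse-specific state to carry across
pg_dump -U synapse -Fc synapse > synapse.dump          # the database (logical dump, portable across OSes)
rsync -a /var/lib/matrix-synapse/media_store/ newhost:/var/lib/matrix-synapse/media_store/
rsync -a /etc/matrix-synapse/homeserver.yaml /etc/matrix-synapse/*.signing.key newhost:/etc/matrix-synapse/
```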

DamianoP commented 4 years ago

this is a very interesting question....

krystiancha commented 3 years ago

Has anyone done this?

Hey guys, I just moved my synapse instance and everything seems to work, including message history and images uploaded in the past.

I transferred:

nicolamori commented 2 years ago

Hi, is there any progress with this? I'm setting up Synapse + Postgres with docker-compose, and I'm not sure how to create self-consistent, live, automated backups. My understanding is that to obtain consistent backups the Synapse server should be put in read-only mode or stopped while the backup is taken, so that no files change while the backup is in progress. Is this correct? If yes, how do I do that for a docker-compose-based setup? I cannot run backup scripts on the host machine and must do everything from within the container. Sorry for the probably dumb question, but I'm a newcomer and I can't find any clear indication or example about this.

reivilibre commented 2 years ago

That's not correct, actually: you don't need to turn off Synapse to make a consistent backup.

If you use pg_dump, you'll note that its manual (https://www.postgresql.org/docs/12/app-pgdump.html) says:

pg_dump is a utility for backing up a PostgreSQL database. It makes consistent backups even if the database is being used concurrently. pg_dump does not block other users accessing the database (readers or writers).

Basically it runs the entire backup in a single transaction, so Postgres gives it a consistent view the entire time. (However I do recommend restoring the backups offline and into a fresh, empty database with the correct locale settings. Be very careful not to restore into a database that already has tables present as this has led to issues in the past.)
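
For example, something along these lines (the database name, role, and locale flags here are assumptions; check your homeserver.yaml and the Synapse Postgres docs for your actual values):

```bash
# take a consistent dump while Synapse keeps running
pg_dump -U synapse_user -Fc synapse > synapse-$(date +%F).dump

# restore offline into a fresh, empty database created with the expected locale
createdb --encoding=UTF8 --locale=C --template=template0 --owner=synapse_user synapse
pg_restore -U synapse_user --dbname=synapse --no-owner synapse-$(date +%F).dump
```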

If you're operating at a large scale, then making SQL dumps of your database is probably inefficient and too slow to restore, so you would probably be considering replication for your Postgres server (including having a hot standby). I can't really advise there myself as I'm not a database expert :-).

I cannot run backup scripts on the host machine and must do everything from within the container.

Curious; why not?

At some level you're going to need to be able to pg_dump your database and make some copies of your media store (and then probably put those backups somewhere so that you're not going to get messed up by a disk failure). I don't run databases in Docker so I'm not really sure, but I imagine the Docker way here is to have a container whose job is to run pg_dump and save the output somewhere. Maybe someone can chime in with how they do this in their docker-compose setup? Or perhaps you can find some example online; backing up a Postgres database is not Synapse-specific.
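
For instance, something as simple as this run from cron might be enough (the service, user, and database names are placeholders; adapt them to your compose file):

```bash
# run pg_dump inside the running Postgres container, write the dump outside the container
docker compose exec -T postgres pg_dump -U synapse -Fc synapse > backups/synapse-$(date +%F).dump
```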

nicolamori commented 2 years ago

@reivilibre thanks for the quick and very detailed answer. Let me add some points and clarify some others:

Iruwen commented 2 years ago

My two cents: if you want to avoid most inconsistencies while keeping the server running, you probably need to configure replication and/or snapshots for both the volume holding the media and the Postgres database. That is, take a filesystem snapshot (supported by e.g. btrfs), back that up, then keep or throw away the snapshot; set up replication for Postgres and take a backup of the replica at the exact same time (with replication stopped, obviously, or just stop the replica and take a filesystem snapshot). Different cloud platforms have different ways to aid with the process (e.g. Amazon Fargate, RDS). One could also think about using something like https://github.com/matrix-org/synapse-s3-storage-provider with S3 or something compatible, e.g. a MinIO cluster, to achieve maximum data availability and integrity. There's a plethora of ways to solve the problem to different degrees, all of which are out of scope for Synapse itself. Even if it all works, there's still a chance that some buffered/cached data hasn't been written or replicated yet. The question when it comes to backups is "what is good enough". Ideally you avoid ever needing a backup to begin with, which would require HA capabilities, which Synapse doesn't have (yet).
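
As a very rough sketch of the snapshot part (assuming the media store lives on its own btrfs subvolume; paths are made up):

```bash
# take a read-only snapshot, copy it away, then drop the snapshot again
SNAP=/srv/synapse/.snapshots/media_store-$(date +%F)
btrfs subvolume snapshot -r /srv/synapse/media_store "$SNAP"
rsync -a "$SNAP/" backup-host:/backups/synapse/media_store/
btrfs subvolume delete "$SNAP"
```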

reivilibre commented 2 years ago

@nicolamori

The backup will then contain the media file but not the database entry (I assume that uploaded images are registered in the DB). Would restoring from this backup lead to an inconsistent Synapse state?

This is true, but it's not a big deal — the only cost there is the wasted disk space if you restore from this backup and don't clean it out. If you back up your database first and then 'rsync' your media directory somewhere, your database will be consistent and Synapse won't necessarily ever notice. If you do the other way around, you might lose some media files that then have DB entries, but it may not be a big deal for your use case.
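
A simple script that respects that ordering might look roughly like this (paths and names are placeholders):

```bash
# 1. dump the database first, so it never references media newer than the copy below
pg_dump -U synapse -Fc synapse > /backups/synapse-$(date +%F).dump
# 2. then copy the media store; anything uploaded after the dump is just harmless extra data
rsync -a /var/lib/matrix-synapse/media_store/ /backups/media_store/
```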

You'll probably find it much easier to keep it simple. Even if you lose some media that are tracked in the database, it's not going to be the end of the world — you might get errors downloading that piece of media but other than that, nothing too bad will happen.

@Iruwen makes some good points but I'd argue these are probably a lot more fiddly and complicated than many 'home users' care about — e.g. a loss of a few hours' worth of data isn't likely a big problem to me personally, so frequent pg_dumps are fine for me and I haven't bothered with database replication or storing media on a redundant cluster like minio.

nicolamori commented 2 years ago

@reivilibre thanks for the insights. I also understand and appreciate @Iruwen's point of view, but I'd definitely keep it simple unless that risks badly breaking everything. Potentially losing some media is not an issue for me, so I'd go with the plain pgsql dump (media are on MinIO so I don't explicitly back them up).

Iruwen commented 2 years ago

One should maybe note that the system doesn't fall apart when there are inconsistencies between media and its references stored in the database; the media will just be missing. Otherwise things like event/media retention policies would be a much bigger issue.

PS: replication is not a backup method - if you face any kind of data corruption, you'll end up with a distributed mess.

gwire commented 2 years ago

I'm currently backing up /etc/matrix-synapse/, a dump of the database, and the media directories.

The disk requirements are growing faster than I'd anticipated, so I was looking for documentation to tell me:

  • is it safe to skip backing up url_cache_thumbnails and url_cache? (the cache in the name suggests so) will these be repopulated if seen by clients?
  • is it safe to skip backing up *_thumbnails? If absent, will the server recalculate these files on demand?
  • is it safe to skip backing up remote_content? If absent, will the server repopulate these files on demand?

(I appreciate that remote resources can be withdrawn at any time, but I'm more interested in making sure that the resources spent on backups go toward being able to re-establish the local service.)

Does the database similarly contain remote server content, and if so is there a way to take a selective dump of local content in such a way that remote content would be repopulated on demand?

youphyun commented 2 years ago

I am still new to the topic. I simply started backing up the listed relevant files, including a full Postgres dump made with pg_dumpall. The whole process, including the locations to back up, will differ depending on how Synapse is installed (from a repository, in Docker, or in a virtualenv). I am not sure if and when I will need to restore the backups; I am afraid that will be quite some manual work. I found these pages with some useful details: https://www.gibiris.org/eo-blog/posts/2022/01/21_containterise-synapse-postgres.html and https://ems-docs.element.io/books/element-cloud-documentation/page/import-database-and-media-dump. One additional question about the media repo: will simply restoring the /media_store folder work, or should it rather be done using the export and import Synapse API calls?

FarisZR commented 11 months ago

I'm currently backing up /etc/matrix-synapse/, a dump of the database, and the media directories.

The disk requirements are growing faster than I'd anticipated, so I was looking for documentation to tell me:

  • is it safe to skip backing up url_cache_thumbnails and url_cache? (the cache in the name suggests so) will these be repopulated if seen by clients?
  • is it safe to skip backing up *_thumbnails? If absent, will the server recalculate these files on demand?
  • is it safe to skip backing up remote_content? If absent, will the server repopulate these files on demand?

(I appreciate that remote resources can be withdrawn at any time, but I'm more interested in making sure that the resources spent on backups go toward being able to re-establish the local service.)

Does the database similarly contain remote server content, and if so is there a way to take a selective dump of local content in such a way that remote content would be repopulated on demand?

@gwire have you found an answer yet? Because remote_content is crazy large on my server, and it doesn't make sense at all to back it up.

gwire commented 11 months ago

I'm not backing up remote_content. I have needed to recover after a disk issue, and found that, no, remote files don't auto-repopulate (at least on v4.1.x), so I ended up writing scripts to run a lot of tootctl media refresh and tootctl accounts refresh commands, plus direct downloads for custom emoji (which I couldn't see how to restore via tootctl).

kpfleming commented 11 months ago

tootctl is for Mastodon, not Synapse :-)

gwire commented 11 months ago

D'oh, yes. I have the same issue with Mastodon (backing up its remote content directories, or not) and I didn't check the context.