r-raymond / nixos-mailserver

A complete and Simple Nixos Mailserver
GNU General Public License v3.0

Implement automatic backup system #34

Closed r-raymond closed 6 years ago

r-raymond commented 7 years ago

Maybe with a local/offsite option using rsync

phdoerfler commented 7 years ago

Very good idea. Additional thoughts: there is rsnapshot, which provides backup functionality using rsync. Also, I personally find it nice if my software provides me with simple stats such as "deleted 842 messages since yesterday" so I can spot when something goes horribly wrong. Otherwise I might not notice the need to restore a backup until it is too late. But those are of course fancy extras.
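To make the stats idea concrete, here is a minimal sketch that compares two listings of a Maildir and reports how many messages disappeared. It assumes the standard Maildir layout (`cur/` and `new/` subfolders, flags appended after a colon); the function names are made up for illustration and are not part of SNM.

```python
from pathlib import Path


def message_ids(maildir: Path) -> set[str]:
    """Collect the unique part of each message filename in cur/ and new/."""
    ids = set()
    for sub in ("cur", "new"):
        folder = maildir / sub
        if not folder.is_dir():
            continue
        for entry in folder.iterdir():
            # Maildir appends flags after a colon (":2,S" and so on). Strip
            # them so a flag change is not counted as a delete plus an add.
            ids.add(entry.name.split(":", 1)[0])
    return ids


def summarize(previous: set[str], current: set[str]) -> str:
    """Describe what changed between two snapshots of message ids."""
    deleted = len(previous - current)
    received = len(current - previous)
    return f"deleted {deleted} messages, received {received} since last snapshot"
```

Running `message_ids` once per backup and keeping the previous result around would be enough to print such a line after every run.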

r-raymond commented 7 years ago

Awesome idea. Seems like awstats would be a good candidate for that. At first glance it seems we would need to get some patches into nixpkgs to make this work.

jbboehr commented 7 years ago

I was looking into a full-ish backup for my system (/var /home /etc/nixos), and I'm partial to duplicity's support for one-way encryption using a GPG public key. Most other systems don't seem to support this, probably for technical/convenience reasons.

I looked into all of the backup systems in nixpkgs and none of them were easy to configure and had support for one-way encryption.

This seems like it could be something more generally useful, unless there were some specific optimizations you were planning for email.

r-raymond commented 7 years ago

Interesting thought. I don't have anything special planned, but it should be a simple "I want backups" tick box for SNM, since most users probably don't care where the stuff to back up is stored.

So I guess we could add a simple systemd module that periodically calls duplicity.

Do you have any idea how to do key management? Should the GPG key be created on the fly? Also we would need to remind the user to make a manual backup of the key :)

jbboehr commented 7 years ago

I guess practically speaking asymmetric encryption might not be that useful since the data is already located where the private key would be stored, unless there's value in making deleted data inaccessible to the server, or I'm not thinking of something.

If one was to do one-way asymmetric encryption, I assume you would specify the public key as part of the configuration and keep the private key offline only for restoring backups, otherwise you could use symmetric encryption or just generate the keys.
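As a rough sketch of the one-way variant: the server only ever sees the public key, so it can encrypt an archive but cannot decrypt it later. The helper below just builds a plain `gpg` command line rather than using duplicity; the function name, the archive path and the key ID are placeholders, and it assumes the recipient's public key has already been imported into the keyring of the user running it.

```python
def gpg_encrypt_command(archive: str, recipient: str) -> list[str]:
    """Build a gpg invocation that encrypts `archive` to a public key.

    Only the public key for `recipient` has to be present on the server;
    the matching private key stays offline until a restore is needed.
    """
    return [
        "gpg",
        "--batch",                  # never prompt, this runs unattended
        "--encrypt",
        "--recipient", recipient,   # key ID or email of the backup key
        "--output", f"{archive}.gpg",
        archive,
    ]
```

The resulting list can be handed to `subprocess.run(..., check=True)` from whatever timer or cron job drives the backup.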

I'm not particularly fond of duplicity, except that it does support encrypting backups with only a GPG public key, so if we don't think that's valuable, another option may be preferable. I do think some kind of encryption is important, however.

My main original point was just that backups are useful in general and adding the mail directory to the system backup would be trivial, so the "best" setup would be a solid general backup solution that the mail directory can be included in. If one of the solutions already in NixOS meets these criteria, then we could add simple settings or documentation on using it in conjunction with this project.

jbboehr commented 7 years ago

FYI, I'm going with tarsnap on my personal setup for the moment.

r-raymond commented 7 years ago

Well if someone is on the server then yes, but I guess the idea is that the backups can be put somewhere you do not trust.

Looking at the backup services implemented in NixOS (basically tarsnap and rsnapshot), I'd rule out the first for being a paid service and the second for not allowing backups to be pushed from the server.

So I think a custom cron script or systemd unit is probably the best option right now.

jbboehr commented 7 years ago

> the idea is that the backups can be put somewhere you do not trust.

Well, you can do that with symmetric encryption.

phdoerfler commented 6 years ago

@r-raymond regarding rsnapshot: I would have just set up something on my local computer that would every once in a while tar the rsnapshot backup and pull it. It then gets backed up again locally by Time Machine. But I suppose you'd want something that does not require anything additional on a client computer? If you want the backups to be somewhere off-site, it sounds like you want something like tarsnap, but free? Or is your plan to push backups to a local computer?

r-raymond commented 6 years ago

I think it would be convenient if one could simply specify an IP and the tarballs get copied there via ssh. If one really wants local backups, specifying "backup@localhost" would work too.

I just think the backups should be pushed rather than pulled, because the latter would require some logic on an external server, whereas the former can be implemented on the mail server.
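A minimal sketch of what such a push could look like, with all of the logic living on the mail server: tar the mail directory, then hand the archive to `scp`. The function names, the archive naming scheme and the remote directory are assumptions made for illustration; only the "backup@localhost" style target comes from the idea above.

```python
import tarfile
from datetime import datetime
from pathlib import Path


def archive_name(now: datetime) -> str:
    """Timestamped file name so successive pushes do not overwrite each other."""
    return f"mail-{now:%Y%m%dT%H%M%SZ}.tar.gz"


def create_archive(mail_dir: Path, out_dir: Path, now: datetime) -> Path:
    """Pack the whole mail directory into a gzip-compressed tarball."""
    archive = out_dir / archive_name(now)
    with tarfile.open(archive, "w:gz") as tar:
        # Store paths relative to the mail directory's own name.
        tar.add(mail_dir, arcname=mail_dir.name)
    return archive


def scp_command(archive: Path, target: str, remote_dir: str) -> list[str]:
    """Build the scp call that pushes the archive to e.g. backup@localhost."""
    return ["scp", str(archive), f"{target}:{remote_dir}/"]
```

The server-side job would call `create_archive` and then run the list returned by `scp_command` with `subprocess.run(..., check=True)`; nothing has to be installed on the receiving side beyond an ssh account.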

phdoerfler commented 6 years ago

Another idea: https://wiki2.dovecot.org/Tools/Doveadm/Sync allows you to perform backups. This relies on dovecot, which already handles the mail anyway. With this one can easily convert the Maildirs into mboxes, which are great for archiving. This is actually pointed out by the dovecot documentation. One advantage I see is that this is potentially more consistent (as in an atomic operation) than, say, copying the contents of the Maildir while dovecot may or may not be shuffling things around in there.

r-raymond commented 6 years ago

This is an awesome idea, I'll look into using this for implementing the backup system. Thanks for the link!

phdoerfler commented 6 years ago

Bear in mind though that dsync seems to be a bit more involved in terms of configuration.

I threw together a quick backup for my own instance of this mailserver with rsnapshot and that works really well so far. Shall I submit this as a PR? It's dead simple and we could make it a NixOS module that can be enabled independently. At the very least it provides diffs that allow you to check what went wrong and to go back in time. You can also add a script that gets executed after rsnapshot has run, which could then copy the backup to a remote host via scp, for instance. It's beautifully simple, really, and very much better than nothing, I believe.
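For illustration, here is a sketch that renders a small rsnapshot configuration with a post-backup hook. This is not the configuration from my setup; the paths, retention values and script name in the usage note are placeholders. The directive names (`snapshot_root`, `cmd_rsync`, `cmd_postexec`, `retain`, `backup`) are rsnapshot's own, and rsnapshot insists that fields are separated by tabs, which is the main thing the helper takes care of.

```python
def rsnapshot_conf(snapshot_root, mail_dir, retain, rsync_path, postexec=None):
    """Render an rsnapshot.conf as a string with tab-separated fields.

    retain   -- list of (name, count) pairs, for example [("hourly", 6)]
    postexec -- optional script to run after each backup, for example one
                that copies the newest snapshot to a remote host via scp
    """
    rows = [
        ("config_version", "1.2"),
        ("snapshot_root", snapshot_root),
        ("cmd_rsync", rsync_path),
    ]
    if postexec:
        rows.append(("cmd_postexec", postexec))
    rows += [("retain", name, str(count)) for name, count in retain]
    # Back up the local mail directory into a "localhost/" subfolder.
    rows.append(("backup", mail_dir, "localhost/"))
    return "".join("\t".join(row) + "\n" for row in rows)
```

Something like `rsnapshot_conf("/var/rsnapshot/", "/var/vmail/", [("hourly", 6), ("daily", 7)], "/usr/bin/rsync", "/usr/local/bin/push-backup")` would give a file that rsnapshot can be pointed at with `-c`; in a NixOS module the same text would be generated by Nix instead.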

r-raymond commented 6 years ago

Sounds great, looking forward to the PR :)

phdoerfler commented 6 years ago

Just as a heads up: I'm currently working on this. I am improving my current solution, which solely used rsnapshot without any additional fanciness, and I'd like to outline my thought process on this:

r-raymond commented 6 years ago

Cool. Did you get around the problem with the pushing vs. pulling? I.e. is the backup logic on the server?

phdoerfler commented 6 years ago

TL;DR: dsync is not as awesome as I thought, so this will take a bit longer.

@r-raymond I have not yet tackled the pushing vs. pulling. I don't think this will be a problem at all. I'm completely counting on rsnapshot's ability to launch custom scripts during and at the beginning / end of the backup process.

I have come up with a rather involved solution using both dovecot's dsync and some of nix's functionality. During that process I discovered that dsync apparently does not lock the maildir while exporting, contrary to my own assumption. This was a bit of a letdown. From dsync's man page:

  1. Run doveadm sync once to do the initial conversion.
  2. Run doveadm sync again, because the initial conversion could have taken a while and new changes could have occurred during it. This second time only applies changes, so it should be fast.

Note: dsync is merely an alias for doveadm sync. Also this is about converting a mailbox to a different format. However I assume that dsync's backup code path suffers from the same problem.
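The two steps from the man page could be scripted roughly like this. It is only a sketch: the user and destination are placeholders, the helper names are invented, and the command is injected through a `run` parameter so the logic can be tried without dovecot installed. It also does not solve the locking problem, it merely narrows the window.

```python
import subprocess


def sync_command(user: str, destination: str) -> list[str]:
    """Build a `doveadm sync` call for one user and one destination."""
    return ["doveadm", "sync", "-u", user, destination]


def two_pass_sync(user: str, destination: str, run=subprocess.run) -> None:
    """Run the same sync twice, as the man page suggests.

    The first pass does the bulk of the work. The second pass only has to
    apply whatever changed while the first one was running, so it is short.
    """
    cmd = sync_command(user, destination)
    for _ in range(2):
        run(cmd, check=True)  # stop immediately if doveadm reports an error
```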

There are multiple ways to move on from here. At the very least I learned a lot about writing nix expressions, so the hours I poured into this are not completely wasted.

One could:

There are really two sides to the coin here. When using vanilla rsnapshot the backup will be very fast (typically less than a second on my vultr VPS). Backing up using a bash script in the process is a lot slower and will temporarily need more disk space, since dsync will create one complete copy of the maildirs before rsnapshot then throws away everything that remained unchanged. But when implementing the locking this will be consistent. And if in doubt I gravitate towards slow but consistent.

Thoughts?

Also seeing how this became rather long and is more of a conversational nature: Might I suggest creating a gitter channel for this project?

r-raymond commented 6 years ago

I would just ignore the problem. I mean, yes, I can construct a scenario where we would miss an email during backup, but it is very unlikely, and since we should have multiple backups, the chance of it missing in all backups is probably less than the chance of being eaten by a shark. I vote for vanilla rsnapshot.

Thanks for pouring all this time into the problem, I hope you get to profit from the newly learned nix-foo!

About gitter: I have never used gitter before; is this beneficial for a small project like SNM?

phdoerfler commented 6 years ago

Alright. Vanilla rsnapshot it is then. I noticed that dsync does tell you in the end if the export was incomplete because things happened during the backup, so there is at least that. But in the interest of keeping things simple I'll revert to the simple stupid backup and save the complicated one for a rainy day when I'm really bored (AKA never).

re gitter: I've seen smaller projects than this one use gitter and I believe it is rather simple to set it up. If you don't mind my rather lengthy posts in the issues section then don't bother setting it up though.