openhab / openhabian

openHABian - empowering the smart home, for Raspberry Pi and Debian systems
https://community.openhab.org/t/13379
ISC License

Prepare useful software for backups #3

ThomDietrich closed this issue 6 years ago

ThomDietrich commented 8 years ago

(etckeeper is installed to protect against accidental deletion of files. (Check if period needs to be increased))

A real backup-to-external strategy is needed, as SD cards are known to fail at some point.

sihui62 commented 7 years ago

Which software?

Why not use something that is readily available and known to work? https://www.linux-tips-and-tricks.de/en/backup

ThomDietrich commented 7 years ago

I already looked into that and it looks quite nice. The only decision missing is the backup location. An external USB stick or drive is probably the best option for users. A normal user might not own a second server. Samba access to the user's desktop PC might be another option.

Are you actually using raspiBackup? Can you tell a little bit more about your setup, configuration, experiences and recommendations?

sihui62 commented 7 years ago

> An external USB stick or drive is probably the best option for users

Any mountable drive can be used as backup medium. Take a look under "Features"

> Are you actually using raspiBackup?

Yes. And because of two corrupted SD cards I had to use the restore function twice and ... it worked ;-) In my case I'm using one Pi as a DLNA and NAS server (NFS share) with an attached USB HDD; raspiBackup backs up this Pi to the attached hard drive. My second Pi with openHAB1 is backed up via an NFS mount to that first Pi using rsync (incremental backups save a lot of time). My third Pi is a Pi 3 with an attached USB SSD (no SD card at all); this one (openHAB2 testing) is also backed up via NFS mount to the first Pi's hard disk. This feature was implemented a couple of weeks ago after the developer of raspiBackup was asked for it. He is always willing to implement new features when asked. Another benefit could be the ability to use conf files (/usr/local/etc/raspiBackup.conf) for configuration purposes. I'm not a programmer, but maybe it is possible to give openHABian users the choice between different backup media through this. Don't hesitate to contact the developer if you need more information (he is a native German speaker). https://github.com/framps

cribskip commented 7 years ago

Hi, I'd like to advocate for borgbackup (https://borgbackup.readthedocs.io/en/stable/). I'm using it successfully in my home network for several nodes and things, e.g. the openHAB config.

The newest version even includes "versioning", i.e. it can show all versions of a file.

Cheers, Sascha

mstormi commented 7 years ago

Step back a second and think of the whole picture. If your SD card breaks, you need to reinstall openHABian, too. This is not as simple as the initial installation run and can take a lot of time, particularly since most people will have lots of non-openHAB, yet smart home related configs such as mosquitto, self-written scripts and various customizations that they accumulated over time. What a smart home user needs is two types of backup.

The first one would need to create a full raw device level copy of the SD card (or other medium, whatever you're using in your server) so that in case of a crash, one would be able to get the home back up and running as fast as possible, including the latest instances of all the non-openHAB customization stuff. My recommendation would be to attach the spare hardware (anyone should have spares of all hardware components, including the Pi and all in-use USB controllers or Pi addon cards, available anyway) and to run a simple raw device copy. I, for instance, run my Pi off its SD card, so I attached a USB card writer, and all I needed was to put an identically-sized SD card in, and use dd bs=4M if=/dev/mmcblk0 of=/dev/sdc to create an SD copy of my running OS image.
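A minimal sketch of the raw copy described above, wrapped in a few safety checks. It assumes GNU dd on Linux; the function name `clone_device` is made up for illustration, and the device names are simply the ones from the post, so check yours with lsblk first. Note that copying a mounted, running system can yield a slightly inconsistent image.

```shell
#!/bin/sh
# Hypothetical helper (not part of openHABian): raw-copy SRC to DST with dd.
# SRC and DST may be block devices (needs root) or plain image files.
clone_device() {
    src=$1
    dst=$2
    [ -e "$src" ] || { echo "source $src not found" >&2; return 1; }
    [ "$src" != "$dst" ] || { echo "source and target must differ" >&2; return 1; }
    # Refuse to overwrite a target that is currently mounted (Linux only).
    if [ -r /proc/mounts ] && grep -q "^$dst" /proc/mounts; then
        echo "target $dst is mounted, refusing to overwrite it" >&2
        return 1
    fi
    # bs=4M as in the post; conv=fsync makes dd flush the data to the
    # target before it exits (GNU dd option).
    dd if="$src" of="$dst" bs=4M conv=fsync
}

# Example (destroys everything on the target device):
# clone_device /dev/mmcblk0 /dev/sdc
```

The checks are deliberately simple; they only catch the most obvious mistakes, such as a typo in the source name or a target that is still mounted.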

The other backup method would merely be a long-term archive system. It should include at least the ability to schedule backups, to do versioning, have a proper UI and reporting and ideally the ability to back up devices other than the openHABian server itself, too. For this purpose, I'd recommend using Amanda. It's a professional (yet free, open source), highly flexible, proven system. I operated it almost 20 years ago to back up all of our university's network, and when I just checked because of this issue here I was happy to see it's still alive and maintained. There's a Raspbian package available, too. It's not restricted to Pi hardware or Debian/Raspbian/openHABian; there's also a Windows client available, so there's a fair chance you can use it to back up your Windows based KNX gateway or even your desktop PCs, too. Since I know it can do raw device dumps (such as that of the Pi's internal SD card /dev/mmcblk0) and can restore to user-specifiable devices, too, I guess it should be possible to also use it to create the SD card copies for emergency recovery that I mentioned above, for which I'm still using dd. Give me a couple of rainy days to have a closer look...

maschenborn commented 7 years ago

Never heard of Amanda. Sounds interesting. I'm using openHABian on a Pi3, moved to a USB stick via the openHABian function. For items, rules etc. I like a git repository instead of a backup, since it's easier for me to check changes. But if the stick or the SD card crashes, I only have a month-old, manually created backup.

What could an Amanda workflow look like? Installing Amanda on the same Pi3 as OH? Then connecting an HDD or another USB stick? Then backing up the SD card and the stick each night? I fear that restoring will be much more effort for me than restarting with a fresh openHABian environment (even if I have installed and changed a lot of settings...)

mstormi commented 7 years ago

A git repository is nice, but we need to offer backup for users with offline servers, too, for those that want to be independent of cloud providers and internet connectivity. Yes, one would install Amanda on the Pi (openHABian server, that is) and attach dedicated storage. You can use any sort of storage with Amanda: USB HDD, NAS mount, USB stick, whatever, even AWS cloud storage. It's up to the user, and - with one exception - we (as openHABian) shouldn't mandate specific HW. The exception I mentioned is that we need an SD card writer USB-attached to our server, to be able to restore the OS image to another SD card. That card writer you will have already and will need to have anyway, in one way or another.

Amanda was originally built to run on removable tape media, so you could use a set of USB sticks or other removable media and Amanda takes care of inventory tracking. You need to 'label' removable media (tapes or sticks or cards) and in the evening before the backup is scheduled to run, Amanda checks via cron whether the right one is put in (USB attached). If needed, it'll send you a mail reminder to plug in the right one. My proposal for a default system would be to permanently attach that card writer to the openHABian server and to then have a set of SD cards in rotation. Amanda handles them like removable tapes. You can use cards of any size as long as the cards in the set are equal in size. If needed, Amanda distributes larger dumps (such as the internal SD card's one) across multiple media. From Amanda's perspective it's just a device, so any user could insert USB sticks instead as well; however, there are risks associated with that, such as the device name changing or the stick being used for purposes other than backup (that can happen with an SD card, too, but is less probable).

Backups are initiated nightly via cron, or less frequently, or manually as often or seldom as you like. It's usually doing level 1 dumps only, i.e. only files that changed since the last backup run are copied. In effect, you will just need a cumulative amount of SD card storage of maybe 2 to 3 times the size of your system card. You can specify the maximum cycle until another level 0 dump (copy everything) is forced and Amanda will automatically do the scheduling for you. One month will probably be a reasonable period.

The default (meant to optimize backing up multiple clients) is to have a 'holding disk' as a dump cache, so multiple clients can be backed up in parallel there before they're written to tape/SD card. I guess I'll disable that as an openHABian install default because it would require us to have an intensively-used temporary storage area, and we should minimize writing to SD cards to avoid running into corruption issues. Anyone who has another USB or NAS storage area (like me) can still activate the feature.

Reporting is done via mail. Restoring is done with a command line tool. If you tell it the file or partition to restore, it presents you with a list of all backup generations of that file or partition, and you can specify which one and where to restore it to.
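To make the level 0 / level 1 distinction concrete, here is a small illustration using plain tar and find. It is not how Amanda is implemented or configured; the function name `backup_run` and the `.last-run` stamp file are made up for this sketch. The first run archives everything (level 0); later runs archive only files modified since the previous run (level 1).

```shell
#!/bin/sh
# Illustration only: mimics full vs. incremental runs with tar and find.
# This is NOT Amanda; all names here are hypothetical.
backup_run() {
    src=$1                          # directory to back up
    mkdir -p "$2" || return 1
    dest=$(cd "$2" && pwd)          # absolute path of the archive directory
    stamp="$dest/.last-run"
    list="$dest/.filelist.$$"
    # Take the new timestamp before scanning, so files changed while the
    # run is in progress are picked up again next time.
    touch "$stamp.new"
    if [ -f "$stamp" ]; then
        level=1                     # only files modified after the last run
        (cd "$src" && find . -type f -newer "$stamp") > "$list"
    else
        level=0                     # no previous run: copy everything
        (cd "$src" && find . -type f) > "$list"
    fi
    archive="$dest/level$level-$(date +%Y%m%d-%H%M%S).tar"
    tar -cf "$archive" -C "$src" -T "$list" || return 1
    rm -f "$list"
    mv "$stamp.new" "$stamp"
    echo "$archive"                 # print the path of the new archive
}

# Example:
# backup_run /etc/openhab2 /mnt/backup/openhab2
```

Deleting the stamp file forces the next run to be a full one again, which is roughly what the "maximum cycle until another level 0 dump" setting automates.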

As I wrote, we need a two-fold approach. In addition to a file-based storage mechanism, the raw SD card/partitions should be backed up. If needed, Amanda can then be used to restore the complete (most recent) OS image to a new SD card, including all current custom programs and settings. That's a lot faster than a fresh install, and even more important, it'll include ALL of your modifications. An openHABian re-flash is rather quick, but it can take several hours if not weeks to identify(!) all the parts of the system that you modified and to re-install and configure all those mods/addons.

The only thing we need to take care of is that we have a working SD clone of our openHABian/Amanda server... a backup of the backup server, that is.

ThomDietrich commented 7 years ago

Hey guys, I'm also using git to track and store my configuration. I'd agree that this is not the ideal solution for unattended backups. Another simple solution is to use a synchronization service as described in #113

I'd separate two use cases.

  1. Backup of the complete SD card
  2. Backup of a selected set of important user files

The first use case could for example be solved with RaspiBackup mentioned by @sihui62. I am personally not interested in this solution but can totally understand the value of it to others. RaspiBackup earned my mistrust in the beginning because it's only aimed at the oh-so-special RPi :smile: but it actually looks quite good! I'll give it a go in the next couple of days.

Regarding the second use case I want to mention Duplicati. It provides a web frontend and is very flexible to the end user's needs. I can see it as a "hassle-free" backup solution. I've been using version 1 on PCs of relatives and friends for a few years now and will give version 2 a try in the next couple of days. https://www.duplicati.com - https://alternativeto.net/software/duplicati/?license=opensource&platform=linux

I'll have to look at the other solutions. In the end it would be great to have one solution with all the discussed functions and benefits.

@mstormi Amanda looks amazing... as an enterprise backup solution. My first impression is that it is not easy and intuitive to allow "hassle-free" modifications by the end user. That should be the primary goal here! @cribskip I fear the same goes for Borg. Wdyt?

mstormi commented 7 years ago

Well, I partly agree that Amanda doesn't look as 'sexy' as the other, simpler options at first glance. Setting up Amanda isn't all that easy and intuitive, but the main reason for that is that it's so flexible. But since we (well, me, I'd be doing that) will create a standard setup as part of openHABian, that will not be any user's problem. And if a user still wants to tailor it to his needs (more files, different storage media, compression, more clients, ....), he can do so rather easily if he is starting from the standard setup. Adding another (own) directory to be backed up is as easy as adding one more line to a file.
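Assuming the file meant here is Amanda's disklist, whose entries have the form "hostname diskname dumptype", adding a directory could be sketched as follows. The helper name, the config name "openhab-dir" and the dumptype "comp-user-tar" are assumptions for illustration; the dumptype has to exist in the amanda.conf actually in use.

```shell
#!/bin/sh
# Hypothetical helper: add a directory to an Amanda disklist unless the
# exact entry is already there. Entry format: <hostname> <diskname> <dumptype>
add_disklist_entry() {
    disklist=$1
    host=$2
    dir=$3
    dumptype=$4
    entry="$host $dir $dumptype"
    # -x matches whole lines only, -F treats the entry as a fixed string.
    if [ -f "$disklist" ] && grep -qxF "$entry" "$disklist"; then
        return 0
    fi
    echo "$entry" >> "$disklist"
}

# Example (path, config name and dumptype are assumptions):
# add_disklist_entry /etc/amanda/openhab-dir/disklist localhost /home/openhabian/scripts comp-user-tar
```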

Then again, once up and running, the user interface is simple and straightforward, and it's fully automated in daily operation, thus absolutely hassle-free. That's what it was really built for.

Unlike the other options, Amanda has been in such widespread use for such a long time that it has become a very mature, reliable system. And all the documentation and know-how is available in the Wiki and mailing list archives, ranging from simple end user FAQs to expert level troubleshooting support.

mstormi commented 7 years ago

Thomas, which format do you expect contributions in? README, an install.sh or a Debian package? I have Amanda working on my openHAB Pi now and documented the steps - well, more or less, to be verified by you and others :) As you can have multiple configs and choose which one(s) to use at backup runtime, I thought I'd create one config to work on SD cards, one on a share basis (NAS like my own HW setup, or permanent USB-attached storage), and one based on the AWS S3 cloud service (Amazon offers 5 GB free, so even the SD card dump could fit :)). They all allow for backing up/restoring the raw SD card contents and individual dirs/files. So a user can use it to create an SD card clone as well as for long-term archiving/selective restore of openHAB and private files. I believe this should cover most use cases, shouldn't it? (Don't worry, it's still really simple for a user to extend these configs if he feels he needs to.)

Either way, independent of the software solution, what we need is a list of dirs/files that we should back up by default, to include all relevant system and openHAB files. As you probably have the best overview of what packages openHABian adds and where it puts the config files, could you provide an initial list?

mstormi commented 7 years ago

@ThomDietrich have you read my previous post?

ThomDietrich commented 7 years ago

Hey @mstormi, now I did :) All sounds good to me.

You are talking about raw SD card backups and specific folders/files backups. Are you going to address both with Amanda? The only openHAB folders to back up are /etc/openhab2 and /var/lib/openhab2. Additionally, backing up the configurations and data of the optional components (if installed) would be logical. If you prepare the basic configuration I can add the needed paths later.
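Independent of the backup tool, the file-level part can be sketched with plain tar. The two paths are the ones named above; the helper name `backup_dirs` and the target directory in the example are made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: archive the given directories into one dated tarball
# below DEST and print the path of the archive.
backup_dirs() {
    dest=$1
    shift
    mkdir -p "$dest" || return 1
    archive="$dest/openhab-files-$(date +%Y%m%d-%H%M%S).tar.gz"
    # GNU tar strips the leading "/" from member names (with a warning),
    # so the archive can later be unpacked below any directory.
    tar -czf "$archive" "$@" || return 1
    echo "$archive"
}

# Example (run as root so that all files are readable):
# backup_dirs /mnt/backup /etc/openhab2 /var/lib/openhab2
```

Paths of optional components could simply be appended to the argument list.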

You can contribute your finished solution as a PR or post the solution as a step by step instruction set (which I would need to convert to a PR).

If you are going to build a PR, here are a few hints:

mstormi commented 7 years ago

OK, will try building the PR. It will be quite a bit of work, so it will take some time. Can I assume we have a working mail system? Is there any known reusable admin mail address at this stage?

ThomDietrich commented 7 years ago

There is no mail system configured nor is one prepared. We've discussed this in the past and decided against it. Please leave notifications out for now.

I thought about adding openHAB items for system status information but haven't done that yet. An item could also be used to inform about the last successful backup and such. This should be part of a separate PR.

mstormi commented 7 years ago

Argh, things are getting complicated. I've got the directory (NAS mount or permanent USB attached storage) and AWS S3 based variants working but fail on the local SD one, and I feel that's the most important one because it's the only one to work in private without connectivity and cloud services and without a need to dive into Linux (mount from NAS or create filesystem on USB storage etc).

Currently it seems we need a more current version of the Raspbian Amanda package for this to work. (I contacted the maintainer but no answer yet.) How do you think we should proceed? Omit that SD option for now? Write a big README to explain all the filesystem basics (not an easy task at all, and still very prone to user side errors!)? Build a current package ourselves (big effort and risk of running out of sync)?

PS: all of these variants will allow for both, backing up files as well as the raw SD card. I'll disable backing up the raw SD for S3, though, as that would require way too much bandwidth.

ThomDietrich commented 7 years ago

Hey, in favor of getting a first version ready for review and testing I'd leave the SD card based backup out for now. I'd see the option to backup to a filesystem path (HDD, NAS, ...) as the basic requirement. Every user is able to flash the backup SD card image if necessary. I personally see no big advantage in the direct SD card replication. What's your opinion on that?

Regarding the readme: Please keep it as short as possible and include links to relevant online articles. You/We will not win a prize for describing how to mount a USB drive for the 1000th time.

mstormi commented 7 years ago

Well, I still feel the SD backup to be the most favorable option because S3 requires connectivity plus a cloud storage service, and the filesystem path first of all requires you to have an HDD, NAS or USB stick and to know how to mount it. But I'll leave it out for now. Will readdress it when there's news such as a Raspbian Amanda package update.

mstormi commented 7 years ago

Here you are.

Try to get it to work and let me know where you encounter issues. I'm sure there will be some :)

metbril commented 7 years ago

Just curious. How can one create a raw SD card copy of a running system with open and constantly changing files? Like my InfluxDB database?

ThomDietrich commented 7 years ago

It's always recommended to stop services before taking a backup. However, most systems will survive these live backups, much as they survive sudden power losses. A database, for example, implements mechanisms to roll back unfinished transactions. That said, just like with power loss, avoid it if possible.
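A minimal sketch of the "stop services first" advice, assuming a systemd-based system. The wrapper name is made up, and the unit names in the examples (influxdb, openhab2) are assumptions that should be checked with systemctl on the actual system.

```shell
#!/bin/sh
# Hypothetical wrapper: stop a systemd service, run a backup command,
# then start the service again. Needs root to control the service.
with_service_stopped() {
    service=$1
    shift
    systemctl stop "$service" || return 1
    "$@"                    # the backup command and its arguments
    status=$?
    # Restart the service even if the backup command failed.
    systemctl start "$service"
    return $status
}

# Examples (unit names and paths are assumptions):
# with_service_stopped influxdb tar -czf /mnt/backup/influxdb.tar.gz /var/lib/influxdb
# with_service_stopped openhab2 tar -czf /mnt/backup/openhab2.tar.gz /etc/openhab2 /var/lib/openhab2
```

The wrapper returns the exit status of the backup command, so a cron job can still detect a failed backup after the service has been restarted.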

Btw: I personally would always prefer the more specific/selective approach: https://docs.influxdata.com/influxdb/v1.2/administration/backup_and_restore

@mstormi I'll come back to you shortly!

mstormi commented 7 years ago

Yes, just as @ThomDietrich says, most systems will survive this, including those restored from a backup taken with an active DB under load. While the concern is not complete nonsense, it's more of a myth with its origin in the early IT days (remember Sybase? ...) than still a real problem nowadays.

It never hurts to add a separate database backup cycle, and if I ran Influx, I'd do as mentioned in the previous post, but as part of the Amanda setup specifically I will not take care of any databases that might be running. Still, if you have a script to automate that, drop us a note, maybe we can integrate it.

mstormi commented 6 years ago

Do we still need this issue or can we close it now that Amanda is available in openHABian?

ThomDietrich commented 6 years ago

I guess not. Everyone: If an aspect mentioned in this issue is still worth a discussion, please open a new issue!