borgbackup / borg

Deduplicating archiver with compression and authenticated encryption.
https://www.borgbackup.org/

borg info and status with separate keys for monitoring #4170

Open engelant opened 5 years ago

engelant commented 5 years ago

This is a QUESTION and/or a discussion for an ENHANCEMENT

Since the concept of borg is genius, I use it for multiple off-site backups. That said, deploying it got me thinking, and more and more questions pop up around the web on monitoring and alerting. To date I perform manual checks, but obviously that's no long-term solution. Therefore I would like to put out some suggestions, which I would even consider implementing, but only if the collective likes the idea in the first place.

One thing I might mention first: to keep an uncorrupted chain of files, it is required to use a RAIDed FS with checksums on ECC memory (btrfs, zfs, ReFS if you really insist).

My suggestion: have each backup process write a kind of log file, which contains the available information (maybe even a user-selectable subset of it) in a JSON file (or whatever format, I don't really care). borg info ::my_backup_last_month is painfully slow, as it really has to look into the backup. This JSON file would be written by the backup process itself and read by any other borg process, so no exclusive lock is needed. In addition, the file would be encrypted with AES (or whatever), but not with the repo key; more in a PGP-like manner, so that a separate monitoring key can decrypt it without giving monitoring access to the repo itself.

That's the basic idea, from a very top-level view. The file would also need some kind of status field in it, and must not be read while it's being appended to, but I think one gets the general idea. Please feel free to correct me if I made some completely wrong assumptions about borg's architecture, as I am just inferring the inner workings of borg from my generic knowledge.

ThomasWaldmann commented 5 years ago

not sure why we would want to add such a bunch of complexity and additional code for monitoring.

if you want a log file and a pre-computed borg info, just save the log file when doing the backup and run borg info right after it. some commands already have --json option to produce json output.

store these files and borg's return code on the client, monitor the client and you have all you need.
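The workflow suggested above can be sketched as a small client-side wrapper: run `borg info --json` once, right after the backup, and cache the result for monitoring to read later. The repo/archive names and cache path below are made up for illustration, and the JSON field names follow my reading of borg's `--json` output, so verify them against your borg version.

```python
import json
import subprocess

def snapshot_archive_stats(repo, archive, out_path):
    """Run `borg info --json` right after a backup and cache the result,
    so monitoring can read it later without touching the repository.
    (Sketch only: repo, archive and out_path are example values.)"""
    result = subprocess.run(
        ["borg", "info", "--json", f"{repo}::{archive}"],
        capture_output=True, text=True,
    )
    record = {
        "returncode": result.returncode,
        "info": json.loads(result.stdout) if result.returncode == 0 else None,
    }
    with open(out_path, "w") as f:
        json.dump(record, f)
    return record

def summarize(info):
    """Pull a few headline numbers out of the cached info JSON.
    Field names are an assumption based on borg's documented --json layout."""
    stats = info["archives"][0]["stats"]
    return {
        "nfiles": stats["nfiles"],
        "deduplicated_size": stats["deduplicated_size"],
    }
```

A cron job could call `snapshot_archive_stats` after each `borg create`, and the monitoring side would only ever read the cached file.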

engelant commented 5 years ago

But won't borg info have to run through the backup again, even if called right after the backup? Why do I want to introduce a layer of complexity? Because it seems appropriate to extend borg in such a manner that one can quickly check the stats of the backups for monitoring purposes, while not having to read the data twice. That additional layer of complexity might be extracted into a separate program via a temporary "log file", but on the other hand this functionality seemed sufficiently related to the purpose of borg that it might get included. Also, there currently is no "status" pipe an external application could query to get the current write and network speed, maybe even overall progress (does borg know the amount of data to transfer in advance?). Or are deduplication and change detection done in sync with the transfer?

ThomasWaldmann commented 5 years ago

borg info on an archive reads the manifest and the metadata stream of that archive (not: all data) and computes stats based on that. If you do that right after a backup and store the result, you can quickly access it and you don't have to wait for it.

There is a "status pipe": it is the log output on stderr, which can be switched to json format for automated processing. borg does only one pass over all dirs/files. dedup is done on the fly, based on an in-memory hashtable.
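That stderr stream can be consumed line by line when borg runs with `--log-json --progress`. A minimal sketch of a parser, assuming the `archive_progress` message shape as I understand it from borg's documentation (field names should be checked against your borg version):

```python
import json

def parse_status_line(line):
    """Parse one line from borg's stderr (borg create --log-json --progress).
    Returns a small status dict for archive_progress messages, None otherwise.
    The message shape here is an assumption based on borg's docs."""
    try:
        msg = json.loads(line)
    except json.JSONDecodeError:
        return None  # a plain-text log line, not JSON
    if msg.get("type") != "archive_progress":
        return None  # some other message type (e.g. log_message)
    return {
        "nfiles": msg.get("nfiles"),
        "original_size": msg.get("original_size"),
        "current_path": msg.get("path"),
    }
```

An external monitor could attach this to borg's stderr and get a live view of file count and size processed, without borg needing any new "status pipe" machinery.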

wsw70 commented 4 years ago

@ThomasWaldmann

not sure why we would want to add such a bunch of complexity and additional code for monitoring.

In my opinion this is because it is important to know the status of the backups, directly from the backup program. When you check someone's pulse, you do that directly on that person - not by looking at their smartwatch where this is probably registered as well.

What I am trying to say is that the ability to ask a backup system about its status and always get a consistent answer (the status of the completed and possibly ongoing backups) is really important.

Borg is fantastic, don't get me wrong, and it has saved my bottom a few times - but that part (a simple ability to be integrated with a monitoring system) would be great.

I get your point about dumping info right after a backup. It adds complexity, but this is what I will do for now.

ThomasWaldmann commented 4 years ago

The easiest way, which you can already use right now, is to monitor the borg client (not the server). Borg gives a return code which you could use to interface with client-side monitoring.
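As a minimal sketch of that client-side approach: borg documents its return codes as 0 = success, 1 = warning (e.g. a file changed while it was being read), and 2 = error; the status labels below are my own invention for a hypothetical monitoring hook.

```python
def classify_borg_rc(rc):
    """Map borg's documented return codes to a monitoring status.
    0 = success, 1 = warning, 2 = error; anything else (e.g. 128+N
    after a signal) is treated as a hard failure."""
    if rc == 0:
        return "ok"
    if rc == 1:
        return "warning"
    if rc == 2:
        return "error"
    return "killed"
```

A cron wrapper could run `borg create`, feed `$?` through this mapping, and push the result into whatever alerting system the client already uses.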