borgbackup / borg

Deduplicating archiver with compression and authenticated encryption.
https://www.borgbackup.org/

[Feature Request] Add layers option/notation to `extract`, `mount` and `export-tar` #5425

Open srkunze opened 4 years ago

srkunze commented 4 years ago

What?

Add layers option/notation to extract, mount and export-tar.

Why?

Use case: #5413

Extracting a complete borg archive is sometimes impossible due to space restrictions (and can take forever). Updating an existing borg archive by partitioning the repository manually (different archives for different source folders) is also problematic: these hand-crafted partitions can grow to an unmanageable size, and in the end they need management of their own.

How?

Layering multiple archives can be done with external tools such as overlayfs, but this becomes cumbersome after many partial backups over the years.

The idea is to let the export commands (extract, mount and export-tar) stack several archives in the order of their creation date, so that files from newer archives shadow files from older ones.

# archive all the pictures, documents and movies
borg create --list /path/to/repo::myfirst ~/pictures/ ~/documents ~/movies
# after a year, backup 2020 pictures
borg create --list /path/to/repo::mysecond ~/pictures/2020-*
# after a second year, we need some of the stuff back
borg mount /path/to/repo::myfirst::mysecond /mnt/point/

Details for mount

Idea: create an overlayfs mount point for each additional archive.

Example: 4 archives (a1, a2, a3, a4) in a repository (/path/to/repo):

borg mount /path/to/repo::a1::a2::a3::a4 /mnt/point/

or

borg mount /path/to/repo::a* /mnt/point/

Steps (or all at once if supported by the layering library):
1) mount overlayfs for a1 and a2 -> o1
2) mount overlayfs for o1 and a3 -> o2
3) mount overlayfs for o2 and a4 -> /mnt/point/
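For illustration, the "all at once" variant can be approximated today with plain overlayfs, which accepts several colon-separated lower directories and treats the leftmost one as the top layer. The sketch below only prints the commands instead of running them. The helper names (build_lowerdir, print_layered_mount) and the /mnt/layers staging directory are made up for this example and are not part of borg. It also assumes the per-archive mount points already exist, that the overlay mount is run as root, and that the kernel accepts a FUSE mount as a lower layer, which may not hold everywhere.

```shell
#!/bin/sh
# Hypothetical helper: build the overlayfs lowerdir= value for archives that
# are passed oldest first. overlayfs treats the leftmost lowerdir as the top
# layer, so the list is reversed to let newer archives shadow older ones.
build_lowerdir() {
    base=$1; shift
    lower=""
    for name in "$@"; do
        # Prepend, so the last (newest) archive ends up leftmost.
        lower="$base/$name${lower:+:$lower}"
    done
    printf '%s\n' "$lower"
}

# Hypothetical helper: print (do not run) one borg mount per archive,
# followed by a single read-only overlay mount combining all of them.
print_layered_mount() {
    repo=$1; base=$2; target=$3; shift 3
    for name in "$@"; do
        printf 'borg mount %s::%s %s/%s\n' "$repo" "$name" "$base" "$name"
    done
    printf 'mount -t overlay overlay -o lowerdir=%s %s\n' \
        "$(build_lowerdir "$base" "$@")" "$target"
}

print_layered_mount /path/to/repo /mnt/layers /mnt/point a1 a2 a3 a4
```

With only lower directories and no upperdir, the resulting overlay is read-only, which matches what a mounted backup should be anyway.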

Details for umount

borg umount /mnt/point/

Dismantles the layers in reverse order of creation.
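A rough sketch of that teardown, assuming the layers were built as one overlay on top of individual borg mounts under a staging directory. The helper name print_layered_umount and the /mnt/layers path are hypothetical, and the commands are only printed, not executed.

```shell
#!/bin/sh
# Hypothetical helper: print the teardown commands. The combined overlay goes
# first, then the per-archive borg mounts, newest archive first.
print_layered_umount() {
    base=$1; target=$2; shift 2
    cmds=""
    for name in "$@"; do
        # Prepend each line, so archives passed oldest first come out reversed.
        cmds="borg umount $base/$name
$cmds"
    done
    printf 'umount %s\n%s' "$target" "$cmds"
}

print_layered_umount /mnt/layers /mnt/point a1 a2 a3 a4
```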

Impact

Alternative via option --layers

borg mount --layers a1::a2::a3::a4 /path/to/repo /mnt/point/

or

borg mount --layers a* /path/to/repo /mnt/point/

Bonus: specifying the order of the layers

The order should usually follow the creation date of the respective archives. If needed, one could add another option --layer-order that uses the :: separator syntax to specify an explicit order for specific archives.
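To make the default ordering concrete: borg 1.x can already list archive names sorted by timestamp, so the proposed a1::a2::... notation could be derived from that listing. The join_layers helper below is hypothetical, and the :: layer notation itself is only the proposal from this issue, not existing borg syntax.

```shell
#!/bin/sh
# Hypothetical helper: join archive names (one per line on stdin, oldest
# first) into the proposed "a1::a2::..." layer notation.
join_layers() {
    spec=""
    while IFS= read -r name; do
        [ -n "$name" ] || continue          # skip empty lines
        spec="${spec:+$spec::}$name"
    done
    printf '%s\n' "$spec"
}

# Stand-in for a real listing, oldest archive first.
printf 'a1\na2\na3\na4\n' | join_layers

# Against a real repository (assuming borg 1.x options, not run here):
#   borg list --sort-by timestamp --glob-archives 'a*' \
#       --format '{archive}{NL}' /path/to/repo | join_layers
```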

Remark

overlayfs is just a placeholder. Plug in your favorite layering fs here.

ThomasWaldmann commented 4 years ago

I somehow get the feeling that this is getting way out of scope of what should be in borg (and also of what borg should be used for), adding quite a lot of complexity that is hard to implement and to maintain.

It looks like this is not about creating a backup of original data, but about managing original data in a rather special way.

backup means that you make a copy from original data to some other location, so that you have it twice (at least).

But as you say, you do not have enough space to extract a complete archive. That basically means that not all of the original data is still in its original place. Thus your borg repo is not a backup, but contains the only copy of parts of the data.

srkunze commented 4 years ago

way out of scope

That's fine. I am merely looking for solutions.

backup means that you make a copy from original data to some other location, so that you have it twice (at least).

The borg documentation describes the duplication very much the way I planned to do it. I don't see an issue here.

https://borgbackup.readthedocs.io/en/stable/faq.html#can-i-copy-or-synchronize-my-repo-to-another-location

not have enough space to extract a complete archive

That is only because borg needs the complete archive contents in order to create another full archive of the data. That's not my choice; plain rsync works in that scenario, by the way. Additionally, exporting everything consumes time and space, but I think we are going in circles here.

I like pragmatic solutions, and updating/merging/replacing parts of an existing archive should not be too much to ask of a backup solution. IMO, this discussion should actually take place in the original use-case issue.