borgbackup / borg

Deduplicating archiver with compression and authenticated encryption.
https://www.borgbackup.org/

`sudo borg`: extracted directory and file: lose original owner, group and file permissions #7504

Closed lamyergeier closed 1 year ago

lamyergeier commented 1 year ago

Have you checked borgbackup docs, FAQ, and open GitHub issues?

Yes

Is this a BUG / ISSUE report or a QUESTION?

BUG

System information. For client/server mode post info for both machines.

Your borg version (borg -V). borg 2.0.0b4

Operating system (distribution) and version. Ubuntu 22.10

Hardware / network configuration, and filesystems used. Ext4 filesystem on Laptop and on External harddrive

How much data is handled by borg? About 500 GB

Full borg commandline that led to the problem (leave away excludes and passwords)

sudo borg mount ...
sudo borg extract ....

Describe the problem you're observing.

I am doing a full system backup, so I run it as root. When I back up as root with sudo borg, the archive contains files and folders with the same owner, group and file permissions as the originals.

Mount

  1. When I mount it, I use sudo borg. The mounted archive shows the files and folders with the correct permissions and owners.
  2. But if I try to copy a particular folder recursively from the mounted archive to somewhere else, all copied files and folders lose their owner, group and permissions and show root as owner and group, with permissions changed!

Extract

Same problem (as with mount, above) when using sudo borg to extract: all extracted files and folders lose their owner, group and permissions and show root as owner and group, with permissions changed!

Can you reproduce the problem? If so, describe how. If not, describe troubleshooting steps you took before opening the issue.

Described above

Example

I extracted the folder home/lamy/Documents/Book/Dictionary from a particular archive into the current path using sudo borg. Note the owner, group and file permissions: the directories have root as owner and group, with file permissions changed!

$ ls -la
drwx------ 3 root root     4096 Apr  6 23:52  home

root@lamy:/home/lamy/Downloads/.Temp# cd home
root@lamy:/home/lamy/Downloads/.Temp/home# ls -la
total 12
drwx------ 3 root root 4096 Apr  6 23:52 .
drwxrwxr-x 3 lamy lamy 4096 Apr  6 23:52 ..
drwx------ 3 root root 4096 Apr  6 23:52 lamy

root@lamy:/home/lamy/Downloads/.Temp/home# cd lamy
root@lamy:/home/lamy/Downloads/.Temp/home/lamy# ls -la
total 12
drwx------ 3 root root 4096 Apr  6 23:52 .
drwx------ 3 root root 4096 Apr  6 23:52 ..
drwx------ 3 root root 4096 Apr  6 23:52 Documents

root@lamy:/home/lamy/Downloads/.Temp/home/lamy# cd Documents/Book/Dictionary/
root@lamy:/home/lamy/Downloads/.Temp/home/lamy/Documents/Book/Dictionary# ls -la
total 40408
drwxr-xr-x 2 lamy lamy     4096 Nov 18  2021 .
drwx------ 3 root root     4096 Apr  6 23:52 ..
-rw-r--r-- 1 lamy lamy 20136636 Nov  9  2021 Duden_Deutsches_Universalwörterbuch.azw
-rw-r--r-- 1 lamy lamy 21226685 Nov  9  2021 Oxford_Dictionary_of_English.azw
$ \ls -la /home | sed -n -e '/lamy/p'
drwxr-xr-x 47 lamy   lamy    4096 Apr  7 00:25 lamy

$ \ls -la /home/lamy | sed -n -e '/Document/p'
drwxr-xr-x   10 lamy lamy   4096 Mar 29 09:21 Documents

$ \ls -la /home/lamy/Documents/Book/Dictionary 
total 40408
drwxr-xr-x 2 lamy lamy     4096 Nov 18  2021 .
drwxr-xr-x 5 lamy lamy     4096 Mar 30 20:20 ..
-rw-r--r-- 1 lamy lamy 20136636 Nov  9  2021 Duden_Deutsches_Universalwörterbuch.azw
-rw-r--r-- 1 lamy lamy 21226685 Nov  9  2021 Oxford_Dictionary_of_English.azw

Question

How can I preserve the original owner, group, permissions and all other attributes (metadata) while extracting as root?

ThomasWaldmann commented 1 year ago

The "problem" is that you only extracted the Dictionary folder (and everything inside it).

Its parent dirs were ad-hoc created by borg (not extracted) and thus have default permissions.

If you want something to be like in the archive, you need to extract it.

lamyergeier commented 1 year ago

@ThomasWaldmann So even if we need only a particular directory from a particular archive, the only way to preserve permissions and other metadata such as owner and group is to extract the entire archive rather than a particular path from it?

If this is the case, that's very bad, as the entire archive is more than 500 GB in size! I suggest implementing the way Backintime solves this issue when used as root. I am not sure, but as far as I remember Backintime maintains a database of the metadata of each file and directory and uses it to restore.

Any suggestions?

ThomasWaldmann commented 1 year ago

If you only extract that one directory, I guess it means you have everything else still? So why do you need the parent dirs anyway?

lamyergeier commented 1 year ago

@ThomasWaldmann Let me illustrate with another example, as the previous example does not contain any subdirectory inside the path that we want to extract:

I am extracting the path home/lamy/Documents/Mobile/Git/Notes from a particular repository into the current directory, PWD=/home/lamy/Downloads/.Temp.

root@lamy:/home/lamy/Downloads/.Temp/home/lamy/Documents/Mobile/Git# ls -la
total 12
drwx------ 3 root root 4096 Apr  7 01:56 .
drwx------ 3 root root 4096 Apr  7 01:56 ..
drwxr-xr-x 8 lamy lamy 4096 Mar 27 13:41 Notes
root@lamy:/home/lamy/Downloads/.Temp/home/lamy/Documents/Mobile/Git# ls -la Notes/
total 328
drwxr-xr-x 8 lamy lamy   4096 Mar 27 13:41 .
drwx------ 3 root root   4096 Apr  7 01:56 ..
drwxr-xr-x 2 lamy lamy   4096 Nov  6 13:33 bin
-rw-r--r-- 1 lamy lamy    212 Jul 19  2022 CHANGELOG.md
-rw-r--r-- 1 lamy lamy     69 Jul 19  2022 commitlint.config.js
drwxr-xr-x 7 lamy lamy   4096 Mar 28 23:42 .git
-rw-r--r-- 1 lamy lamy   2805 Dec 13 10:57 .gitignore
-rw-r--r-- 1 lamy lamy    324 Jul 19  2022 .gitlab-ci.yml
-rw-r--r-- 1 lamy lamy     18 Jul 19  2022 .gitmessage
-rw-r--r-- 1 lamy lamy      0 Jul 19  2022 .gitmodules
drwxr-xr-x 3 lamy lamy   4096 Dec 18 13:07 .GitSettings
drwxr-xr-x 9 lamy lamy   4096 Oct 29 15:07 NotesSrc
-rw-r--r-- 1 lamy lamy    982 Jul 19  2022 package.json
-rw-r--r-- 1 lamy lamy 274403 Jul 19  2022 package-lock.json
-rw-r--r-- 1 lamy lamy     58 Jul 19  2022 .prettierignore
drwxr-xr-x 3 lamy lamy   4096 Jul 19  2022 Readme
lrwxrwxrwx 1 lamy lamy     16 Jul 19  2022 README.md -> Readme/Readme.md
drwxr-xr-x 2 lamy lamy   4096 Mar 21 08:46 .Ws

Till now it looks all good:

All subdirectories and files have lamy as owner and group.

Since the ad-hoc directories have root as owner, I have to use sudo to copy it to another location (to get rid of the ad-hoc folders).

root@lamy:/home/lamy/Downloads/.Temp/home/lamy/Documents/Mobile/Git# cp -r Notes /home/lamy/Downloads/.Temp/
root@lamy:/home/lamy/Downloads/.Temp/home/lamy/Documents/Mobile/Git# exit
exit
$ ls -la
total 16K
drwxrwxr-x 4 lamy 4.0K Apr  7 01:58 ./
drwxr-xr-x 8 lamy 4.0K Apr  6 23:13 ../
drwxr-xr-x 8 root 4.0K Apr  7 01:58 Notes/
drwx------ 3 root 4.0K Apr  7 01:56 home/
-rw-rw-r-- 1 lamy    0 Apr  4 15:32 .nobackup
$ cd Notes/
/home/lamy/Downloads/.Temp/Notes
$ ls -la
total 328K
drwxr-xr-x 8 root 4.0K Apr  7 01:58 ./
drwxrwxr-x 4 lamy 4.0K Apr  7 01:58 ../
drwxr-xr-x 3 root 4.0K Apr  7 01:58 .GitSettings/
drwxr-xr-x 2 root 4.0K Apr  7 01:58 .Ws/
drwxr-xr-x 7 root 4.0K Apr  7 01:58 .git/
drwxr-xr-x 9 root 4.0K Apr  7 01:58 NotesSrc/
drwxr-xr-x 3 root 4.0K Apr  7 01:58 Readme/
drwxr-xr-x 2 root 4.0K Apr  7 01:58 bin/
-rw-r--r-- 1 root 2.8K Apr  7 01:58 .gitignore
-rw-r--r-- 1 root  324 Apr  7 01:58 .gitlab-ci.yml
-rw-r--r-- 1 root   18 Apr  7 01:58 .gitmessage
-rw-r--r-- 1 root    0 Apr  7 01:58 .gitmodules
-rw-r--r-- 1 root   58 Apr  7 01:58 .prettierignore
-rw-r--r-- 1 root  212 Apr  7 01:58 CHANGELOG.md
lrwxrwxrwx 1 root   16 Apr  7 01:58 README.md -> Readme/Readme.md
-rw-r--r-- 1 root   69 Apr  7 01:58 commitlint.config.js
-rw-r--r-- 1 root 268K Apr  7 01:58 package-lock.json
-rw-r--r-- 1 root  982 Apr  7 01:58 package.json

Proposed solutions

How about not creating ad-hoc directories, or creating them with the same permissions and metadata as the originals?

ThomasWaldmann commented 1 year ago

Rather use mv (if possible) or rsync -aH or cp -a.
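
The difference between a plain cp -r and cp -a can be checked on a throwaway directory. The following is a minimal sketch, assuming a POSIX shell and a cp that supports -a (GNU coreutils and the BSDs do); the directory and variable names are made up for the demonstration. It only compares modification times, because preserving owner and group additionally requires running as root.

```sh
#!/bin/sh
# Demo in a temporary directory: compare what cp -r and cp -a keep.
tmp=$(mktemp -d)
mkdir "$tmp/src"
echo data > "$tmp/src/file"
touch -t 202001010000 "$tmp/src/file"   # give the source an old mtime

cp -r "$tmp/src" "$tmp/plain"     # recursive copy, metadata not preserved
cp -a "$tmp/src" "$tmp/archive"   # archive mode: keeps mode and times (owner/group too when run as root)

# find -newer prints the path only if its mtime is later than the reference file's
plain_newer=$(find "$tmp/plain/file" -newer "$tmp/src/file")
archive_newer=$(find "$tmp/archive/file" -newer "$tmp/src/file")
archive_older=$(find "$tmp/src/file" -newer "$tmp/archive/file")

[ -n "$plain_newer" ] && echo "cp -r: copy got a fresh mtime"
[ -z "$archive_newer" ] && [ -z "$archive_older" ] && echo "cp -a: copy kept the source mtime"

rm -rf "$tmp"
```

On the same filesystem mv is still the cheapest choice, since it renames instead of copying.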

lamyergeier commented 1 year ago

@ThomasWaldmann That works, and it restores files with their original metadata. But maybe we could provide a CLI option that doesn't generate the ad-hoc directories, because /home always has root as owner and group. This requires unnecessary (redundant) extra steps:

  1. Change to the root user using su, as sudo doesn't work with bash built-in commands like cd. This is necessary because the ad-hoc directories are root-owned with permission drwx------.
  2. Have double the amount of space: to move already extracted data, one needs more free space.

Suggestion: Maybe when PATHs are specified (refer to: borg extract — Borg - Deduplicating Archiver 2.0.0b4 documentation) we could skip creating ad-hoc folders, or offer another option to not create those ad-hoc folders.


Apart from this, I am not sure what the point is of creating ad-hoc directories with different owners, permissions and other metadata, given that it's not possible to merge them. Maybe ad-hoc directories would be useful with the same metadata as the originals.

ThomasWaldmann commented 1 year ago

The ad-hoc created directories are needed to be able to restore the item with its full (relative) path.

You can't restore parent/child without having parent/ first.

If you restore to the same fs, mv does not need any additional space and is also very quick.

That the directories are created ad-hoc and not extracted has to do with pattern matching. If you give parent/child on the command line, that pattern will not match any item when sequentially processing the archive (esp. not parent) until it reaches parent/child. Then, when trying to extract that (child), it notices it does not have the parent dir in the fs yet and ad-hoc creates it.
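
The sequential matching described above can be pictured with a toy model. This is a sketch under stated assumptions, not borg's code: the four-item "archive", the prefix match standing in for borg's pattern matching, and the names pattern, created and log are all invented for illustration, and only one missing parent level is handled.

```sh
#!/bin/sh
# Toy model: walk archive items in order, extract only those under $pattern,
# and "ad-hoc create" a missing parent directory when the first match arrives.
pattern="parent/child"
created=""   # paths that exist in the simulated target so far
log=""

for item in parent parent/other parent/child parent/child/file; do
    case "$item" in
        "$pattern"|"$pattern"/*) ;;                       # selected by the pattern
        *) log="${log:+$log }skip:$item"; continue ;;     # passed by, metadata not kept
    esac
    dir=$(dirname "$item")
    case " $created " in
        *" $dir "*) ;;                                    # parent already present
        *) log="$log adhoc-mkdir:$dir"; created="$created $dir" ;;
    esac
    log="$log extract:$item"
    created="$created $item"
done

echo "$log"
```

By the time parent/child matches, the item for parent has already been skipped, so the model has nothing but a plain directory creation to fall back on.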

lamyergeier commented 1 year ago

Maybe you could provide an option (like --skip-ad-hoc) to avoid creating the ad-hoc directories?

ThomasWaldmann commented 1 year ago

At the moment of extracting parent/child, sequential processing has already passed by parent, so it does not have the correct metadata at hand.

Guess the only solution possible would be to keep a cache of directory metadata in memory, so it can be looked up at ad-hoc creation time.

jdchristensen commented 1 year ago

As @ThomasWaldmann said, the only real issue here is that you used cp, which does not by default preserve permissions and ownership. You should use mv, which preserves those, is fast, and requires no extra space.

In addition, you can use the --strip-components <n> option to remove the specified number of leading path elements, if you don't want the whole path to be restored.

lamyergeier commented 1 year ago

@jdchristensen

Is it possible to avoid the following: changing to the root user using su (necessary because the ad-hoc directories are root-owned with permission drwx------), followed by the additional step of mv? Or is it possible to have ad-hoc directories with the original permissions, or permissions like drwxr-xr-x instead of drwx------?

Not sure what --strip-components does. If it doesn't create parent directories, is it possible to automate it by counting the number of parent directories and stripping all of them, maybe with another new option like --skip-ad-hoc?

jdchristensen commented 1 year ago

If the borg repository is owned by root and is a local repository, you'll have to access it as root. (If it is a remote repository, then the user reading the repo and the user writing the extracted files can be different.) I'll let @ThomasWaldmann decide whether it's worth having an option that automatically strips all parent directories. Since I usually extract to a temporary location, a mv is going to be needed anyways, so I see no big advantage.

lamyergeier commented 1 year ago

@jdchristensen

"Guess the only solution possible would be to keep a cache of directory metadata in memory, so it can be looked up at ad-hoc creation time."

The creation of ad-hoc directories makes sense only if the metadata are the same as the originals.

Also, it's not just about mv: ad-hoc directories are created with permissions drwx------, which requires su, while the extraction and mounting can be done using sudo without needing the root password.

An easy stopgap solution would be to create ad-hoc directories with permission drwxr-xr-x. This would make it easy to move files without having to log in as root!

ThomasWaldmann commented 1 year ago

iirc, borg does just a mkdir for the adhoc dir creation.

The rest (owner, group, mode) is likely controlled by active user and umask.
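
How umask shapes the mode of a freshly created directory can be seen with a plain mkdir. This is a minimal sketch in a temporary directory; the names strict and relaxed are made up, and whether borg does exactly this for ad-hoc directories is, as said above, from memory.

```sh
#!/bin/sh
# A plain mkdir takes its mode from the umask of the calling process.
tmp=$(mktemp -d)

( umask 077; mkdir "$tmp/strict" )    # subshell, so the umask change does not leak
( umask 022; mkdir "$tmp/relaxed" )

# keep only the type and permission characters of the ls -ld output
strict_mode=$(ls -ld "$tmp/strict" | cut -c1-10)
relaxed_mode=$(ls -ld "$tmp/relaxed" | cut -c1-10)

echo "umask 077 -> $strict_mode"
echo "umask 022 -> $relaxed_mode"

rm -rf "$tmp"
```

A umask of 077 in the shell that runs the extraction would therefore account for parent directories showing up as drwx------.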

jdchristensen commented 1 year ago

@lamyergeier You can use sudo -i to get a shell with root privileges, so no need to use su. Or you can do sudo mv /path/to/extracted/folder /desired/location/.

Jamie-Landeg-Jones commented 1 year ago

When you extract a subtree from an archive, it's usual practice that the parent directories are not touched if they exist, and if they don't exist, are created adhoc using default settings for the extracting user.

This is how "tar" behaves, and presumably most other solutions.
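
The tar behaviour can be reproduced on a scratch tree. This is a small sketch, assuming GNU tar or bsdtar and a POSIX shell; the parent/child layout and the chosen modes are invented for the demonstration.

```sh
#!/bin/sh
# Archive a tree whose parent is mode 700, then extract only the child.
tmp=$(mktemp -d)
mkdir -p "$tmp/src/parent/child" "$tmp/out"
echo data > "$tmp/src/parent/child/file"
chmod 700 "$tmp/src/parent"
chmod 750 "$tmp/src/parent/child"

tar -C "$tmp/src" -cf "$tmp/a.tar" parent

# Only parent/child is requested; parent/ itself is not extracted,
# so tar has to create it with defaults derived from the umask.
( umask 022; tar -C "$tmp/out" -xf "$tmp/a.tar" parent/child )

parent_mode=$(ls -ld "$tmp/out/parent" | cut -c1-10)
child_mode=$(ls -ld "$tmp/out/parent/child" | cut -c1-10)
echo "parent (created ad hoc): $parent_mode"
echo "child (extracted):       $child_mode"

rm -rf "$tmp"
```

The extracted child keeps the mode stored in the archive, while the parent reflects the extracting user's defaults rather than the archived 700.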

Why? It could potentially be dangerous otherwise - think about a large archive, and you want to extract one particular subtree. You are explicitly asking for this subdirectory. You don't expect anything outside your specified tree to be altered.

Imagine the following scenario:

An admin of a box discovers a previous admin had made a stupid mistake, and had made the root directory writeable by everyone! Oops! Anyway, he fixes it, checks the system, no damage is done.

A week later, he is asked by Fred to restore his "docs" directory from 2 weeks ago.

He duly goes and does this:

cd /
borg extract ::archive home/user/johnny/docs

It would be natural to assume that only files under home/user/johnny/docs would be affected, but if it worked the way you want it to, borg would have just restored the permissions of the root directory, making it writeable by everyone again. Oops again!

Having said that... There IS a solution to your problem, but you need to modify the syntax of your extract command.

I suppose it would be possible to add to borg a "--restore-metadata-for-all-parent-paths" option that would effectively just modify the command syntax internally, but that's not my call :-)

Anyway, you need to specify each parent directory individually with a "--pattern" rule. Here's an example.

Note, in the following example, I restore the "normal" method into "/tmp/method-1" and the alternative method into "/tmp/method-2".

Then note the directory owner/group/date/flags/perms - comparing method-1, method-2, and the original.

Is this the kind of result you want?

08:07 (39.0°C 400) (95) "locks" root@thompson# cd /tmp

08:07 (39.0°C 400) (96) "/tmp" root@thompson# mkdir method-1 method-2

08:07 (39.0°C 400) (97) "/tmp" root@thompson# cd method-1/

08:08 (39.0°C 400) (99) "method-1" root@thompson# /usr/bin/time borg extract ::thompson.20230410.2334.dump usr/users/jamie/files/locks/
        8.37 real         7.11 user         0.25 sys

08:08 (39.0°C 400) (100) "method-1" root@thompson# cd ../method-2

08:09 (39.0°C 400) (101) "method-2" root@thompson# /usr/bin/time borg extract --pattern='+re:^usr$' --pattern='+re:^usr/users$' --pattern='+re:^usr/users/jamie$' --pattern='+re:^usr/users/jamie/files$' --pattern='+pp:usr/users/jamie/files/locks' --pattern='-pp:usr/' ::thompson.20230410.2334.dump .
        9.18 real         7.79 user         0.37 sys

08:10 (39.0°C 400) (102) "method-2" root@thompson# cd ..

08:10 (39.0°C 400) (103) "/tmp" root@thompson# l /tmp/method-1/usr/users/jamie/files/locks/ /tmp/method-2/usr/users/jamie/files/locks/ /usr/users/jamie/files/locks/
/tmp/method-1/usr/users/jamie/files/locks/:
total 0
0 -rw-rw----  1 jamie  jamie  -   0  3 Oct  2017 _bookmarks.lock
0 -rw-rw----  1 jamie  jamie  -   0  2 Oct  2017 _deletes.lock
0 -rw-rw----  1 jamie  jamie  -   0  2 Oct  2017 _preferences.lock
0 -rw-rw----  1 jamie  jamie  -   0  2 Oct  2017 _timestamps.lock
0 drwxrwx---  2 jamie  jamie  - 256  3 Oct  2017 ./
0 drwx------  3 root   wheel  -  64 11 Apr 08:08 ../

/tmp/method-2/usr/users/jamie/files/locks/:
total 0
0 -rw-rw----  1 jamie  jamie  -   0  3 Oct  2017 _bookmarks.lock
0 -rw-rw----  1 jamie  jamie  -   0  2 Oct  2017 _deletes.lock
0 -rw-rw----  1 jamie  jamie  -   0  2 Oct  2017 _preferences.lock
0 -rw-rw----  1 jamie  jamie  -   0  2 Oct  2017 _timestamps.lock
0 drwxrwx---  2 jamie  jamie  - 256  3 Oct  2017 ./
0 drwxrwx---  3 jamie  jamie  -  64  4 Oct  2017 ../

/usr/users/jamie/files/locks/:
total 8
0 -rw-rw----  1 jamie  jamie  -   0  3 Oct  2017 _bookmarks.lock
0 -rw-rw----  1 jamie  jamie  -   0  2 Oct  2017 _deletes.lock
0 -rw-rw----  1 jamie  jamie  -   0  2 Oct  2017 _preferences.lock
0 -rw-rw----  1 jamie  jamie  -   0  2 Oct  2017 _timestamps.lock
4 drwxrwx---  2 jamie  jamie  - 512  3 Oct  2017 ./
4 drwxrwx---  6 jamie  jamie  - 512  4 Oct  2017 ../

08:10 (39.0°C 400) (104) "/tmp" root@thompson# pathinfo method-1/usr/users/jamie/files/locks/ method-2/usr/users/jamie/files/locks/ /usr/users/jamie/files/locks

0 drwxrwx---   2 jamie  jamie  -     256  3 Oct  2017 /tmp/method-1/usr/users/jamie/files/locks
0 drwx------   3 root   wheel  -      64 11 Apr 08:08 /tmp/method-1/usr/users/jamie/files
0 drwx------   3 root   wheel  -      64 11 Apr 08:08 /tmp/method-1/usr/users/jamie
0 drwx------   3 root   wheel  -      64 11 Apr 08:08 /tmp/method-1/usr/users
0 drwx------   3 root   wheel  -      64 11 Apr 08:08 /tmp/method-1/usr
0 drwxr-x---   3 root   wheel  -      64 11 Apr 08:08 /tmp/method-1
0 drwxrwxrwt   5 root   wheel  -     192 11 Apr 08:09 /tmp
4 drwxr-xr-x  36 root   wheel  schg 1024  5 Feb 14:24 /

0 drwxrwx---   2 jamie  jamie  -       256  3 Oct  2017 /tmp/method-2/usr/users/jamie/files/locks
0 drwxrwx---   3 jamie  jamie  -        64  4 Oct  2017 /tmp/method-2/usr/users/jamie/files
0 drwx------   3 jamie  jamie  -        64 10 Apr 23:24 /tmp/method-2/usr/users/jamie
0 drwxr-xr-x   3 root   wheel  -        64 19 Aug  2020 /tmp/method-2/usr/users
0 drwxr-xr-x   3 root   wheel  sunlnk   64  6 May  2022 /tmp/method-2/usr
0 drwxr-xr-x   3 root   wheel  schg     64  5 Feb 14:24 /tmp/method-2
0 drwxrwxrwt   5 root   wheel  -       192 11 Apr 08:09 /tmp
4 drwxr-xr-x  36 root   wheel  schg   1024  5 Feb 14:24 /

4 drwxrwx---   2 jamie  jamie  -       512  3 Oct  2017 /usr/users/jamie/files/locks
4 drwxrwx---   6 jamie  jamie  -       512  4 Oct  2017 /usr/users/jamie/files
4 drwx------  56 jamie  jamie  -      3072 11 Apr 07:12 /usr/users/jamie
4 drwxr-xr-x   4 root   wheel  -       512 19 Aug  2020 /usr/users
4 drwxr-xr-x  24 root   wheel  sunlnk  512  6 May  2022 /usr
4 drwxr-xr-x  36 root   wheel  schg   1024  5 Feb 14:24 /
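
The per-parent --pattern rules used for method-2 above follow a mechanical scheme, so they can be generated from the target path. The helper below is a hypothetical sketch (borg_parent_patterns is not part of borg) in POSIX sh; it assumes the path components contain no regex metacharacters or whitespace, since nothing is escaped.

```sh
#!/bin/sh
# Hypothetical helper: print one --pattern argument per line, mirroring the
# method-2 command: an exact regex for every parent, then the target subtree,
# then an exclusion for everything else under the first path component.
borg_parent_patterns() {
    target=${1#/}            # archive paths carry no leading slash
    target=${target%/}       # drop a trailing slash, if any
    first=${target%%/*}
    rest=$target
    prefix=
    while [ "$rest" != "${rest#*/}" ]; do     # loop while $rest still contains a slash
        part=${rest%%/*}
        rest=${rest#*/}
        prefix=${prefix:+$prefix/}$part
        printf '%s\n' "--pattern=+re:^${prefix}\$"
    done
    printf '%s\n' "--pattern=+pp:${target}"
    printf '%s\n' "--pattern=-pp:${first}/"
}

borg_parent_patterns usr/users/jamie/files/locks
```

The printed lines correspond one to one to the options typed by hand in the method-2 command; passing them on to borg extract is left to the caller.
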

ThomasWaldmann commented 1 year ago

I am closing this: