garethgeorge / backrest

Backrest is a web UI and orchestrator for restic backup.
GNU General Public License v3.0

Prune, check, and forget operations fail #321

Open gbrsni opened 4 weeks ago

gbrsni commented 4 weeks ago

Describe the bug All prune, check, and forget operations fail with exit status 1. Everything was working until I updated to 1.0 (currently on 1.1.0). This only happens on a pre-existing repository; a freshly created one works flawlessly.

Below is the output of the first failed prune operation

loading all snapshots...
finding data that is still in use for 150 snapshots
[0:44] 100.00%  150 / 150 snapshots

searching used packs...
collecting packs for deletion and repacking
[0:01] 100.00%  12506 / 12506 packs processed

to repack:        244383 blobs / 182.830 MiB
this removes:       6512 blobs / 53.630 MiB
to delete:         24880 blobs / 1.819 GiB
total prune:       31392 blobs / 1.871 GiB
remaining:        733846 blobs / 178.120 GiB
unused size after prune: 7.143 GiB (4.01% of remaining size)

deleting unreferenced packs
[0:00] 100.00%  89 / 89 files deleted

repacking packs
Load(<data/4935444ef6>, 201818, 61426) returned error, retrying after 509.98213ms: The specified key does not exist.
Load(<data/4935444ef6>, 201818, 61426) returned error, retrying after 698.654091ms: The specified key does not exist.
Load(<data/4935444ef6>, 201818, 61426) returned error, retrying after 1.16352265s: The specified key does not exist.
Load(<data/4935444ef6>, 201818, 61426) returned error, retrying after 1.174794481s: The specified key does not exist.
Load(<data/4935444ef6>, 201818, 61426) returned error, retrying after 2.288951715s: The specified key does not exist.
Load(<data/4935444ef6>, 201818, 61426) returned error, retrying after 3.338556781s: The specified key does not exist.
Load(<data/4935444ef6>, 201818, 61426) returned error, retrying after 3.828014434s: The specified key does not exist.
Load(<data/4935444ef6>, 201818, 61426) returned error, retrying after 5.085054182s: The specified key does not exist.
Load(<data/4935444ef6>, 201818, 61426) returned error, retrying after 17.557967622s: The specified key does not exist.
Load(<data/4935444ef6>, 201818, 61426) returned error, retrying after 23.483654633s: The specified key does not exist.

Successive manual prunes fail with exit status 1 and no output.

Manually triggering a check also fails with exit status 1, with the following output:

using temporary cache in /tmp/restic-check-cache-1031205035

Scheduled prunes after snapshots fail with exit status 1.

Snapshot operations seem to keep working as usual.

Screenshots

Two Backrest screenshots attached (2024-06-05).

Platform Info

Docker compose 2.20.2 on Ubuntu 22.04
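
For what it's worth, the repeated `Load(<data/4935444ef6>, ...)` failures above suggest the backend is missing one specific pack file. A way to narrow that down (a sketch only; the repository path and credentials are placeholders, and `restic find --pack` requires restic 0.12+):

```shell
# Hypothetical repo location; substitute your actual repository and password.
export RESTIC_REPOSITORY=/path/to/repo
export RESTIC_PASSWORD_FILE=/path/to/password-file

# Does the backend actually still have a pack starting with this prefix?
restic list packs | grep '^4935444ef6'

# Which snapshots/files reference the pack that fails to load?
restic find --pack 4935444ef6
```

If the first command returns nothing, the pack really is gone from the backend rather than merely unreadable.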

ArthurusDent commented 3 weeks ago

I think I might have had a similar problem. The repo in question is on a NAS. Restic-server and backrest are both running on the NAS.

After I updated to backrest 1.0, it seems backrest for the first time changed the owner of some of the files in the repo. Turns out, backrest runs as root (I know, I know...) but the restic-server runs as an unprivileged user. After pruning by backrest, some files were not owned by the unprivileged user any more, but by root. That's when my backups started to fail because some files couldn't be read any more.

Changing the owner of the files in the repo back to the unprivileged user fixed the problem for the time being. But I will probably have to re-install backrest as the same unprivileged user to fix the problem.
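
The interim fix described above can be sketched as follows (the user, group, and repo path are hypothetical stand-ins for this NAS setup):

```shell
# Restore ownership of every file in the repo to the unprivileged
# user that rest-server runs as (names/paths are placeholders).
sudo chown -R restic-user:restic-group /volume1/backups/repo
```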

garethgeorge commented 3 weeks ago

1.0.0 changed some defaults around how prune operations are scheduled. In many repos, a prune and a check would be scheduled as soon as you updated to 1.0.0 or 1.1.0.

After pruning by backrest, some files were not owned by the unprivileged user any more, but by root. That's when my backups started to fail because some files couldn't be read any more.

This sounds really unexpected. Ultimately backrest never touches repo files directly -- it always coordinates through restic, e.g. by running restic commands. I wonder how restic could be changing the ownership of the files if the repo is configured to go through your rest repo (backed by restic-server?).

gbrsni commented 3 weeks ago

Ok, so I ran an unlock command from the backrest web UI and then a check, which failed, so I just left it and went to sleep, and now everything appears to work as intended. Checks still fail though (below is the output of a check):

using temporary cache in /tmp/restic-check-cache-1374059121
create exclusive lock for repository
load indexes
Load(<index/147e779923>, 0, 0) returned error, retrying after 260.555222ms: The specified key does not exist.
Load(<index/147e779923>, 0, 0) returned error, retrying after 1.041673826s: The specified key does not exist.
Load(<index/197cc53225>, 0, 0) returned error, retrying after 726.824867ms: The specified key does not exist.
Load(<index/1aa0a7c22f>, 0, 0) returned error, retrying after 508.156099ms: The specified key does not exist.
Load(<index/147e779923>, 0, 0) returned error, retrying after 1.316104029s: The specified key does not exist.
Load(<index/1aa0a7c22f>, 0, 0) returned error, retrying after 503.385832ms: The specified key does not exist.
Load(<index/197cc53225>, 0, 0) returned error, retrying after 618.544189ms: The specified key does not exist.
Load(<index/1aa0a7c22f>, 0, 0) returned error, retrying after 1.61691882s: The specified key does not exist.
Load(<index/197cc53225>, 0, 0) returned error, retrying after 1.43519358s: The specified key does not exist.
Load(<index/147e779923>, 0, 0) returned error, retrying after 2.377312594s: The specified key does not exist.
Load(<index/1aa0a7c22f>, 0, 0) returned error, retrying after 2.059857878s: The specified key does not exist.
Load(<index/197cc53225>, 0, 0) returned error, retrying after 2.113797971s: The specified key does not exist.
Load(<index/147e779923>, 0, 0) returned error, retrying after 3.470460315s: The specified key does not exist.
Load(<index/1aa0a7c22f>, 0, 0) returned error, retrying after 3.24543282s: The specified key does not exist.
Load(<index/197cc53225>, 0, 0) returned error, retrying after 3.632868282s: The specified key does not exist.
Load(<index/147e779923>, 0, 0) returned error, retrying after 2.718768718s: The specified key does not exist.
Load(<index/1aa0a7c22f>, 0, 0) returned error, retrying after 5.662324212s: The specified key does not exist.
Load(<index/197cc53225>, 0, 0) returned error, retrying after 3.94220516s: The specified key does not exist.
Load(<index/147e779923>, 0, 0) returned error, retrying after 6.698461587s: The specified key does not exist.
Load(<index/197cc53225>, 0, 0) returned error, retrying after 8.487693142s: The specified key does not exist.
Load(<index/5280162db4>, 0, 0) returned error, retrying after 468.027518ms: The specified key does not exist.
Load(<index/5280162db4>, 0, 0) returned error, retrying after 854.355229ms: The specified key does not exist.
Load(<index/1aa0a7c22f>, 0, 0) returned error, retrying after 3.114382809s: The specified key does not exist.
Load(<index/5280162db4>, 0, 0) returned error, retrying after 1.292072873s: The specified key does not exist.
Load(<index/5280162db4>, 0, 0) returned error, retrying after 994.2002ms: The specified key does not exist.
Load(<index/5280162db4>, 0, 0) returned error, retrying after 2.120598121s: The specified key does not exist.
Load(<index/1aa0a7c22f>, 0, 0) returned error, retrying after 11.410647778s: The specified key does not exist.
Load(<index/147e779923>, 0, 0) returned error, retrying after 10.314996758s: The specified key does not exist.
Load(<index/5280162db4>, 0, 0) returned error, retrying after 2.173828902s: The specified key does not exist.
Load(<index/5280162db4>, 0, 0) returned error, retrying after 3.344023898s: The specified key does not exist.
Load(<index/197cc53225>, 0, 0) returned error, retrying after 10.824713113s: The specified key does not exist.
Load(<index/7bc6992497>, 0, 0) returned error, retrying after 511.11738ms: The specified key does not exist.
Load(<index/7bc6992497>, 0, 0) returned error, retrying after 985.601849ms: The specified key does not exist.
Load(<index/7bc6992497>, 0, 0) returned error, retrying after 1.131736885s: The specified key does not exist.
Load(<index/5280162db4>, 0, 0) returned error, retrying after 8.570271529s: The specified key does not exist.
Load(<index/7bc6992497>, 0, 0) returned error, retrying after 2.528084775s: The specified key does not exist.
Load(<index/147e779923>, 0, 0) returned error, retrying after 17.019453271s: The specified key does not exist.
Load(<index/7bc6992497>, 0, 0) returned error, retrying after 2.206457311s: The specified key does not exist.
Load(<index/1aa0a7c22f>, 0, 0) returned error, retrying after 9.814884718s: The specified key does not exist.
Load(<index/7bc6992497>, 0, 0) returned error, retrying after 3.12064409s: The specified key does not exist.
Load(<index/197cc53225>, 0, 0) returned error, retrying after 15.169052429s: The specified key does not exist.
Load(<index/9c261d12b6>, 0, 0) returned error, retrying after 282.771296ms: The specified key does not exist.
Load(<index/9c261d12b6>, 0, 0) returned error, retrying after 621.995866ms: The specified key does not exist.
Load(<index/5280162db4>, 0, 0) returned error, retrying after 14.149184297s: The specified key does not exist.
Load(<index/9c261d12b6>, 0, 0) returned error, retrying after 1.023419632s: The specified key does not exist.
Load(<index/7bc6992497>, 0, 0) returned error, retrying after 7.749732418s: The specified key does not exist.
Load(<index/9c261d12b6>, 0, 0) returned error, retrying after 1.152495068s: The specified key does not exist.
Load(<index/9c261d12b6>, 0, 0) returned error, retrying after 2.015925872s: The specified key does not exist.
Load(<index/9c261d12b6>, 0, 0) returned error, retrying after 2.58081329s: The specified key does not exist.
Load(<index/1aa0a7c22f>, 0, 0) returned error, retrying after 19.696848881s: The specified key does not exist.
Load(<index/9c261d12b6>, 0, 0) returned error, retrying after 7.085624214s: The specified key does not exist.
Load(<index/7bc6992497>, 0, 0) returned error, retrying after 10.369989639s: The specified key does not exist.
Load(<index/147e779923>, 0, 0) returned error, retrying after 21.989391919s: The specified key does not exist.
Load(<index/9c261d12b6>, 0, 0) returned error, retrying after 6.792021306s: The specified key does not exist.
Load(<index/5280162db4>, 0, 0) returned error, retrying after 24.720925295s: The specified key does not exist.
Load(<index/197cc53225>, 0, 0) returned error, retrying after 12.82633338s: The specified key does not exist.
Load(<index/7bc6992497>, 0, 0) returned error, retrying after 10.661027596s: The specified key does not exist.
Load(<index/9c261d12b6>, 0, 0) returned error, retrying after 7.278225329s: The specified key does not exist.
Load(<index/9c261d12b6>, 0, 0) returned error, retrying after 10.088924567s: The specified key does not exist.
Load(<index/7bc6992497>, 0, 0) returned error, retrying after 22.758534265s: The specified key does not exist.
error: error loading index 1aa0a7c22fea80c823bd62ff1d0ed6afbc6ee60c390b9e1df4101cf412b6db8b: The specified key does not exist.
error: error loading index 197cc53225b681bae8a5eb013666bbd0207cc18eb18d9d81784858ea6699007c: The specified key does not exist.
error: error loading index 147e77992369c4c6a390325d0e0409a1ef40fa65333d16f5ec353d1a1a7aa0bf: The specified key does not exist.
error: error loading index 9c261d12b6968a021d5023ff5431e04f3ee60a5e25f38bf6ee025f52b6d687e3: The specified key does not exist.
error: error loading index 5280162db416aa0ee65f73267d896afd30ebc96550e31991e43a379c62733242: The specified key does not exist.
error: error loading index 7bc69924973f8dec5d75a2f3e8f1b61af64d3e72cf79bce70dc90f062ea281be: The specified key does not exist.
Fatal: LoadIndex returned errors
garethgeorge commented 3 weeks ago

You might try running a repair operation: https://restic.readthedocs.io/en/latest/077_troubleshooting.html#repair-the-index .
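
From the linked troubleshooting guide, the index repair step looks roughly like this (repository details are placeholders; `repair index` requires restic 0.14+, older releases use `rebuild-index`):

```shell
# Hypothetical repo location; substitute your actual repository and password.
export RESTIC_REPOSITORY=/path/to/repo
export RESTIC_PASSWORD_FILE=/path/to/password-file

# Rebuild the index from the pack files actually present in the backend.
restic repair index

# On restic < 0.14 the equivalent command is:
# restic rebuild-index
```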

gbrsni commented 3 weeks ago

I did that, and checks now pass, but prune still fails:

loading indexes...
loading all snapshots...
finding data that is still in use for 149 snapshots
[0:11] 100.00%  149 / 149 snapshots

searching used packs...
collecting packs for deletion and repacking
[0:02] 100.00%  13822 / 13822 packs processed

to repack:        395000 blobs / 264.223 MiB
this removes:     229478 blobs / 150.156 MiB
to delete:        185410 blobs / 9.791 GiB
total prune:      414888 blobs / 9.938 GiB
remaining:        648709 blobs / 176.987 GiB
unused size after prune: 9.490 GiB (5.36% of remaining size)

repacking packs
Load(<data/4935444ef6>, 201818, 61426) returned error, retrying after 615.791183ms: The specified key does not exist.
Load(<data/4935444ef6>, 201818, 61426) returned error, retrying after 867.051225ms: The specified key does not exist.
Load(<data/4935444ef6>, 201818, 61426) returned error, retrying after 977.05203ms: The specified key does not exist.
Load(<data/4935444ef6>, 201818, 61426) returned error, retrying after 1.497812046s: The specified key does not exist.
Load(<data/4935444ef6>, 201818, 61426) returned error, retrying after 3.185125692s: The specified key does not exist.
Load(<data/4935444ef6>, 201818, 61426) returned error, retrying after 5.412823522s: The specified key does not exist.
Load(<data/4935444ef6>, 201818, 61426) returned error, retrying after 4.91874376s: The specified key does not exist.
Load(<data/4935444ef6>, 201818, 61426) returned error, retrying after 11.470026971s: The specified key does not exist.
Load(<data/4935444ef6>, 201818, 61426) returned error, retrying after 13.879214381s: The specified key does not exist.
Load(<data/4935444ef6>, 201818, 61426) returned error, retrying after 10.003962552s: The specified key does not exist.
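
Not suggested in the thread, but since prune keeps tripping over the same missing data pack even after the index repair, restic's troubleshooting docs also describe dropping references to data that no longer exists in the backend. Roughly (a sketch; repo details are placeholders, and `repair snapshots` requires restic 0.15+):

```shell
# WARNING: this removes snapshot references to lost data and is
# irreversible for the affected files. Take a backup of the repo
# config/index first if possible.
export RESTIC_REPOSITORY=/path/to/repo
export RESTIC_PASSWORD_FILE=/path/to/password-file

# Rewrite snapshots so they no longer reference blobs stored in
# packs that are missing from the backend.
restic repair snapshots --forget

# A subsequent prune should then no longer try to repack the
# missing pack.
restic prune
```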
ArthurusDent commented 3 weeks ago

I wonder how restic would sometimes be changing the ownership of the files if the repo should be configured to go through your rest repo (backed by restic-serve?)?

I've thought about this more and realized that the repo in backrest isn't configured to be accessed through the restic-server. It is configured just like a regular local repo, by pointing backrest at the directory, since the repo is on the same machine. But that alone doesn't explain the changed ownership, because there is no plan on that machine that uses the repo in question. So what might have happened is that I had the "great" idea to run prune manually on that repo from that machine. That would most likely have executed the restic binary as root, which would then explain the change in ownership, right?
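
If that is what happened, running any manual restic commands as the user that owns the repo avoids the ownership drift in the first place (user name and path below are hypothetical):

```shell
# Run restic as the unprivileged repo owner instead of root, so any
# files restic writes keep the correct ownership.
sudo -u restic-user restic -r /volume1/backups/repo prune
```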