M-Welsch / backup-server

Backup Server (BaSe)
Apache License 2.0

Backup takes extremely long #46

Open M-Welsch opened 5 months ago

M-Welsch commented 5 months ago

Describe the bug

Backup takes longer than expected

Expected behavior

A weekly backup should take a few hours at most (gut feeling).

Actual behavior

Some backups take a whole day.

What happens if we don't solve it (aka why is it important)

long on-times with

To Reproduce

See https://github.com/M-Welsch/backup-server/issues/46#issuecomment-2227293940. Further investigation necessary.

Additional context, Environment

Add any other context about the problem here.

Describe/define the problem

The problem seems to be that full backups are performed instead of incremental ones, so BaSe transfers much more data than necessary.

Develop Interim Containment Plan (if necessary)

As of now two possibilities:

Determine Root Causes and Escape Points

Pointer to the solution

Actions to prevent recurrence or solve systematic problems

M-Welsch commented 1 month ago

TL;DR

there seems to be a problem with cp -al ... in combination with rsync -avh ...

need to verify

Details

surprise, no further details :)
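One thing that can be checked up front is what cp -al actually produces. The sketch below is only an illustration, assuming GNU coreutils; the directory names current and snapshot are made up and are not BaSe's real layout. It shows that the "copy" made by cp -al is a hard link, so both names point to the same inode and therefore share the same metadata (mtime, permissions, owner).

```shell
#!/usr/bin/env bash
# Sketch: what does cp -al create? (GNU cp and stat assumed.)
# "current" and "snapshot" are hypothetical directory names.
set -eu
work=$(mktemp -d)
mkdir "$work/current"
echo "data" > "$work/current/file.txt"

# -a preserves attributes, -l hard-links files instead of copying their data
cp -al "$work/current" "$work/snapshot"

inode_current=$(stat -c %i "$work/current/file.txt")
inode_snapshot=$(stat -c %i "$work/snapshot/file.txt")
link_count=$(stat -c %h "$work/current/file.txt")

echo "inode in current:  $inode_current"
echo "inode in snapshot: $inode_snapshot"
echo "hard link count:   $link_count"

rm -rf "$work"
```

If both inode numbers are equal, any attribute change made through one of the two names is visible through the other as well.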

M-Welsch commented 1 month ago

Summary

With -a, rsync preserves timestamps, permissions, ownership and so on. Its default quick check compares size and modification time, so it seems to copy whole files again if the timestamp doesn't match.

Details

$ mkdir backup
$ cp -r ~/Base/Literatur/. backup  # plain copy, timestamps are not preserved
$ rsync -avh ~/Base/Literatur/. backup --stats  # transfers everything
$ rsync -avh ~/Base/Literatur/. backup --stats  # transfers nothing

why? Let's see ...

image

After running rsync -avh ... for the first time, the timestamps in backup have been updated to match the source. cp -a ..., unlike cp -r, preserves everything. New try:

$ cp -ra ~/Base/Literatur/. backup
$ rsync -avh ~/Base/Literatur/. backup --stats
Number of regular files transferred: 0
Total file size: 39,25M bytes
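
The difference between the two attempts can also be seen without rsync, just by looking at the modification times. A minimal sketch, assuming GNU coreutils (touch -d, stat -c); the scratch directory and file names are made up for illustration and have nothing to do with the real backup paths:

```shell
#!/usr/bin/env bash
# Sketch: compare the mtime of a file after cp -r and after cp -a.
# All paths below are hypothetical scratch paths.
set -eu
work=$(mktemp -d)
mkdir "$work/src" "$work/copy_r" "$work/copy_a"
echo "hello" > "$work/src/file.txt"
touch -d '2023-04-22 13:57:47' "$work/src/file.txt"   # give the source an old mtime

cp -r "$work/src/." "$work/copy_r"   # plain recursive copy
cp -a "$work/src/." "$work/copy_a"   # archive mode, keeps mtime, mode, owner

src_mtime=$(stat -c %Y "$work/src/file.txt")
r_mtime=$(stat -c %Y "$work/copy_r/file.txt")
a_mtime=$(stat -c %Y "$work/copy_a/file.txt")

echo "source: $src_mtime"
echo "cp -r:  $r_mtime"
echo "cp -a:  $a_mtime"

rm -rf "$work"
```

The cp -a copy should report the same mtime as the source, while the cp -r copy gets the time of the copy. That matches what was observed above: rsync sees nothing to do after cp -a, but transfers everything after cp -r.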

Able to reproduce. Let's see if that is the actual issue: run a backup until we reach relevant data and check whether there's a mismatch in permissions, owner, time or whatever ...
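
A quick way to look for such a mismatch on a single file pair could be the following. This is a sketch under assumptions: GNU stat is available, and the helper name meta as well as the src.jpg / dst.jpg scratch files are hypothetical stand-ins for a source file and its counterpart in the backup.

```shell
#!/usr/bin/env bash
# Sketch: compare the metadata fields that are suspects for a mismatch.
# "meta", src.jpg and dst.jpg are hypothetical names for illustration.
set -eu

# Print mode, owner, group, size and mtime (epoch seconds) of one file.
meta() { stat -c '%a %U %G %s %Y' "$1"; }

work=$(mktemp -d)
echo "data" > "$work/src.jpg"
cp -a "$work/src.jpg" "$work/dst.jpg"     # stands in for an up-to-date backup copy

if [ "$(meta "$work/src.jpg")" = "$(meta "$work/dst.jpg")" ]; then
    same_before=yes
else
    same_before=no
fi

# Simulate a backup copy whose timestamp has drifted from the source
touch -d '2020-01-01 00:00:00' "$work/dst.jpg"

if [ "$(meta "$work/src.jpg")" = "$(meta "$work/dst.jpg")" ]; then
    same_after=yes
else
    same_after=no
fi

echo "identical after cp -a:        $same_before"
echo "identical after mtime change: $same_after"
meta "$work/src.jpg"
meta "$work/dst.jpg"

rm -rf "$work"
```

On the real data, meta would be called with the source file and the matching file in the backup directory, and the two output lines compared field by field.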

hdd/nextcloud/martha/files/The Thing/2sort/2023-04-22-13-57-47-318.jpg

image

letting rsync run over this directory

image

The timestamps have an hour offset but are otherwise in sync (the offset is there because the server has a different time zone .. didn't fix that).
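
If the hour offset is only a display effect of the differing time zones, it should not matter for the comparison, because the mtime itself is stored as seconds since the epoch. A small sketch to illustrate that, assuming GNU touch and stat; the POSIX-style TZ strings UTC0 and CET-1 (UTC+1) are just example zones, not the actual settings of client and server:

```shell
#!/usr/bin/env bash
# Sketch: same file, same mtime, displayed under two different time zones.
# UTC0 and CET-1 are example TZ values, not the real machine settings.
set -eu
f=$(mktemp)
touch -d '@1682171867' "$f"    # fixed mtime given as epoch seconds

utc_view=$(TZ=UTC0 stat -c %y "$f")     # human-readable, rendered in UTC
cet_view=$(TZ=CET-1 stat -c %y "$f")    # human-readable, rendered in UTC+1
epoch_utc=$(TZ=UTC0 stat -c %Y "$f")    # raw epoch seconds
epoch_cet=$(TZ=CET-1 stat -c %Y "$f")

echo "UTC view: $utc_view"
echo "CET view: $cet_view"
echo "epoch:    $epoch_utc / $epoch_cet"

rm -f "$f"
```

The two human-readable views differ by one hour, while the epoch value is identical in both cases.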