borgbackup / borg

Deduplicating archiver with compression and authenticated encryption.
https://www.borgbackup.org/

Backups much slower (5 mins compared to 0.3 secs) than reported "Duration" -- any way to speed it up? #8220

Closed. QUser534 closed this issue 1 month ago.

QUser534 commented 1 month ago

Have you checked borgbackup docs, FAQ, and open GitHub issues?

Yes

Is this a BUG / ISSUE report or a QUESTION?

Maybe bug?

System information. For client/server mode post info for both machines.

Your borg version (borg -V).

borg 1.2.8

Operating system (distribution) and version.

fedora 39

Hardware / network configuration, and filesystems used.

ext4, 300 Mbps network

How much data is handled by borg?

90 GB

Describe the problem you're observing.

Backups are taking about 5 minutes, yet the reported "Duration" of the backup is 0.3 seconds, leaving roughly 5 minutes of unexplained delay. As the output below shows, the archive only deals with 106 files and about 1 KB of new (deduplicated) data, so it should be very fast.

This matters because about 20 separate archives are created per backup run, usually with very little incremental change. The unexplained 5-minute delay per archive adds about 1.5 hours of extra run time per day (20 archives × ~5 minutes ≈ 100 minutes). If the actual backup time could be brought closer to the reported 0.3 seconds, borg would not have to run for hours every day.

The program first outputs the following:

Creating archive at "borgbase.com:repo::ARCHIVE"
Remote: Storage quota: 90 GB out of 1.00 TB used.
Remote: Storage quota: 90 GB out of 1.00 TB used.

Then about 5 minutes passes and it will then output the following:

Repository: ssh://borgbase.com:repo::ARCHIVE/./repo
Archive name: ARCHIVE
Archive fingerprint: ....
Time (start): Tue, 2024-05-12 10:08:11
Time (end): Tue, 2024-05-12 10:08:11
Duration: 0.32 seconds
Number of files: 106
Utilization of max. archive size: 0%

                       Original size      Compressed size    Deduplicated size
This archive:               39.91 MB             15.76 MB                968 B
All archives:                4.15 TB              3.27 TB             65.81 GB

                       Unique chunks         Total chunks
Chunk index:                  258208              8512701

The backup is created with the following command options:

--compression zlib,6 --exclude-caches --exclude-nodump --exclude-if-present '.ignore' --keep-exclude-tags --files-cache ctime,size --one-file-system --stats
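
For reference, a full invocation with these options might look roughly like this (the data path is a placeholder; the repository/archive spec is copied from the output above):

borg create --compression zlib,6 --exclude-caches --exclude-nodump \
    --exclude-if-present '.ignore' --keep-exclude-tags \
    --files-cache ctime,size --one-file-system --stats \
    borgbase.com:repo::ARCHIVE /path/to/data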

infectormp commented 1 month ago

Can you reproduce this behavior using a local repository, i.e., not using the network?
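
For example, something along these lines could serve as a quick local test (paths are placeholders, and repokey encryption is just an assumption for the sketch):

borg init --encryption=repokey /tmp/borg-test-repo
borg create --compression zlib,6 --stats /tmp/borg-test-repo::test-1 /path/to/data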

ThomasWaldmann commented 1 month ago

Possible explanations (IIRC):

If you add the --list option, it might become clearer: the time spent listing/processing the files is the time that counts toward the backup duration.
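
For example, something like this (a sketch only; repository and path are placeholders, and --filter=AME just limits the listing to added/modified/error files):

borg create --list --filter=AME --compression zlib,6 --one-file-system \
    borgbase.com:repo::ARCHIVE /path/to/data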

ThomasWaldmann commented 1 month ago

Also, --stats might make it slower.

QUser534 commented 1 month ago

Thanks for the quick response. I have added the --list option and I have also reviewed the FAQ. Neither indicates what this 5-minute overhead could be, especially considering that the amount of data backed up in this run is under 1 kilobyte and the whole dataset is only 106 files and about 39 MB.

There is only one machine making the backup. Considering the entire dataset could be transferred several times over SSH within those 5 minutes, why does it take 5 minutes to transfer less than one kilobyte of incremental change?

Results of --list command:

196 files / directories were listed relatively quickly. Almost all had the status flags d/U; 3 had the flag "s", 1 had the flag "A", and 5 had the flag "x".

The delay seems to occur after the file list has been printed.

The following environment variables are set, which should rule out a files-cache problem; in addition, these tests are run directly after each other, so a cache TTL expiry is not the issue either:

BORG_FILES_CACHE_SUFFIX="segment-a"
BORG_FILES_CACHE_TTL="3"

I assume borg must be downloading something from the repository, and that this happens after the file list is generated, since the listing itself completes relatively quickly. What could borg be doing with the repository that takes 5 minutes when the dataset is only 106 files and about 39 MB?

I have verified that a chunk directory exists within the borg directory, so there should be a local cache. Are there any other ways to see where borg is spending its time?
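
One way to get a rough breakdown (just an idea, not something tried yet; the repository and path are placeholders, and --debug only adds verbose log output) would be to time the run with the log level raised:

time borg --debug create --compression zlib,6 --one-file-system \
    borgbase.com:repo::ARCHIVE /path/to/data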

EDIT: the total non-incremental dataset size is actually less than 39 MB, because that figure includes excluded data; it is probably more like 20 MB.

ThomasWaldmann commented 1 month ago

The stats are computed after the files have been processed. Does it get faster without --stats?

As the stats in borg 1.x include overall repo / all-archives stats, this can take a while if there are a lot of archives and/or the repo is big.
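
A sketch of how that could be avoided (repository/archive are placeholders): drop --stats from the create call and, when the numbers are actually needed, query the archive afterwards with borg info:

borg create --compression zlib,6 --one-file-system borgbase.com:repo::ARCHIVE /path/to/data
borg info borgbase.com:repo::ARCHIVE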

QUser534 commented 1 month ago

Thanks @ThomasWaldmann, --stats was the culprit. Backup time went from 300 seconds to 9 seconds!