borgbackup / borg

Deduplicating archiver with compression and authenticated encryption.
https://www.borgbackup.org/

practical testing of what will become borg 1.2 #4360

Closed ThomasWaldmann closed 2 years ago

ThomasWaldmann commented 5 years ago

do practical testing with master branch code (or some alpha release).

report anything you find on the issue tracker (as a separate issue), check first if the issue already has been reported. you can post "works for me" in this ticket here.

see end of this ticket for current betas / rc (or directly check the github releases page).

do not run master branch code or borg pre-releases against production repos, always use fresh repos made by borg init!

fantasya-pbem commented 5 years ago

I have tested master branch (borg 1.2.0a3.dev3+g81f9a8cc) with empty repo on davfs2.

Everything worked perfectly. 83 segments total.

                       Original size      Compressed size    Deduplicated size
All archives:               41.16 GB             32.78 GB             31.31 GB
                       Unique chunks         Total chunks
Chunk index:                  194392               251139

davfs cache_size: 5000 MiB / davfs table_size: 1024

works for me

ThomasWaldmann commented 5 years ago

alpha 3 is out. \o/

fantasya-pbem commented 5 years ago

Tested latest master branch (borg 1.2.0a3.dev8+gde151cd3) with my 31.31 GB repo from yesterday on davfs2.

No errors occurred. 87 segments total.

                       Original size      Compressed size    Deduplicated size
All archives:               81.83 GB             65.08 GB             31.85 GB
                       Unique chunks         Total chunks
Chunk index:                  195405               501636

Interesting: davfs didn't respect cache_size of 5 GB, all segments were cached until cache was 30 GB. When borg exited (rc 0), cache was cleaned to < 5 GB, which is expected behaviour when open files are closed.

works for me

ThomasWaldmann commented 5 years ago

@fantasya-pbem the age threshold of the new lrucache is 4min, so it would be interesting to test >>4min. Also, the code avoids using old file handles, but otoh does not actively search for them and close them.
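A minimal sketch of the behaviour described above, assuming a small handle cache keyed by segment number: entries older than an age threshold are never handed out again, but nothing sweeps the cache for stale handles in the background. The class name `AgedLRUCache`, its parameters and the injectable `clock` are hypothetical illustrations, not borg's actual lrucache API.

```python
import time
from collections import OrderedDict


class AgedLRUCache:
    """Hypothetical LRU cache that refuses to return entries older than max_age."""

    def __init__(self, capacity, max_age, dispose, clock=time.monotonic):
        self.capacity = capacity      # maximum number of cached entries
        self.max_age = max_age        # seconds; the thread mentions 4 minutes
        self.dispose = dispose        # called with a value when its entry is dropped
        self.clock = clock            # injectable so the behaviour can be tested
        self._items = OrderedDict()   # key -> (timestamp, value), oldest use first

    def put(self, key, value):
        if key in self._items:
            _, old = self._items.pop(key)
            self.dispose(old)
        self._items[key] = (self.clock(), value)
        # Evict least recently used entries once the capacity is exceeded.
        while len(self._items) > self.capacity:
            _, (_, evicted) = self._items.popitem(last=False)
            self.dispose(evicted)

    def get(self, key):
        entry = self._items.get(key)
        if entry is None:
            return None
        stamp, value = entry
        now = self.clock()
        if now - stamp > self.max_age:
            # Stale entries are dropped only when somebody asks for them;
            # there is no background search for old handles.
            del self._items[key]
            self.dispose(value)
            return None
        # Fresh enough: refresh the timestamp and mark as most recently used.
        self._items[key] = (now, value)
        self._items.move_to_end(key)
        return value

    def __len__(self):
        return len(self._items)
```

With such a design, a handle that is never requested again stays open until it is evicted by capacity, which would be consistent with a file system cache that cannot shrink while files remain open.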

fantasya-pbem commented 5 years ago

I did several more tests in the last two days, with a production repository I used for half a year until some days ago:

First some tests with 5 GB limited DavFS cache (which is on its own LVM volume that had ~45 GB free space).

As expected, borg 1.2.0a3.dev8+gde151cd3 crashed when DavFS cache was full (45 GB). After that I tested a patched borg 1.1.10.dev5+g7a7e04a4.d20190227 from the 1.1-maint branch with deactivated LRU fd cache. borg check was fine and DavFS cache was constantly at 5 GB until segment 131 when it crashed with KeyError / Input/Output error.

Therefore I increased DavFS cache to 25 GB limit.

borg 1.1.10.dev5+g7a7e04a4.d20190227 check then finished without errors.

Now I tested borg 1.2.0a3.dev8+gde151cd3 again. DavFS deleted the "metadata files" from its cache but no segment files when the 25 GB threshold was exceeded. BUT – when cache reached 40 GB (that was when segment file 100 was downloaded) DavFS started to delete segments (seg 1 first, then 2, ...). This was 31 minutes after I started the check. Cache stayed at 40–41 GB until segment 182 (first seg of 2nd backup in repo). DavFS now deleted a bunch of files fast and could manage to hold 25 GB until end of borg check.

This is very interesting behaviour, but I cannot see a plausible explanation why DavFS cache growth stopped at 41 GB after 31 minutes. I can only draw the conclusion that one needs a reasonably large DavFS cache for borg checks, but it does not need to be as big as the borg repo.

ThomasWaldmann commented 5 years ago

@fantasya-pbem i opened #4427 for further work on the lrucache. if you like to help with testing, add a comment there.

ThomasWaldmann commented 5 years ago

Just released 1.2.0 alpha4 to pypi.

fantasya-pbem commented 5 years ago

Just for the record: Tested borg-1.2.0a5 today. Works for me.

ThomasWaldmann commented 5 years ago

@fantasya-pbem did you also test some of the new features (see changelog)?

fantasya-pbem commented 5 years ago

I did a "borg compact --cleanup-commits" which did what it should. I ran a "borg check" afterwards which succeeded too. (Test repo has 5 archives. Will do more tests later with a production repo copy.)

I did search for information about the borg check "--max-duration". It seems that this option is not explained in detail anywhere. I'm wondering what would be good and bad max duration values. There should be a section in the docs where this option is discussed.

ThomasWaldmann commented 5 years ago

--max-duration should be listed with a brief description in the borg check reference docs.

Good values depend a bit on your repo size and also on speeds and overheads. But I guess usually one would take enough time to make a decent amount of progress related to the overall time needed to check the whole repo.

E.g. if whole repo would take 100h to check, you could:

fantasya-pbem commented 5 years ago

I tested "borg compact", "borg prune" and "borg check" with my big repo on DavFS. Compact and prune worked well (prune's applied-rule output is nice!). Check crashed after 90 segments, but I think this was because of the small DavFS cache (5 G), which may be too small for the network throughput of my server and Borg's FD timeout. With a 10 G cache, borg check succeeded. I occasionally had such problems before with 5 G. So again: 1.2.0a5 works for me.

Regarding --max-duration: It is mentioned in "borg check" with one sentence: "do only a partial repo check for max. SECONDS seconds (Default: unlimited)". There is no further explanation of why and when this option could be used. I assume that instead of one full long-running check, Borg can run multiple partial checks, each lasting --max-duration, until the whole repo is checked. But this is not stated this clearly in the docs. It should be mentioned in borg check.
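To make the trade-off concrete, here is a rough sketch of the arithmetic, assuming (as guessed above) that each partial check continues where the previous one stopped and that check speed is constant. The 100 h figure comes from earlier in this thread; the per-run durations and the helper name `partial_check_runs` are made up for illustration.

```python
def partial_check_runs(full_check_seconds, max_duration_seconds):
    """Number of partial runs needed to cover the whole repo once.

    Assumes every run makes progress for its full duration and that
    consecutive runs do not re-check the same data.
    """
    if full_check_seconds < 0 or max_duration_seconds <= 0:
        raise ValueError("durations must be positive")
    # Integer ceiling division avoids floating point rounding surprises.
    return (full_check_seconds + max_duration_seconds - 1) // max_duration_seconds


HOUR = 3600
full_check = 100 * HOUR  # a repo that would take 100 h to check in one go

for per_run_hours in (1, 4, 10):
    runs = partial_check_runs(full_check, per_run_hours * HOUR)
    print(f"--max-duration {per_run_hours * HOUR}: {runs} runs for one full pass")
```

A short duration keeps each run cheap but stretches a full pass over many runs; with one run per night, for example, a 1 h limit would need 100 nights under these assumptions.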

ThomasWaldmann commented 5 years ago

@fantasya-pbem thanks for testing, created #4473 for the docs issue.

ThomasWaldmann commented 5 years ago

1.2.0a6 was just released - please test it: https://github.com/borgbackup/borg/releases/tag/1.2.0a6

This is the first release to include pyinstaller-made fat binaries for Linux, FreeBSD and macOS, please test them (see the README about the binaries).

christophlehmann commented 5 years ago

I'm very happy to see the new compact command.

Server-Version 1.1.9 Client-Version: borg-linux64-1.2.0a6

I ran borg compact $BORG_REPO on client. it ended ~1 sec later and exit code was 0. The compact command is not available in 1.1.9 so i expect an error message and rc>0

christophlehmann commented 5 years ago

I ran borg-linux64-1.2.0a6 compact on the server. The server had no repokey. It shrank the repository from 287 GB to 25 GB, which seems reasonable. Then I ran a backup with borg-linux64-1.1.10, which went fine. An extract with borg-linux64-1.1.10 of the oldest backup also went fine.

ThomasWaldmann commented 5 years ago

@christophlehmann borg compact is executed on the server (even if you start it on the client).

Servers < 1.2 do not know compact, so yes, guess that should error.

Thanks for testing stuff!

ThomasWaldmann commented 5 years ago

> I ran borg compact $BORG_REPO on client. it ended ~1 sec later and exit code was 0. The compact command is not available in 1.1.9 so i expect an error message and rc>0

I checked this and (slightly counter-intuitive though), the "OK" rc (rc == 0) is correct.

This is because borg < 1.2 always did the compaction when committing the repo (so basically for most repo-writing / committing commands), so it did what you told it to do although < 1.2 does not support the separate compact command (which is rather about not doing the compaction in all other commands, but only doing it in the compact command).

How long it takes depends on the amount of work to do, so it can be quick if there isn't much to compact or slow if there is a lot.

One exception is of course if the repo is in append-only mode, then borg won't do compaction.
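The rules in this comment can be condensed into a small decision function. This is only a model of the explanation above, not borg code: the function name, its parameters and the crude version parsing are hypothetical.

```python
import re


def compaction_runs(server_version, is_compact_command, commits_repo=True, append_only=False):
    """Model of when the server side compacts the repo, per the explanation above.

    server_version: e.g. "1.1.9" or "1.2.0a6" (only major.minor is inspected)
    is_compact_command: True if the client asked for "borg compact"
    commits_repo: True if the command ends with a repo commit
    append_only: True if the repo is in append-only mode
    """
    if append_only:
        # Append-only repos are not compacted.
        return False
    match = re.match(r"(\d+)\.(\d+)", server_version)
    if match is None:
        raise ValueError(f"cannot parse version: {server_version!r}")
    major_minor = (int(match.group(1)), int(match.group(2)))
    if major_minor < (1, 2):
        # Older servers compact as part of every commit, so a compact
        # request is effectively fulfilled by the commit itself.
        return commits_repo
    # From 1.2 on, compaction is done only by the separate compact command.
    return is_compact_command


print(compaction_runs("1.1.9", is_compact_command=True))
print(compaction_runs("1.2.0a6", is_compact_command=False))
```

Under this model, a 1.2 client running compact against a 1.1.9 server gets its compaction via the commit, which matches the rc == 0 result reported above.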

ntova commented 5 years ago

Just finished testing my most common commands with version 1.2.0a6: check, compact, create, delete, diff, extract, list, prune, recreate against a copy of my main 110G repository which is by now about 1 1/2 years old and is used daily. Everything works as expected, no errors, and no huge changes performance-wise except:

The one thing I noticed, is that computing the stats now seems to take a bit longer, at least when using recreate.

As an example: 800MB repository with 158 archives on a Raspberry Pi 4, recreate over all archives to exclude a folder:

recreate times    borg 1.1.10    borg 1.2.0a6
with stats        6:45 min       12:00 min + 4 sec compact
w/o stats         6:50 min       3:47 min + 4 sec compact

I know that computing the stats during recreate is unnecessary, I sometimes just like to see the numbers shrink ^^ The missing compact after each archive really shows, I like the separate compact behavior :)
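For a quick comparison, the timings above can be turned into runtime ratios. This is plain arithmetic on the numbers reported in this comment; the helper names are made up, and the separate 4 sec compact step is ignored.

```python
def to_seconds(min_sec):
    """Convert a 'min:sec' string such as '6:45' into seconds."""
    minutes, seconds = min_sec.split(":")
    return int(minutes) * 60 + int(seconds)


def runtime_ratio(old, new):
    """Runtime of the new version relative to the old one (1.0 means equal)."""
    return to_seconds(new) / to_seconds(old)


# (borg 1.1.10, borg 1.2.0a6) recreate times from the table above
timings = {
    "with stats": ("6:45", "12:00"),
    "w/o stats": ("6:50", "3:47"),
}

for label, (old, new) in timings.items():
    print(f"{label}: 1.2.0a6 takes {runtime_ratio(old, new):.2f}x the 1.1.10 time")
```

This gives roughly 1.78x with stats and 0.55x without, so the slowdown is confined to the stats computation.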

ThomasWaldmann commented 5 years ago

That recreate --stats performance regression looks like an interesting thing to profile and find the root cause.

"borg info" and "--stats" had some bugs in 1.1, so some things are done differently in 1.2 to give correct results now, but having twice the runtime seems a bit much.

ntova commented 5 years ago

I just checked, the relevant commit is 61b9283567b18863ef66fd624c61299da0f130fd which makes sense. All archives in my test were pre 1.2, therefore the 1.2 metadata is not available and --stats is slow. A second run of recreate gives better performance.

WARNING: same repo, but new machine, so times are not comparable to my previous comment:

version                                               time for recreate (min:sec) with --stats
1.1.10                                                3:28
1.2.0a6, old metadata                                 4:16
1.2.0a6, new metadata / second run                    2:40
1.2 before 61b9283567b18863ef66fd624c61299da0f130fd   2:15

szpak commented 5 years ago

@ThomasWaldmann Do you plan to release a7 anytime soon or it's better to build Borg from master to give it a try?

ThomasWaldmann commented 5 years ago

Yes, I should release a new alpha (and first look at / update the change log to see how much changed since the last alpha). Hopefully in the next few weeks.

szpak commented 5 years ago

I've seen the updated changes, thanks. Are there any automatically generated binary snapshot versions (for Linux) built by CI (I wasn't able to find any)? Or do I need to set up a Python development environment to build one (or wait the "next few weeks")?

ThomasWaldmann commented 5 years ago

I don't trust cloud services enough to build distribution binaries there.

I am still waiting for the cooler for the 2nd cpu for my machine to arrive (thanks to DELL for using a non-standard cpu cooler mount), then I'll run the next vagrant tests / builds there.

szpak commented 5 years ago

I was thinking just about snapshots for non-critical stuff (testing). However, potentially malicious binaries could compromise the whole system, so I can see the reasoning behind it. No need to hurry. In the meantime I have found instructions on how to use vagrant, so I will probably be able to generate a statically linked Linux binary for my own purposes.

ThomasWaldmann commented 5 years ago

Good news: coolers arrived, dual-xeon workstation now happily powering borg testing with 16C/32T. \o/

So, next alpha "coming soon".

ThomasWaldmann commented 5 years ago

1.2.0a7 released just now! please test. https://github.com/borgbackup/borg/releases/tag/1.2.0a7

bket commented 5 years ago

Just started testing 1.2.0a7 on OpenBSD. So far no surprises when:

I really like that the first ctrl-c makes a checkpoint and then aborts.

Also, regression tests run successfully.

bket commented 5 years ago

A newer version of msgpack-0.6.2 is available, which sooner or later is gonna hit OpenBSD's (and other flavours) ports tree. I tested 1.2.0a7 with msgpack-0.6.2, which seems to work ok.

After reading #3757 I understand that I need to be careful. Are there specific tests that need to be conducted before a pull request is even considered?

infectormp commented 5 years ago

@bket here is some info about msgpack compatibilities: https://github.com/borgbackup/borg/issues/4220

ThomasWaldmann commented 4 years ago

The 0.6.1 -> 0.6.2 changes are not many / big and also do not look problematic:

https://github.com/msgpack/msgpack-python/commits/master

So guess we can just accept 0.6.2 as compatible until proven otherwise.

Guess the bundled msgpack should also get upgraded to 0.6.2 (only in master branch, 1.1-maint branch is still at 0.5.x) as it contains necessary changes for bundling now in the upstream src.

henfri commented 4 years ago

Hello,

I would like to test the windows version. Would you be able to provide a compiled windows version as well? If not, instructions in the Readme_windows would be appreciated on how to use the non compiled version (no need to start at installing python ;-)

Regards, Hendrik

henfri commented 4 years ago

Hm neither?

ThomasWaldmann commented 4 years ago

@henfri ask @jrast about the current windows work.

henfri commented 4 years ago

Hello Thomas, I thought I was doing that. Do you mean in a different issue?

Greetings Hendrik

ThomasWaldmann commented 4 years ago

Mentioning him might be enough to notify him.

jrast commented 4 years ago

Hey @henfri I'm currently working (or was working last month) on CI for Windows, so there are some builds at appveyor which should work: https://ci.appveyor.com/project/jrast/borg/build/artifacts

Note that a wheel and an exe are provided. The wheel is expected to be installable with pip without any further dependencies; the exe should be a single-file executable which does not require a python installation on your machine.

The CI work is done in this PR: https://github.com/borgbackup/borg/pull/4733

skyegecko commented 4 years ago

Just leaving a comment to note that running a 1.2.0a7 client under WSL seems to work without having to invoke the workarounds env variable. I've only run a create and check so far, are there any other behaviours that are worth giving a poke?

ThomasWaldmann commented 4 years ago

@dutchgecko nobody needs create, everybody needs extract (just joking).

Also: create some archives, delete or prune some, run a full repo check after some operations.

theChaosCoder commented 4 years ago

@jrast I just tried your borg.exe. Simply calling borg.exe throws this error:

Traceback (most recent call last):
  File "borg\__main__.py", line 14, in <module>
  File "c:\projects\borg\borg-env\lib\site-packages\PyInstaller\loader\pyimod03_importers.py", line 627, in exec_module
  File "borg\__init__.py", line 5, in <module>
ModuleNotFoundError: No module named 'borg._version'
[14168] Failed to execute script __main__

ThomasWaldmann commented 4 years ago

@theChaosCoder @jrast that is a file usually generated by setuptools_scm (see setup_requires in setup.py). It cares for creating a version from git tags and also for the manifest.

jrast commented 4 years ago

@theChaosCoder can you provide the link to the exact *.exe which you have downloaded and used? This seems to be related to the build process.

theChaosCoder commented 4 years ago

I clicked on https://ci.appveyor.com/project/jrast/borg/build/artifacts and there's only one borg.exe Direct link: https://ci.appveyor.com/api/buildjobs/x169rd0r1db5xh24/artifacts/dist%2Fborg.exe

jrast commented 4 years ago

@theChaosCoder can you try again? I rebased my repo on the current master, which triggered a new build. On my machine the exe did start up as expected.

Note: Please use the Artifacts from this build to test: https://ci.appveyor.com/project/jrast/borg/builds/30084299.

theChaosCoder commented 4 years ago

Yep, this one starts with no errors.

szpak commented 4 years ago

TL;DR: I've been playing with 1.2.0a7 for a while and I haven't encountered any regression.

I have been backing up ~10GB of various data (several archives to deal with different retention) locally (to another HDD), side by side (two separate repos) with 1.1.10 for a few weeks. Occasionally I compared the content (bytes and basic metadata) of the mounted archive with the original source files (or the other backup). There were no issues with data corruption. The archives were slightly smaller in 1.2 (auto,zstd), but for 2 of them the output was bigger (however, it was stable).

The biggest improvement was of course in backing up to devices with slower random access time (pendrive, sshfs), thanks to the relaxed need to do compacting after every archive (tested with a smaller data set). With multiple archives I achieved 4-6 times shorter backup times. It's a killer feature for those environments!

m3nu commented 4 years ago

Also added the latest 1.2.0a7 as an option to BorgBase.com for testing. Just choose that version in the repo options under advanced.

I'll update it whenever I see a new release on Github.


theChaosCoder commented 4 years ago

On Windows10 x64 I tried to make a test backup with the latest win build https://ci.appveyor.com/project/jrast/borg/builds/32112933/artifacts

.\borg.exe -V
borg.exe 1.2.0a7.dev276+g8e855ecb
.\borg.exe create D:\repotest::abc D:\myfolder\
Traceback (most recent call last):
  File "borg\__main__.py", line 15, in <module>
  File "borg\archiver.py", line 4492, in main
  File "borg\archiver.py", line 4297, in get_args
  File "borg\archiver.py", line 4343, in parse_args
  File "argparse.py", line 1755, in parse_args
  File "argparse.py", line 1787, in parse_known_args
  File "argparse.py", line 1996, in _parse_known_args
  File "argparse.py", line 1952, in consume_positionals
  File "argparse.py", line 1861, in take_action
  File "argparse.py", line 1158, in __call__
  File "argparse.py", line 1787, in parse_known_args
  File "argparse.py", line 1996, in _parse_known_args
  File "argparse.py", line 1952, in consume_positionals
  File "argparse.py", line 1845, in take_action
  File "argparse.py", line 2375, in _get_values
  File "argparse.py", line 2408, in _get_value
  File "borg\helpers\parseformat.py", line 503, in validator
  File "borg\helpers\parseformat.py", line 390, in __init__
  File "borg\helpers\parseformat.py", line 396, in parse
  File "borg\helpers\parseformat.py", line 421, in _parse
NameError: name 'path' is not defined
[60140] Failed to execute script __main__

ThomasWaldmann commented 4 years ago

@theChaosCoder Maybe you want to test #4862 by @jrast ?