Open vt-alt opened 2 years ago
Hello,
Yes, development is a bit stalled at the moment due to lack of time, and I am the only developer. I intend to keep working on burp when I get some time.
Thank you for the suggestions. I don't think implementing zstd is as simple as you might think. It requires parallel threads, which I think would basically require rewriting most of the internals of burp. And that wouldn't help if you had multiple clients backing up at the same time. Actually - which part of the backup are you talking about here - phase2 or something else?
Some ideas for two of the speed issues above, if you are not doing these already:
If you have lots of small files to back up, you might want to turn off librsync (set librsync=0).
For faster restores, you might want to try using hardlinked_archive=1. When a backup is hardlinked, the restore doesn't have to apply any diffs when it comes to restoring a file, so it can just feed the bytes straight off the disk. You can see which backups are already hardlinked by standing in the client's storage directory on the server and doing an 'ls */hardlinked'.
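If you would rather script the 'ls */hardlinked' check than run it by hand, here is a minimal Python sketch of the same idea. The function name `hardlinked_backups` is made up for illustration and is not part of burp. It only assumes what is described above: each backup is a subdirectory of the client's storage directory, and a hardlinked backup contains an entry named `hardlinked`. Skipping symlinked directories is my own assumption, so that a backup reachable through a symlink is not listed twice.

```python
from pathlib import Path


def hardlinked_backups(client_dir):
    """Return the names of backup directories under client_dir that
    contain an entry called 'hardlinked' (same idea as 'ls */hardlinked')."""
    root = Path(client_dir)
    names = []
    for marker in root.glob("*/hardlinked"):
        backup_dir = marker.parent
        # Assumption: skip symlinked directories so that a backup which is
        # also reachable through a symlink is not reported twice.
        if backup_dir.is_symlink():
            continue
        names.append(backup_dir.name)
    return sorted(names)


if __name__ == "__main__":
    import sys

    # Usage: python hardlinked_backups.py <path to the client's storage directory>
    for name in hardlinked_backups(sys.argv[1]):
        print(name)
```

Any backup directory that is not printed would still need diffs applied on restore, so those are the ones where a restore is likely to be slower.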
Thanks for the reply and suggestions!
phase2 or something else?
Yes, where file transfer occurs.
Do you see the 100% cpu on the client, or server, or both?
I think if these are Windows clients, they may be hitting the Windows Task Scheduler reduced-priority issue. Please look at https://aavtech.site/2018/01/windows-task-scheduler-changing-task-priority/. After some update, Windows changed the default task priority.
One more thing @vt-alt: when the rsync library is used, low CPU and network usage can be seen on both client and server while it is finding differences, especially for large files. Data that is already duplicated is neither sent nor processed. In some cases it is faster to set the rsync library file size cutoff in the config file.
Not to blame, but here is a list of burp weaknesses we sometimes run into. (Btw, it seems development is stalled?)
Recently I wanted to restore the package database for several days, like this:
It's ~400M, but one restore takes about an hour. Plus, when I wanted to relaunch the command with 'time', I could not re-run the restore quickly because of the repository lock, and I still had to wait an hour for the server process to finish. The inability to run restores in parallel is bad.