deajan / backup-bench

Quick and dirty backup tool benchmark with reproducible results
BSD 3-Clause "New" or "Revised" License

speed #5

Open ThomasWaldmann opened 1 year ago

ThomasWaldmann commented 1 year ago

... depends on a lot of things and might be hard to compare.

just a few insights (from borg development):

So, sometimes speed == quick & dirty and slower == better / safer.

The less you do, the faster you get. The question is then if you still do enough / all that is needed.

deajan commented 1 year ago

I can definitely add the file descriptor part to the README section.

I don't get the part where borg >= 1.2 checks if a file has changed while it was backed up. Does it compare the backed-up file to its last state while doing backups? What if the file changes continuously?

So, sometimes speed == quick & dirty and slower == better / safer.

This statement is something I can live by, except for parked files like qcow files with external snapshots, which will never change while being backed up (actual thing I do with borg as of today).

ThomasWaldmann commented 1 year ago

The "changed while backup" check only detects that there might be a problem; it does not avoid it (as a snapshot would).

In some cases, it might not be an issue (e.g. a log file growing by a line at the end), but in other cases it might warn the user of an issue (e.g. if you back up some sort of database and the file changes internally while you back it up, the file as read by borg could then be internally inconsistent).
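
To illustrate the idea of detecting (not preventing) a change during a read, here is a minimal Python sketch. It is not borg's implementation: the function names `stat_signature` and `read_with_change_check` are hypothetical, and comparing size, mtime and inode is an assumption about what a reasonable check could look at.

```python
import os


def stat_signature(fd):
    """Return a tuple of metadata that is likely to differ if the file was modified."""
    st = os.fstat(fd)
    return (st.st_size, st.st_mtime_ns, st.st_ino)


def read_with_change_check(path, chunk_size=1 << 20):
    """Read a whole file and report whether its metadata changed during the read.

    Returns (data, changed). A True value only signals a possible
    inconsistency; the data has already been read at that point.
    """
    chunks = []
    with open(path, "rb") as f:
        before = stat_signature(f.fileno())
        while True:
            chunk = f.read(chunk_size)
            if not chunk:
                break
            chunks.append(chunk)
        after = stat_signature(f.fileno())
    return b"".join(chunks), before != after
```

A caller would typically log a warning when `changed` is true and let the user decide whether the file (for example a database) needs a snapshot or a freeze before the next run.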

deajan commented 1 year ago

Thanks for the clarification. This makes me think of pre-freeze and post-thaw scripts for databases ;)

I'll add a "backup coherence" entry in the table which I can link to this discussion.

Just a side question: when using the borg CLI, will there be a specific exit code in those cases, or must the output be parsed to find out whether a file changed while being backed up?

ThomasWaldmann commented 1 year ago

Currently there are only a few exit codes, and it is hard to map warnings to exit codes (because there can be multiple different warnings), so one needs to read the log output.
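
Reading the log output from a wrapper script could look roughly like the Python sketch below. It is generic on purpose: the helper name `run_and_collect_warnings` is hypothetical, and both the command and the `marker` substring are supplied by the caller, since the exact wording of the warning is not given in this thread and should be taken from real log output.

```python
import subprocess


def run_and_collect_warnings(cmd, marker):
    """Run a command and return (exit_code, output lines that contain marker).

    cmd is an argument list as accepted by subprocess.run.
    marker is a plain substring chosen by the caller (an assumption here).
    """
    proc = subprocess.run(cmd, capture_output=True, text=True)
    # Warnings may be written to either stream, so scan both.
    output = proc.stdout + proc.stderr
    hits = [line for line in output.splitlines() if marker in line]
    return proc.returncode, hits
```

The exit code is returned alongside the matching lines, so a caller can still tell a clean run from a failed one while using the log lines to identify which files were affected.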

deajan commented 1 year ago

I added a new benchmark with qemu disk images (see the latest README.md). I noticed that borg performs quite well for that use case, whereas backing up the Linux kernel source files is not that great in terms of speed. Is that explained by your statements above about open() and stat()?

ThomasWaldmann commented 1 year ago

Could be, because if you have a lot of small files, the per-file overhead has a much bigger effect than for a few big files.
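
The per-file overhead can be made visible with a rough Python sketch that reads the same number of bytes once as many small files and once as a single big file, doing one `stat()` and one `open()` per file. This is not part of backup-bench; the function names and default sizes are assumptions for illustration, and because the files were just written they will mostly be served from the page cache, so the timings reflect syscall overhead more than disk speed.

```python
import os
import tempfile
import time


def write_files(directory, count, size):
    """Create `count` files of `size` bytes each and return their paths."""
    os.makedirs(directory, exist_ok=True)
    paths = []
    for i in range(count):
        path = os.path.join(directory, f"f{i:06d}.bin")
        with open(path, "wb") as f:
            f.write(b"\0" * size)
        paths.append(path)
    return paths


def timed_read(paths):
    """stat() and fully read every path; return (bytes_read, seconds)."""
    start = time.perf_counter()
    total = 0
    for path in paths:
        os.stat(path)
        with open(path, "rb") as f:
            total += len(f.read())
    return total, time.perf_counter() - start


def compare(total_bytes=4 * 1024 * 1024, small_count=2048):
    """Read total_bytes as many small files, then as one big file."""
    small_size = total_bytes // small_count
    with tempfile.TemporaryDirectory() as d:
        small = write_files(os.path.join(d, "small"), small_count, small_size)
        big = write_files(os.path.join(d, "big"), 1, small_size * small_count)
        small_bytes, small_secs = timed_read(small)
        big_bytes, big_secs = timed_read(big)
    return small_bytes, small_secs, big_bytes, big_secs


if __name__ == "__main__":
    sb, ss, bb, bs = compare()
    print(f"small files: {sb} bytes in {ss:.4f}s")
    print(f"one big file: {bb} bytes in {bs:.4f}s")
```

Both runs read the same amount of data, so any difference in elapsed time comes from the work done per file rather than per byte.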