borgbackup / borg

Deduplicating archiver with compression and authenticated encryption.
https://www.borgbackup.org/

--recompress option is counter intuitive #5154

Closed velleto closed 4 years ago

velleto commented 4 years ago

Have you checked borgbackup docs, FAQ, and open GitHub issues?: Yes
Is this a BUG / ISSUE report or a QUESTION?: Issue/Question
Borg version: 1.1.11
(Other information omitted since not relevant to issue)

I appreciate that recreate is an experimental feature and is likely to undergo heavy development until full release (I'm looking forward to the proposed changes in #3631 in particular!). However, the following is a UI issue which is perhaps best fixed whilst recreate is still marked "experimental" and backwards compatibility is not as important.

Currently the default MODE when --recompressing is never. This is so counterintuitive that even the documentation gets it wrong:

In the documentation (man borg-recreate) under Examples it currently states that

# Create a backup with little but fast compression
$ borg create /mnt/backup::archive /some/files --compression lz4
# Then compress it - this might take longer, but the backup has already completed,
# so no inconsistencies from a long-running backup job.
$ borg recreate /mnt/backup::archive --recompress --compression zlib,9

however, according to discussions in #3617, and the merge #3676 (which reflects current documentation), I believe the second borg command would not recompress the data (since the default MODE is never)? I think the example should be

$ borg recreate /mnt/backup::archive --recompress if-different --compression zlib,9

If I have understood correctly, there is no difference between if-different and always in this context because the previous compression was lz4 and now it is zlib?

To mitigate this, I would suggest that borg should throw an error if no MODE is specified.

In fact, it might be worth going one step further, and removing the never MODE entirely. I'm not sure in which context

$ borg recreate $REPO::$ARCHIVE --recompress never [...]

makes sense? What is the difference between this, and

$ borg recreate $REPO::$ARCHIVE [...]

? Perhaps it is to underline the fact that if an archive is recreated without --compression, then the previous compression algorithms and levels are used? If so, I think this should be documented more clearly.

However, I appreciate that, for backwards compatibility (even as an experimental feature), this may not be feasible. As a compromise, I would create a warning in the style of the experimental feature warning:

recreate is an experimental feature.
WARNING: Currently borg will not perform any recompression even though '--compression' was passed since no MODE was given.
Type 'YES' if you understand this and want to continue: YES

The rationale for the proposed changes is that, until the changes in #3631 are implemented, it could be a very time-wasting endeavor to run recreate --recompress --compression L[,C] only to find out ~24 hrs later that no compression took place.

Further, I would stress the importance of the MODE in the first line of the description. That is, change

recompress data chunks according to --compression

to

recompress data chunks according to MODE and --compression

Thank you for your hard work on this project.

(Edited for clarification and wording)

ThomasWaldmann commented 4 years ago

We use these argparse options for --recompress:

nargs='?', default='never', const='if-different',
choices=('never', 'if-different', 'always')

For "const", see the argparse documentation (2nd item in the list there, see also nargs):

https://docs.python.org/3/library/argparse.html#const

So, default here means the value that is set if --recompress is not given on the command line at all, and that obviously should be never.

const is the value that is set if --recompress is given without a specific mode after it. if-different is a good value for that because it will only recompress if the original chunk's compression algorithm is different from the desired compression algorithm.
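
The default/const behaviour described above can be reproduced with a small standalone parser. This is only a sketch: the option settings are the ones quoted in this comment, while the program name "recreate-demo", the metavar, and the helper names build_parser and recompress_mode are made up for illustration and are not part of borg.

```python
import argparse


def build_parser():
    # Hypothetical stand-in parser; only the --recompress settings
    # (nargs, default, const, choices) are taken from the comment above.
    parser = argparse.ArgumentParser(prog="recreate-demo")
    parser.add_argument(
        "--recompress",
        metavar="MODE",
        nargs="?",               # the MODE value is optional
        default="never",         # used when --recompress is absent
        const="if-different",    # used when --recompress has no MODE
        choices=("never", "if-different", "always"),
    )
    return parser


def recompress_mode(argv):
    # Return the effective mode for a given argument list.
    return build_parser().parse_args(argv).recompress


if __name__ == "__main__":
    print(recompress_mode([]))                          # never
    print(recompress_mode(["--recompress"]))            # if-different
    print(recompress_mode(["--recompress", "always"]))  # always
    print(recompress_mode(["--recompress", "never"]))   # never
```

So the documentation example with a bare --recompress resolves to if-different, not never.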

So, this looks correct to me, am I missing something?

ThomasWaldmann commented 4 years ago

So, to summarize: there is nothing wrong, but we could make the docs clearer, so they cannot be misunderstood (just document all 4 cases).
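
As an aid for such documentation, the meaning of the three modes as discussed in this thread can be written down as a small decision function. This is a hypothetical sketch, not borg's actual code: the function name should_recompress and its parameters are invented here, and it assumes that if-different compares only the compression algorithm, as stated above.

```python
def should_recompress(mode, current_algo, desired_algo):
    # Hypothetical illustration of the three MODE values; not borg code.
    if mode == "never":
        # Existing chunks are left as they are.
        return False
    if mode == "always":
        # Every chunk is recompressed, whatever its current algorithm.
        return True
    if mode == "if-different":
        # Assumption: only the algorithm name is compared.
        return current_algo != desired_algo
    raise ValueError(f"unknown mode: {mode!r}")


if __name__ == "__main__":
    # The lz4 -> zlib case from the original question:
    for mode in ("never", "if-different", "always"):
        print(mode, should_recompress(mode, "lz4", "zlib"))
```

For an archive created with lz4 and recreated with zlib, if-different and always both come out True under these assumptions; the two modes only differ for chunks that already use the desired algorithm.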

ThomasWaldmann commented 4 years ago

@velleto did you see my feedback?

velleto commented 4 years ago

Yes. Thank you for responding so quickly.

I stand corrected: you are absolutely right. I am updating the PR right now.