Open robbat2 opened 7 years ago
In duplicacy_chunk.go it looks like the preferences file takes a "compression-level" value. It seems to go like:
- -1: default zlib compression
- 0: no compression
- 9: best zlib compression
- 100: LZ4
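If that reading is right, the level-to-algorithm mapping could be sketched roughly like this (a hypothetical illustration of the mapping above, not the actual duplicacy_chunk.go code):

```go
package main

import "fmt"

// describeCompressionLevel is a hypothetical sketch of how the
// "compression-level" preference might map to an algorithm,
// based on the values listed above.
func describeCompressionLevel(level int) string {
	switch {
	case level == 100:
		return "LZ4"
	case level == 0:
		return "no compression"
	case level == -1:
		return "zlib default"
	case level >= 1 && level <= 9:
		return fmt.Sprintf("zlib level %d", level)
	default:
		return "unknown"
	}
}

func main() {
	for _, l := range []int{-1, 0, 9, 100} {
		fmt.Printf("%d -> %s\n", l, describeCompressionLevel(l))
	}
}
```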
But I haven't tried it.
Also, the docs say that the default compression level is -1, but it might actually be 100 (LZ4).
Prior to version 1.2 you could set the compression level (using the standard zlib numbers 0-9 or -1) when initializing the storage. However, in version 1.2 I decided to switch to LZ4 for compression and blake2 for hashing (instead of SHA256), mostly for performance. Therefore, a somewhat arbitrary level of 100 is used to indicate the use of both LZ4 and blake2. I naively believed that LZ4 was so much faster that there would be no need for other options, so the compression level option was removed.
Obviously I was wrong and the compression level option should be added back to the init command. The good news is, it is super easy to introduce new compression algorithms (for instance it was just a few lines of code to support LZ4).
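One way to keep new algorithms easy to add is a small registry keyed by compression level. A minimal sketch using only the standard library's compress/flate; the `Compressor` interface and registry here are hypothetical illustrations, not duplicacy's actual code:

```go
package main

import (
	"bytes"
	"compress/flate"
	"fmt"
	"io"
)

// Compressor is a hypothetical interface for pluggable compression
// algorithms. Adding a new algorithm means one implementation plus
// one registry entry.
type Compressor interface {
	Compress(data []byte) ([]byte, error)
	Decompress(data []byte) ([]byte, error)
}

type flateCompressor struct{ level int }

func (c flateCompressor) Compress(data []byte) ([]byte, error) {
	var buf bytes.Buffer
	w, err := flate.NewWriter(&buf, c.level)
	if err != nil {
		return nil, err
	}
	if _, err := w.Write(data); err != nil {
		return nil, err
	}
	if err := w.Close(); err != nil {
		return nil, err
	}
	return buf.Bytes(), nil
}

func (c flateCompressor) Decompress(data []byte) ([]byte, error) {
	r := flate.NewReader(bytes.NewReader(data))
	defer r.Close()
	return io.ReadAll(r)
}

// registry maps a compression-level value to an algorithm.
var registry = map[int]Compressor{
	-1: flateCompressor{level: flate.DefaultCompression},
	9:  flateCompressor{level: flate.BestCompression},
}

func main() {
	c := registry[9]
	compressed, _ := c.Compress([]byte("hello hello hello hello"))
	original, _ := c.Decompress(compressed)
	fmt.Println(string(original))
}
```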
Please suggest the compression algorithms that you think should be supported (besides the no compression option).
LZ4 is fine for the cases where I do want compression, but I can see that some people might want something like Snappy for bounded compression time.
I'd like a high-compression option, like LZMA or xz. When we back up to these cloud services, they charge per month to keep the data there, which adds up to a lot after a few years. And some cloud storage services also have high costs for downloading your data.
Thanks
+1 for control of compression. I'm using a raspberry pi and an external drive at an offsite location with particularly fast upload to seed my home backups, as pushing them through my home connection from scratch would take about 18 months. A lot of it is already compressed or encoded in one way or another, and I'm backing up to Backblaze B2, which is ultra-cheap. I'd rather have the time/throughput performance than save a few megs here and there with compression.
To dev: `zstd --long` seems to work better in terms of the speed/compression trade-off; a possible golden mean?
To cloud junkies: a free plan usually covers preserving the essential bits; the rest is just a reluctance to sort through your data.
zstd support would be great to have.
@gilbertchen any updates on the plans here?
Any modern compression algorithm is smart enough to automatically fall back to store-only mode for incompressible streams.
The LZ4 implementation used in this project does it here:
https://github.com/bkaradzic/go-lz4/blob/7224d8d8f27ef618c0a95f1ae69dbb0488abc33a/writer.go#L138
There is no native Go port of zstd, and linking against the C library seems like a poor idea.
It seems there is now a native go implementation of zstd: https://github.com/klauspost/compress/tree/master/zstd
One of the sets of content I need to back up is already maximally compressed with XZ, and it makes no sense to try further compressing the chunks with LZ4.
The snapshot data should record the compression format (if any) of the chunks, and permit compression to be entirely optional. This also provides future-proofing for the next great compression breakthrough, and re-compressing existing backups.