YunoHost-Apps / archivist_ynh

GNU General Public License v3.0

Feature request : add compression #28

Closed lapineige closed 3 years ago

lapineige commented 3 years ago

Problem: Since YunoHost 4.1, backups are no longer compressed (https://forum.yunohost.org/t/yunohost-4-1-release-sortie-de-yunohost-4-1/13893). We can still compress them afterwards, but that adds some manual extra work.

For regular automated backups, this can use a significant amount of storage space, or require manual work to avoid it (in which case this app would no longer be as "install & forget" as it was before).

Proposal: Would it be possible to add an option to compress backups? I'm willing to help test it :)

Benefits: this would save a lot of storage space without requiring users to regularly think about it and spend time compressing backups manually from the CLI.

Additional proposal: use Zstandard compression instead of Gzip to reduce CPU usage and storage space (detailed proposal here: https://forum.yunohost.org/t/backups-implementing-zstandard-compression-instead-of-gzip/13379)

maniackcrudelis commented 3 years ago

Actually, the app won't work if the backups are not compressed: it expects compressed archives. I use this app myself, so I'll work on it.

However, the change belongs not here but in this repo: https://github.com/maniackcrudelis/archivist/blob/master/archivist.sh#L360

maniackcrudelis commented 3 years ago

Had a quick look: compressing a backup is not an option on the yunohost command line... Nice to see that you have a choice...

You can, however, still compress all your backups with sudo yunohost settings set backup.compress_tar_archives -v true, even if it's not advertised... Anyway...

Since the backup now produces an uncompressed tarball, it will be easy for the app to compress it afterwards. I'll do it this afternoon; it should be easy to do.
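To illustrate what "compress it afterward" could look like, here is a minimal sketch. It is not the actual archivist.sh code: the function name `compress_backup` is hypothetical, and gzip is used only because it is available almost everywhere.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not taken from archivist.sh): compress an existing
# uncompressed tarball in place and print the path of the result.
compress_backup() {
    local tarball="$1"
    if [ ! -f "$tarball" ]; then
        echo "No such file: $tarball" >&2
        return 1
    fi
    # gzip replaces backup.tar with backup.tar.gz; -f overwrites a stale .gz
    gzip -f "$tarball" || return 1
    echo "${tarball}.gz"
}
```

Calling `compress_backup /path/to/backup.tar` leaves `/path/to/backup.tar.gz` in its place and prints that path.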

lapineige commented 3 years ago

Had a quick look: compressing a backup is not an option on the yunohost command line... Nice to see that you have a choice...

My proposal was to make it optional instead of removing it. I don't know why this choice was made. Maybe @YunoHost-Apps/apps-group has some information on this?

If I understand you correctly here:

You can, however, still compress all your backups with sudo yunohost settings set backup.compress_tar_archives -v true, even if it's not advertised...

If we set that option, will it keep compressing the archives?

Since the backup now produces an uncompressed tarball, it will be easy for the app to compress it afterwards.

I would strongly suggest not compressing it as a .gz file (see the proposal linked above), as .zst compression is a lot faster (hence lighter on the CPU) and a bit more storage-efficient. But I would understand if dealing with both (legacy) .gz and .zst files is too much of a pain.

maniackcrudelis commented 3 years ago

If we set that option, will it keep compressing the archives?

Indeed, but for all your backups, and... for how long before it's removed as well...

I would strongly suggest not compressing it as a .gz file

Since I'm about to make it optional anyway, I can just as well let the user choose which compression to use.
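A rough sketch of how a user-selectable algorithm might be wired up, assuming the six tools mentioned in this thread. The function names `compression_suffix` and `compress_with` are made up for illustration and are not the names used in the actual pull request. Every listed tool accepts `-c` to write to standard output, which sidesteps the fact that some of them (zstd, lzop) keep the source file by default while others delete it.

```shell
#!/usr/bin/env bash
# Hypothetical mapping from an algorithm name to its usual file suffix.
compression_suffix() {
    case "$1" in
        gzip)  echo "gz"   ;;
        bzip2) echo "bz2"  ;;
        xz)    echo "xz"   ;;
        lzma)  echo "lzma" ;;
        lzop)  echo "lzo"  ;;
        zstd)  echo "zst"  ;;
        *)     echo "Unknown compression algorithm: $1" >&2; return 1 ;;
    esac
}

# Compress a tarball with the chosen algorithm (gzip when none is given),
# remove the uncompressed original, and print the new path.
compress_with() {
    local algo="${1:-gzip}" tarball="$2" suffix
    suffix=$(compression_suffix "$algo") || return 1
    if ! command -v "$algo" >/dev/null 2>&1; then
        echo "$algo is not installed" >&2
        return 1
    fi
    if ! "$algo" -c "$tarball" > "${tarball}.${suffix}"; then
        rm -f "${tarball}.${suffix}"   # do not leave a partial archive behind
        return 1
    fi
    rm -- "$tarball"
    echo "${tarball}.${suffix}"
}
```

For example, `compress_with zstd backup.tar` would produce `backup.tar.zst`, provided zstd is installed.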

lapineige commented 3 years ago

While I'm about to make it optional anyway, I can as well let the user choose which compression to use.

Oh, then it's a big :tada: :D

maniackcrudelis commented 3 years ago

Done in https://github.com/maniackcrudelis/archivist/pull/12

I already tried all the different algorithms with both YunoHost and a directory. It works perfectly.

However, I did not try the part with [$ynh_compression_suffix|$files_compression_suffix] at the end.

If you can try it, it would be wonderful :grin:

lapineige commented 3 years ago

Nice idea to document the pros and cons of every algorithm 👍

Note: xz is faster than lzma

To be honest, I would add a recommendation for zstd in the documentation, but keep gzip as the default (for compatibility reasons).

However, I did not try the part with [$ynh_compression_suffix|$files_compression_suffix] at the end.

I don't get it… what is this supposed to do?

maniackcrudelis commented 3 years ago

Note: xz is faster than lzma

Sounds logical. I'll add it.

To be honest, I would add a recommendation for zstd in the documentation, but keep gzip as the default (for compatibility reasons).

I know you like it ;) And as explained, zstd is faster than gzip, so clearly better. Yet I'd rather let everyone choose for themselves and keep gzip as the default so we're sure it works. I really do think it depends on your system; personally, I'll probably go with bzip2.

lapineige commented 3 years ago

Yet I'd rather let everyone choose for themselves and keep gzip as the default so we're sure it works.

It would be a recommendation for people who don't know much about those algorithms and won't spend time learning which one is best for them (it's not easy to understand, it's time-consuming, and not that interesting after all ^^). It's kind of a mess, so it would help to guide people by letting them know what a good "general purpose, zero-knowledge" choice is.

I really do think it depends on your system

I don't really think so. I think it's more of a personal choice depending on your preference between speed/CPU load and compression ratio. But among all the "fast" algorithms, Zstd is probably the best one: it is lightweight and almost as fast as the fastest ones, but with a much better compression ratio. Compared to Gzip it's 4-5 times faster, which is huge, especially for small servers…
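Since the trade-off between speed and compression ratio depends on both the data and the machine, one way to settle it is to measure on a real backup. The sketch below is only an illustration: `benchmark_compressors` is a hypothetical name, each tool runs with its default level, and tools that are not installed are skipped. Timing uses bash's `SECONDS` counter, so it is only accurate to the second and needs a reasonably large file to be meaningful.

```shell
#!/usr/bin/env bash
# Hypothetical benchmark: for each installed compressor, print
# "<tool> <elapsed seconds> <compressed size in bytes>".
benchmark_compressors() {
    local file="$1" tool start size
    for tool in gzip bzip2 xz lzop zstd; do
        command -v "$tool" >/dev/null 2>&1 || continue
        start=$SECONDS
        # Compress to stdout and count the bytes; nothing is written to disk.
        size=$("$tool" -c "$file" | wc -c | tr -d ' ')
        echo "$tool $((SECONDS - start)) $size"
    done
}
```

Running `benchmark_compressors /path/to/backup.tar` on the target server gives one line per tool that can be compared directly.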

maniackcrudelis commented 3 years ago

I think it does actually depend on your system. I wouldn't make the same choice on a Raspberry Pi, an old computer or a powerful VPS.

As for me, I'd rather choose bzip2, because I don't mind my server taking a long time to do it, but I'd rather not stress it too much. Also, I'm confident it won't take too long, as I have a big enough processor.

But on a VPS, you may prefer xz: it's effective, and you don't really care that the VPS makes a lot of noise in the middle of the night because of a compression job. You may also want to use lzop if you mainly have a lot of media to back up; it won't bother trying to compress much.

As for choosing between gzip and zstd: if you don't know, it's better to use gzip, as it's a standard, and when you need to untar the archive, it's easier to find out how. Now, if you care to have a look at the comment and dare to change the default, you would probably choose zstd, as the comment says it's the same but better.

As for guiding people, that's the purpose of that comment, which you read before choosing which algorithm to use. Now, if you want to add more pros for zstd, we can, as long as we do the same job for all algorithms ;)
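On the "how do I untar it" concern, a small sketch of restoring an archive according to its suffix, whichever algorithm was chosen. `extract_backup` is a hypothetical helper, not part of archivist; it decompresses explicitly and pipes into tar so that it does not rely on tar detecting the format by itself.

```shell
#!/usr/bin/env bash
# Hypothetical helper: extract a (possibly compressed) tarball into a
# destination directory, picking the decompressor from the file suffix.
extract_backup() {
    local archive="$1" dest="$2" tool
    case "$archive" in
        *.tar.gz)   tool=gzip  ;;
        *.tar.bz2)  tool=bzip2 ;;
        *.tar.xz)   tool=xz    ;;
        *.tar.lzma) tool=lzma  ;;
        *.tar.lzo)  tool=lzop  ;;
        *.tar.zst)  tool=zstd  ;;
        *.tar)      tool=""    ;;
        *) echo "Unrecognized archive type: $archive" >&2; return 1 ;;
    esac
    mkdir -p "$dest" || return 1
    if [ -z "$tool" ]; then
        tar -xf "$archive" -C "$dest"
    else
        # -d decompresses, -c writes to stdout; tar reads the stream from stdin
        "$tool" -dc "$archive" | tar -xf - -C "$dest"
    fi
}
```

For example, `extract_backup backup.tar.zst /tmp/restore` requires zstd to be installed, while a `.tar.gz` archive only needs gzip.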

maniackcrudelis commented 3 years ago

@lapineige It's done in https://github.com/maniackcrudelis/archivist/pull/12. I already tried it on a virtual machine, and it works as expected.

If you have some time to give it a try before I merge and update the YunoHost app, that would be wonderful.