freedomofpress / dangerzone

Take potentially dangerous PDFs, office documents, or images and convert them to safe PDFs
https://dangerzone.rocks/
GNU Affero General Public License v3.0

Consider switching from gzip to lzma #663

Open apyrgio opened 6 months ago

apyrgio commented 6 months ago

Consider changing the compression algorithm for our container image from Gzip to LZMA. Currently, our container image (1,273.2 MiB uncompressed, ~1.2 GiB) gets compressed down to ~620 MiB with Gzip. With LZMA, we can reduce it further to ~500 MiB:

LZMA compression (default level) -> 503.9 MiB in 7 minutes

lzma --keep --verbose share/container.tar
share/container.tar (1/1)
  100 %     503.9 MiB / 1,273.2 MiB = 0.396   3.0 MiB/s       7:00

LZMA compression (best level) -> 495.4 MiB in 9 minutes

lzma --keep --verbose --best share/container.tar
share/container.tar (1/1)
  100 %     495.4 MiB / 1,273.2 MiB = 0.389   2.4 MiB/s       9:00

LZMA decompression in 24 seconds

unlzma --keep --verbose share/container.tar.lzma
share/container.tar.lzma (1/1)
  100 %     503.9 MiB / 1,273.2 MiB = 0.396    52 MiB/s       0:24
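
For anyone who wants to reproduce this kind of comparison without the CLI tools, here is a minimal sketch using Python's standard `gzip` and `lzma` modules. The `compare` function and the sample payload are hypothetical and only illustrate the measurement. A real run would stream `share/container.tar` from disk instead of holding it in memory, and the numbers will not match the CLI figures above exactly (for example, Python's `gzip.compress` defaults to level 9, while the `gzip` CLI defaults to level 6).

```python
import gzip
import lzma
import time


def compare(data: bytes) -> dict:
    """Compress `data` with gzip and LZMA and time the LZMA decompression.

    Hypothetical helper for illustration; not part of Dangerzone.
    """
    # Python's gzip.compress uses compresslevel=9 by default.
    gz = gzip.compress(data)

    # FORMAT_ALONE is the legacy .lzma container, i.e. what the `lzma`
    # command writes (as opposed to the newer .xz container).
    lz = lzma.compress(data, format=lzma.FORMAT_ALONE)

    # Time only the decompression step, which is what end users pay for.
    start = time.perf_counter()
    restored = lzma.decompress(lz, format=lzma.FORMAT_ALONE)
    elapsed = time.perf_counter() - start

    # Sanity check: the round trip must be lossless.
    assert restored == data

    return {
        "original": len(data),
        "gzip": len(gz),
        "lzma": len(lz),
        "lzma_decompress_seconds": elapsed,
    }


if __name__ == "__main__":
    # Assumption: a small, repetitive stand-in payload. A container image
    # will compress very differently from this.
    sample = b"dangerzone container layer\n" * 50_000
    for key, value in compare(sample).items():
        print(f"{key}: {value}")
```

The sketch keeps compression and decompression timing separate because, as the numbers above suggest, LZMA is slow to compress (a one-time cost at build time) but comparatively quick to decompress (the cost paid on each install).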

This ties in well with the discussion about switching to a container image that can further reduce the download size (see https://github.com/freedomofpress/dangerzone/issues/658#issuecomment-1861239821).