AppImage / appimagetool

A low-level tool to generate an AppImage from an existing AppDir

Regression? higher startup time with zstd compression #64

Closed Samueru-sama closed 1 month ago

Samueru-sama commented 1 month ago

I've been testing this appimagetool now that the static runtime is used automatically. I like that I can select the zstd compression level to use. However, during testing I've noticed that appimages made with this appimagetool take longer to launch than appimages made with gzip compression.

For these benchmarks I've used the zen-browser appimage (the specific build, which targets x86_64 v3).

Using appimagetool I made two appimages, one using zstd level 22 compression (the highest):

    ./appimagetool-x86_64.AppImage --comp zstd --mksquashfs-opt -Xcompression-level --mksquashfs-opt 22 squashfs-root

which results in an appimage size of 73.7 MiB.

And another using level 1:

    ./appimagetool-x86_64.AppImage --comp zstd --mksquashfs-opt -Xcompression-level --mksquashfs-opt 1 squashfs-root

which results in an appimage size of 96.0 MiB.

I then measured how long each appimage took to start, using this hacky script, which detects when the application has launched by checking the window class in i3wm: https://pastebin.com/qH7DwcQv

The appimages were all placed in RAM before being launched, to rule out the SSD as a factor. Here are the results:

Screenshot: https://imgur.com/xzMirgX.png

First, this is how long the original appimage takes to start, over 4 passes:

Time taken: 2359 miliseconds
Time taken: 2626 miliseconds
Time taken: 2257 miliseconds
Time taken: 2755 miliseconds

This is how long the zstd-22 appimage takes to start. It is expected to be slower, since it uses a very high compression level:

Time taken: 3534 miliseconds
Time taken: 3571 miliseconds
Time taken: 3688 miliseconds
Time taken: 3767 miliseconds

This is how long it takes with zstd-1. Oddly, it still takes longer than the original appimage:

Time taken: 2975 miliseconds
Time taken: 3116 miliseconds
Time taken: 2962 miliseconds
Time taken: 3130 miliseconds

The reason I think there is a regression is that if we make the zstd squashfs image directly and then append it to the runtime (bypassing appimagetool altogether), the appimage takes only ~1.6 seconds to start, as expected.

Time taken: 1603 miliseconds
Time taken: 1619 miliseconds
Time taken: 1596 miliseconds
Time taken: 1632 miliseconds

That "manual" appimage was made using mksquashfs ./squashfs-root Manual-zstd-1 -comp zstd -Xcompression-level 1; then, using the runtime from here, I appended the image to it with cat Manual-zstd-1 >> runtime-x86_64

I don't think this is an issue with the runtime. I don't know whether the issue is that appimagetool uses a different block size when making the squashfs image, but there is a 1.4 MiB size difference between the manual-zstd1 appimage and the zstd1 appimage made with appimagetool.
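The two manual steps described above can be wrapped in a small helper. This is only a sketch: the function name `build_manual_appimage` and its arguments are hypothetical, it writes the result to a separate output file instead of appending to the runtime in place, and the final `chmod +x` is an added assumption so the result can be executed directly.

```shell
#!/bin/sh
# Sketch: build an AppImage by hand, bypassing appimagetool.
# Usage: build_manual_appimage <AppDir> <runtime file> <output file>
build_manual_appimage() {
    appdir=$1
    runtime=$2
    out=$3

    # Step 1: create the squashfs image with zstd level 1. No -b option is
    # given, so mksquashfs uses its default block size (128K).
    mksquashfs "$appdir" "$out.squashfs" -comp zstd -Xcompression-level 1

    # Step 2: concatenate runtime + image. Equivalent to appending the image
    # to the runtime, but leaves the original runtime file untouched.
    cat "$runtime" "$out.squashfs" > "$out"

    # Assumption: mark the result executable.
    chmod +x "$out"
}
```

With the names used in this thread, a call would look like `build_manual_appimage ./squashfs-root runtime-x86_64 Manual-zstd-1.AppImage`, where the output name is illustrative.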
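For a quicker comparison, the four sets of timings reported above can be averaged. This is a minimal sketch that only does arithmetic on the numbers already posted; the `mean` helper is a hypothetical name, and four passes per build is a small sample.

```shell
#!/bin/sh
# Print the arithmetic mean of the arguments, with two decimals.
mean() {
    printf '%s\n' "$@" | awk '{ s += $1 } END { printf "%.2f\n", s / NR }'
}

# Startup times in milliseconds, copied from the four runs of each build.
echo "original:      $(mean 2359 2626 2257 2755) ms"
echo "zstd-22:       $(mean 3534 3571 3688 3767) ms"
echo "zstd-1:        $(mean 2975 3116 2962 3130) ms"
echo "manual zstd-1: $(mean 1603 1619 1596 1632) ms"
```

On these numbers the manual zstd-1 build averages roughly half the startup time of the zstd-1 build made with appimagetool.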

Samueru-sama commented 1 month ago

The issue is that appimagetool sets the block size to 1MB here

What's the default block size of mksquashfs? EDIT: It is 128K, and manually setting it that way fixes the slow startup time as well.

I made another manual appimage, this time passing the argument -b 1MB to mksquashfs, and the resulting appimage now also takes longer to start.

Here is a test showing the two; the difference in startup time is quite big:

image

What's interesting is that the -b 1M appimage is 1.4 MiB bigger than the same appimage built without specifying the block size, even though the comment in the code says that a larger block size could produce a smaller appimage.

This mentions the following:

squashfs filesystems are made using mksquashfs. -b specifies the block size. 131072 (128k) is the default size. A larger block size will usually result in slightly better compression, but the read speed can be worse. The default block size is a good choice.
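The read-speed cost mentioned in the quote can be illustrated with simple arithmetic. SquashFS compresses file data in independent blocks, so reading any part of a block means decompressing the whole block. The sketch below assumes a worst-case, uncached 4 KiB read; the 4 KiB figure and the `read_amplification` helper are illustrative assumptions, not measurements, and real startup cost also depends on caching and access patterns.

```shell
#!/bin/sh
# How many times more data must be decompressed than was actually requested,
# assuming the whole block is decompressed to serve one small read.
read_amplification() {
    block_bytes=$1
    read_bytes=$2
    echo $(( block_bytes / read_bytes ))
}

# Block sizes tested in this thread: 64K, 128K (default), 512K, 1M.
for block in 65536 131072 524288 1048576; do
    echo "$(( block / 1024 ))K block: $(read_amplification "$block" 4096)x per 4 KiB read"
done
```

Under these assumptions a 1M block decompresses eight times more data per small read than the default 128K block, which is consistent in direction with the slower startup observed here.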

Samueru-sama commented 1 month ago

I have also tested some different block sizes. For some reason a 4M block size fails to be made, even though it should be possible?

512K takes 2.3s on average to start. 64K is the same as default (128K), at most there is a 50ms difference in favor of 64K.

@mgord9518 Sorry for the ping, you are the author of the original switch to 1MB block size, what do you think of my findings?

If anyone wants the appimages I've made for testing, just let me know and I'll link them. There are also these builds of appimagetool that use the default 128K block size, for comparison.

mgord9518 commented 1 month ago

@Samueru-sama Interesting. Yeah it is true that a larger block size can increase start time, but it should almost always improve compression ratio. I don't believe 4MiB is yet possible, SquashFS currently has a hardcoded 1MiB max but they're talking about upping it in the future. The format has the capability for higher block sizes (either 8 or 16MiB), it just isn't in the spec.

Back when I pushed it, I had done some small-scale tests which reflected a negligible startup difference but a 5-10% shrink in size. Unfortunately that was a while ago and I don't even remember what AppImages I tested it on. Have you tested with other compression algorithms? I vaguely remember ZSTD having bad performance in some tests

Samueru-sama commented 1 month ago

@mgord9518 I didn't test other algos and it seems like zstd is now the only available algo in this appimagetool, xz and gzip error out.

probonopd commented 1 month ago

Yes, we want to standardize on zstandard. Unless someone has hard proof that something else would be better in most cases.

Samueru-sama commented 1 month ago

> Yes, we want to standardize on zstandard. Unless someone has hard proof that something else would be better in most cases.

Well, with the current block size gzip is a demonstrably better algo: even zstd-1, which makes the appimage bigger, still takes longer to start than gzip.

probonopd commented 1 month ago

Thanks @TheAssassin for looking into this.

Someone imho still should systematically(!) test all of this (= compression algorithm, compression factor, block size), in terms of

so that we can choose the best tradeoff.

And then we should publish at least a recommendation in AppImageSpec.
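As a starting point for such a systematic test, the parameter grid could be enumerated like this. This is only a sketch: it prints mksquashfs command lines rather than running them, covers zstd only, and takes its levels (1, 15, 22) and block sizes (64K, 128K, 512K, 1M) from the values mentioned in this thread. The `benchmark_matrix` name and the output file naming are hypothetical, and measuring startup time and size for each image is left out.

```shell
#!/bin/sh
# Print one mksquashfs command per (compression level, block size) pair.
# Usage: benchmark_matrix <AppDir>
benchmark_matrix() {
    appdir=$1
    for level in 1 15 22; do
        for block in 64K 128K 512K 1M; do
            # Output file name is illustrative only.
            echo "mksquashfs $appdir zstd-$level-$block.squashfs -comp zstd -Xcompression-level $level -b $block"
        done
    done
}

# Example: list the 12 combinations for the AppDir used in this issue.
benchmark_matrix squashfs-root
```

Printing the commands first keeps the grid easy to review before wiring it into a reproducible setup.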

mgord9518 commented 1 month ago

It might be worth noting that libdeflate is significantly faster (2x) than zlib on zlib-compressed SquashFS files. I currently have it implemented in my squashfuse-zig library, so it might be a good option to backport to the original squashfuse project if zstd turns out to perform badly.

probonopd commented 1 month ago

Interesting! One additional candidate to benchmark.

Samueru-sama commented 1 month ago

I think you also need to consider using dwarfs instead of squashfs. But for that there is an issue opened already.

I will add that now that the block size is the default size with zstd, and the compression level is set very low instead of the default level 15, applications open almost as fast as they would natively: at most about a 0.6 second difference for web browsers compared to the native package. It is likely less on newer hardware; I have a Broadwell CPU, which is old.

probonopd commented 1 month ago

Indeed. dwarfs, too.

Someone really needs to systematically test all this stuff.

Preferably in a reproducible setup using GitHub Actions.

mgord9518 commented 1 month ago

I've messed with DwarFS; it has some serious issues for use in AppImage:

1: It's licensed GPL3, so the AppImage itself would have to be distributed under a compatible license.
2: The binaries are quite large (I think 8MB uncompressed, ~3 when compressed with Zopfli), so they're only beneficial for quite large applications.

Don't get me wrong, it's an incredible format and the compression ratios blow SquashFS out of the water, but it would only be able to benefit some niche scenarios like FreeCAD or SuperTuxKart: GPL3 applications that are massive enough to benefit from the high compression.

probonopd commented 1 month ago

Indeed, GPL3 is not what we are looking for, and the size overhead also seems very high. I am intrigued about using zopfli to reduce the size of the AppImage runtime though...

mgord9518 commented 1 month ago

It's something I'm doing in my shappimage project, but for native executables something like UPX might make more sense.