AppImage / AppImageSpec

This repository holds the specification for the AppImage format.
http://appimage.org/
MIT License
70 stars 22 forks source link

[Feature] Integrity verification #35

Open iTrooz opened 2 years ago

iTrooz commented 2 years ago

AppImages are files which can get really big in size, and download errors can occur with such sizes I propose to add some way to (at least try to) verify it's integrity at launch

Why ? As of now, a partially downloaded AppImage will fail to launch with

Something went wrong trying to read the squashfs image.

Cannot mount AppImage, please check your FUSE setup.
You might still be able to extract the contents of this AppImage 
if you run it with the --appimage-extract option. 
See https://github.com/AppImage/AppImageKit/wiki/FUSE 
for more information
open dir error: No such file or directory

It is maybe harmless in this case, but I think it could potentially cause problems if the partially downloaded AppImage seems like a legit squashfs image. (This is only a theory, I do not have any proof this can happen)

Implementation propositions :

I think this feature should be looked at before https://github.com/AppImage/AppImageSpec/issues/34 (because well, it would modify the spec)

piegamesde commented 2 years ago

There is no good reason to use md5 ever again (except for PoW and other hash collision games lol). I suggest using one of the more modern hashing algorithms that are both fast and decently secure.

probonopd commented 2 years ago

verify the md5sum of the AppImage on launch

Would be way too slow to do this on every launch.

probonopd commented 2 years ago

I suggest using one of the more modern hashing algorithms

Thanks for the suggestion. Please open a separate issue for this.

piegamesde commented 2 years ago

I suggest running openssl speed md5 sha256 (feel free to do this for any other hashing algorithms you are curious about). On my not-so-new laptop, I get:

type             16 bytes     64 bytes    256 bytes   1024 bytes   8192 bytes  16384 bytes
md5             144935.20k   320481.47k   551855.87k   671047.00k   717589.16k   721843.97k
sha3-256         32491.03k   131696.02k   293458.26k   334464.34k   373675.75k   365772.80k
sha256           77191.26k   187495.96k   346574.08k   428391.26k   462484.82k   464044.03k
blake2s256       65362.19k   270149.67k   409836.05k   464706.90k   506539.58k   509957.46k

That's roughly between 300MB and 700MB per second on a single core (the last column is the one that matters for our use case). For reference, my SATA SSD benchmarks at 550MB/s read speed. It totally is possible to build integrity checks without noticeably slowing down the startup time on most devices and for most binary sizes.

probonopd commented 2 years ago

Yes, with a 5 GB AppImage (like a game) it would be multiple seconds, which is too much. And think spinning disks, etc.

TheAssassin commented 2 years ago

We could expose/implement such a function in libappimage, so tools like AppImageLauncher could provide UIs that support this. Please open an issue there.

iTrooz commented 2 years ago

Done !

We could expose/implement such a function in libappimage

Wouldn't it need support from the AppImage spec, to integrate the checksum in the file ?

I would like to point out that most of the comments here are talking about MD5, when a more viable solution would be CRC (isn't it literally made for this use case ?) https://en.wikipedia.org/wiki/Cyclic_redundancy_check https://gist.github.com/zhiyelee/35b12cc436ed1a8334ce

piegamesde commented 2 years ago

Yes, with a 5 GB AppImage (like a game) it would be multiple seconds, which is too much. And think spinning disks, etc.

Any file this size will take seconds to load, because disk drives are slow. As long as the CPU is faster at hashing than the drive can load the bytes, the load time overhead will be negligible.

TheAssassin commented 2 years ago

Wouldn't it need support from the AppImage spec, to integrate the checksum in the file ?

libappimage respectively AppImageUpdate already implicitly define those hashing algorithms, however a formal specification is not available as of yet. This is subject to changes in the future. Nevertheless, it makes sense to implement this in libappimage. We need to reopen this issue therefore.

the load time overhead will be negligible

Agreed. Still, depending on the underlying disk's type (slow 2.5" HDD for instance), it'll introduce a severe performance overhead.

For small AppImages, the "run always" argument may even end up in wasting battery life for no good reason. There are AppImages out there that run quite frequently, or are launched multiple times in parallel.

There is no need to running this every time. Such a check makes sense once. Bit errors after downloading (e.g., due to disk errors) are not a valid argument, as they may affect any file all the time. Implementing a "first launch" style test is a perfect use case for AppImageLauncher and similar tools, which hook into first launches. I could imagine, e.g., providing a "check integrity" button in the first launch dialog.

iTrooz commented 2 years ago

You're right, running this on every launch would be problematic for some hardware

I could imagine, e.g., providing a "check integrity" button in the first launch dialog.

Why not do it automatically (with maybe an option to disable it in the settings) ? I can easily imagine users skipping this check when integrating AppImages

TheAssassin commented 2 years ago

That's a discussion to have in AppImageLauncher's issue tracker.

iTrooz commented 2 years ago

:+1:

So to conclude, what should be done in this repository ? Mention the hashs used and embedded in the AppImages ?